
Where is the Dalle2 for music🎶? (8 reasons why it is not that simple)

I have loved learning about generative AI. I use Dalle2, Stable Diffusion, and Midjourney a lot.

It got me thinking: where is the Dalle2 for music? Why can't I type a prompt into a text box and get music back?

I did a lot of research, including collecting many links from Indie Hackers, and wrote this:

https://mythicalai.substack.com/p/where-is-dalle2-for-creating-music

Short story: music is harder than images, for 8 reasons.

  1. Lack of data
  2. Lawyers
  3. Music is more like video than a still image
  4. Music takes longer to consume
  5. Digital instruments are worse than real world instruments
  6. We are more strict in our evaluation of music
  7. Music is more subjective
  8. Lyrics are an additional difficulty layer

While there are issues, I have no doubt that we will see a text-to-music generation model in the next few years.

There are also fun music generation tools you can play with right now. Full list in the article.

But what do you think? Will we get a music generation AI, or is music too hard for computers?

on November 12, 2022
  1. 2

    If the output is a WAV or MP3 file, i.e., a fully mixed and produced track, that must be terribly difficult and nuanced. But if it's just, say, sheet music, it should be a lot easier for an AI to generate. From there, a synth on the local machine could render it out with MIDI.

    Has anyone tried that or similar?

    1. 2

      Yes. There are a few projects whose output was MIDI. I will post a full list of all the projects I found tomorrow.

  2. 1

    What about food? I love thinking about the cool use cases of AI outside of just image generation, which seems to be the popular trend.

    Last week, I decided to launch my own AI tool in the food space.
    It uses GPT-3 to find a recipe for exactly what you want to cook: sentientplatter.com

    1. 1

      Nice! I tried making a recipe, but it just errored out. :(

      1. 1

        Oh dang... I'm using Vercel to generate pages once a new prompt is typed in. It only allows 10 seconds, which is sometimes not enough time for a GPT-3 response. I'm working on splitting out some of the function calls, but if you retry after the error, you may see the result!

        Thank you for the feedback and for trying!
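
The idea from the first comment (have the model output notes rather than audio, then let a local synth render them) can be sketched roughly as below. This is only an illustration under stated assumptions: the `melody` list is a hypothetical stand-in for whatever a model might produce, the function names are made up for this example, and the "synth" is a plain sine-wave generator rather than a real MIDI instrument. It uses only the Python standard library.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (assumed CD-quality rate)


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to a frequency (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def render(notes, sample_rate=SAMPLE_RATE, amplitude=0.3):
    """Turn (midi_note, seconds) pairs into a list of 16-bit sample values."""
    samples = []
    for note, seconds in notes:
        freq = midi_to_hz(note)
        n = int(seconds * sample_rate)
        for i in range(n):
            # Short linear fade in and out so note boundaries do not click.
            fade = min(1.0, i / 500, (n - i) / 500)
            value = amplitude * fade * math.sin(2 * math.pi * freq * i / sample_rate)
            samples.append(int(value * 32767))
    return samples


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


# Hypothetical model output: a C major arpeggio as (MIDI note, duration) pairs.
melody = [(60, 0.4), (64, 0.4), (67, 0.4), (72, 0.8)]

if __name__ == "__main__":
    write_wav("melody.wav", render(melody))
```

The point of the split is that the generated part stays tiny (a handful of note numbers and durations), while everything about how it sounds is left to the renderer, which could be swapped for a proper synth or sample library.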
