
Authors here: Fun to wake up to this surprise! We are rushing to add GPUs so you can all experience the app in real-time. Will update asap



Awesome, there is another project out there that does this on CPU: https://github.com/marcoppasini/musika. Maybe mix the two, i.e. take the initial output of Musika, convert it to a spectrogram, and feed it to Riffusion to get more variation...
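A minimal sketch of the suggested first step, rendering an audio clip (e.g. Musika output) to a spectrogram image that an img2img diffusion pass could start from. This assumes librosa and Pillow; the file names, mel parameters, and 512x512 target size are illustrative guesses, not Riffusion's actual settings.

  import numpy as np
  import librosa
  from PIL import Image

  def audio_to_spectrogram_image(wav_path: str, out_path: str) -> None:
      # Load audio and compute a mel-scaled power spectrogram.
      y, sr = librosa.load(wav_path, sr=44100, mono=True)
      mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=4096,
                                           hop_length=512, n_mels=256)
      # Convert to decibels and normalize to the 0-255 range of an 8-bit image.
      db = librosa.power_to_db(mel, ref=np.max)
      norm = (db - db.min()) / (db.max() - db.min() + 1e-9)
      # Flip so low frequencies sit at the bottom of the image.
      img = Image.fromarray(np.flipud((norm * 255).astype(np.uint8)))
      # Resize to the square canvas a diffusion model expects (assumed 512x512).
      img = img.resize((512, 512), Image.BILINEAR).convert("RGB")
      img.save(out_path)

  # Hypothetical file names for illustration only.
  audio_to_spectrogram_image("musika_clip.wav", "seed_spectrogram.png")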


"fine-tuned on images of spectrograms paired with text"

How many paired training images/texts did you use, and what was the source of your training data? Just curious how much fine-tuning was needed to get these results, and how broad the original sources had to be to achieve sufficient musical diversity.
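For reference, one common layout for such image/text pairs when fine-tuning Stable Diffusion (e.g. with the Hugging Face diffusers training scripts) looks roughly like the sketch below. This is not the authors' dataset; the file names and captions are made up.

  import json

  # Each record pairs one spectrogram image with a text caption.
  examples = [
      {"file_name": "spectrogram_0001.png", "text": "funk bassline with a jazzy saxophone solo"},
      {"file_name": "spectrogram_0002.png", "text": "acoustic folk fingerpicking"},
  ]

  # metadata.jsonl sits next to the image files; one JSON object per line.
  with open("metadata.jsonl", "w") as f:
      for ex in examples:
          f.write(json.dumps(ex) + "\n")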


Fascinating stuff.

One of the samples had vocals. Could the approach be used to generate vocals alone?

Could it be used for speech? If so, could the speech be directed or would it be random?



