
For anyone interested, I've transcribed this song [1] using the Replicate link the author provided (Colab throws errors for me), with the mode set to music-piano-v2. It spits out MP3s there instead of MIDIs, so you can hear how it did [2].

[1] https://www.youtube.com/watch?v=h-eEZGun2PM [2] https://replicate.com/p/qr4lfzsqafc3rbprwmvg2cw5ve
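For reference, this kind of run can also be kicked off programmatically. Here's a minimal sketch with the official `replicate` Python client; the model slug/version and the input field names ("audio", "mode") are assumptions, so substitute the ones from the model's Replicate page:

    import replicate

    # Hypothetical model identifier -- use the slug and version hash from the model page.
    MODEL = "some-owner/music-transcription:0123456789abcdef"

    # Field names are assumptions; check the model's input schema.
    output = replicate.run(
        MODEL,
        input={
            "audio": open("song.mp3", "rb"),   # local copy of the source audio
            "mode": "music-piano-v2",          # mode used in the test above
        },
    )

    print(output)  # URL(s) of the rendered result (MP3 on this deployment)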




Awesome, thanks for running the test!

I can't help but feel it's heavily impacted by the ambience of the recording as well. The MIDI is a very rigid, literal interpretation of what the model hears as pitches over time, but it lacks the subtlety of realizing that a pitch is sustaining because of an ambient effect, or that the attack actually lands a little before the perceived start of the pitch, etc.

If it could be enhanced to consider such things, I bet you would get much cleaner, more machine-like MIDI files, which are generally preferable.
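As a rough illustration of that kind of cleanup, here's a post-processing sketch using the pretty_midi library (not part of the model; the grid size and tail cap are arbitrary assumptions) that snaps note onsets to a grid and trims reverb-inflated tails:

    import pretty_midi

    GRID = 0.125      # assumed quantization grid in seconds (a 16th note at 120 BPM)
    MAX_TAIL = 2.0    # assumed cap on note length, to trim reverb-sustained tails

    midi = pretty_midi.PrettyMIDI("transcription.mid")

    for instrument in midi.instruments:
        for note in instrument.notes:
            note_length = note.end - note.start
            # Snap the onset to the nearest grid point ("machine-like" timing).
            note.start = round(note.start / GRID) * GRID
            # Cap the duration so ambience-sustained pitches don't smear together.
            note.end = note.start + min(note_length, MAX_TAIL)

    midi.write("transcription_quantized.mid")

This only treats the symptom after the fact, of course; getting it right would mean the transcription model itself separating the dry note from the room decay.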


Listening to the input and output, the reproduction is like comparing a cat to a picture that resembles a cat, but isn't a cat.


Music's a lot more than a collection of notes ... and the timbre of one piano is about as far away from a mixture of reeds and dulcet electronics as you can get. (The Fulero is very nice.)


How do you get a MIDI file?


You'd have to run it yourself. There's a Docker image available, but it's a pretty big download (11.7 GB).
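If you do pull it: Replicate's images are built with Cog, so one way to drive it locally is to start the container and POST to its prediction endpoint. A rough sketch, with the port mapping and input field names as assumptions (check the model's README for the real ones):

    import base64
    import requests

    # Assumes the container was started with something like
    #   docker run -p 5000:5000 <the-model-image>
    # and exposes the standard Cog prediction endpoint.
    COG_URL = "http://localhost:5000/predictions"

    # Cog file inputs over HTTP are sent as data URIs.
    with open("song.mp3", "rb") as f:
        audio_uri = "data:audio/mpeg;base64," + base64.b64encode(f.read()).decode()

    # Field names ("audio", "mode") are assumptions; check the model's schema.
    resp = requests.post(
        COG_URL,
        json={"input": {"audio": audio_uri, "mode": "music-piano-v2"}},
    )
    resp.raise_for_status()

    print(resp.json()["output"])  # per the comment above, the local run is where you'd get the MIDI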



