Transcribing Piano Rolls, the Pythonic Way

eliteraspberrie · on April 12, 2014

The faster way of doing this:

    from numpy import cos, pi, sin

    def fourier_transform(signal, period, tt):
        """ See http://en.wikipedia.org/wiki/Fourier_transform
        How come Numpy and Scipy don't implement this ??? """
        f = lambda func : (signal*func(2*pi*tt/period)).sum()
        return f(cos)+ 1j*f(sin)

is using the FFT.

What you want is the power spectral density in the discrete case, called the power spectrum. It can be calculated by multiplying the discrete Fourier transform (FFT) with its conjugate, and shifting. NumPy can do it. Here is an example: http://stackoverflow.com/questions/15382076/plotting-power-s...
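
For what it's worth, here is a minimal sketch of that recipe in NumPy. The function name `power_spectrum`, the division by the signal length, and the synthetic 8-sample-period test signal are my own assumptions for illustration; they do not come from the blog post or the linked answer.

```python
import numpy as np

def power_spectrum(signal, sample_rate=1.0):
    """Return (freqs, power) for a real 1-D signal, with zero frequency centred."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.fft(signal)
    # Multiply the DFT by its complex conjugate to get the power in each bin.
    # Dividing by the length is one common normalisation (an assumption here).
    power = (spectrum * np.conj(spectrum)).real / len(signal)
    freqs = np.fft.fftfreq(len(signal), d=1.0 / sample_rate)
    # fftshift reorders both arrays so frequencies run from negative to positive.
    return np.fft.fftshift(freqs), np.fft.fftshift(power)

# Synthetic test signal: a cosine with a period of exactly 8 samples.
t = np.arange(256)
freqs, power = power_spectrum(np.cos(2 * np.pi * t / 8))
peak_freq = abs(freqs[np.argmax(power)])  # in cycles per sample
```

The period is then `1 / peak_freq`. Because the period here divides the signal length evenly, the peak falls exactly on a bin; a period like 7.5 would not.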

zulko · on April 12, 2014

I knew I was going to get this remark :) Now correct me if I am wrong, but I think the FFT (which computes the discrete Fourier transform) cannot replace the continuous Fourier transform in my case, because the optimal periods I find are non-integer values. In the first case, the holes are separated by 7.5 pixels. The FFT could only have told me that they are separated by 7 or 8 pixels, which is not precise enough. Same thing for the tempo: a beat corresponds to 7.1 frames of the video, and an FFT would have told me 7.

If someone knows a way to use the FFT to get non-integer periods (apart from oversampling the signal) I'll gladly change the code.

peterwoo · on April 12, 2014

The maximum frequency you can detect is limited by your sampling rate, but there's not a limit on the precision with which you can break those frequencies up.

It's controlled by a parameter NFFT -- the PSD will compute (NFFT/2+1) values evenly spaced between 0 and the Nyquist frequency.

So say the frame rate is 15Hz and you compute with NFFT=2048, then PSD[970] contains the power at about 7.10Hz (970 × 15 / 2048).
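
A rough sketch of that idea with NumPy's `rfft`, which zero-pads when `n` is larger than the signal. The 300-sample cosine with a period of 7.1 samples and the choice of `n=8192` are assumptions made up for the demonstration. Note that zero-padding only samples the same spectrum on a finer grid; it does not add information.

```python
import numpy as np

frame_rate = 15.0   # Hz, the figure used in the comment above
nfft = 2048

# rfftfreq lists the NFFT/2 + 1 bin frequencies from 0 up to the Nyquist frequency.
bin_freqs = np.fft.rfftfreq(nfft, d=1.0 / frame_rate)
freq_of_bin_970 = bin_freqs[970]   # 970 * 15 / 2048 Hz

# Synthetic signal (an assumption for illustration): period of 7.1 samples.
n_samples = 300
true_period = 7.1
x = np.cos(2 * np.pi * np.arange(n_samples) / true_period)

# n=8192 zero-pads the 300 samples, so the spectrum is sampled on a finer grid.
padded_nfft = 8192
power = np.abs(np.fft.rfft(x, n=padded_nfft)) ** 2
freqs = np.fft.rfftfreq(padded_nfft, d=1.0)   # cycles per sample
peak = 1 + np.argmax(power[1:])               # skip the zero-frequency bin
estimated_period = 1.0 / freqs[peak]          # non-integer, close to 7.1
```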

This was a really cool project by the way!

evntdrvn · on April 12, 2014

Also, it's not as widely known as the FFT, but if you know roughly the frequency of interest you can use the Goertzel algorithm to calculate a chosen number of bins around that specific freq, and then pick the max of them to find the freq of interest. With the FFT you would instead have to calculate a bunch of bins using a large nFFT in order to get enough freq resolution, and then discard 99% of the results. Going further, compared to the original Goertzel, the Generalized Goertzel algorithm does the same thing but allows you to query non-integer multiples of the fundamental frequency: http://asp.eurasipjournals.com/content/2012/1/56
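
Here is a minimal sketch of a Goertzel-style recurrence evaluated at an arbitrary frequency, in the spirit of the generalized version. It is not a transcription of the linked paper: the function name `goertzel`, the final phase correction, and the synthetic test signal are my own assumptions.

```python
import numpy as np

def goertzel(x, cycles_per_sample):
    """Evaluate sum_n x[n] * exp(-2j*pi*f*n) at one frequency f.

    f is in cycles per sample and does not have to fall on a DFT bin.
    """
    omega = 2.0 * np.pi * cycles_per_sample
    coeff = 2.0 * np.cos(omega)
    s_prev, s_prev2 = 0.0, 0.0
    # Second-order recurrence: s[n] = x[n] + 2*cos(omega)*s[n-1] - s[n-2]
    for sample in x:
        s_prev, s_prev2 = sample + coeff * s_prev - s_prev2, s_prev
    y = s_prev - np.exp(-1j * omega) * s_prev2
    # Phase correction so the result matches the direct sum above.
    return y * np.exp(-1j * omega * (len(x) - 1))

# Synthetic signal (an assumption): a cosine with a period of 7.1 samples.
n = np.arange(300)
x = np.cos(2 * np.pi * n / 7.1)

# Query only a handful of candidate periods around the expected value.
candidate_periods = np.arange(6.5, 8.0, 0.01)
magnitudes = [abs(goertzel(x, 1.0 / p)) for p in candidate_periods]
best_period = candidate_periods[int(np.argmax(magnitudes))]
```

If only the peak location matters, the phase correction can be dropped, since it does not change the magnitude.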

zulko · on April 12, 2014

Thanks, I learned something. I will try it and amend the blog when I have time.

evntdrvn · on April 12, 2014

Forgot to say, great post! :)

GFK_of_xmaspast · on April 13, 2014

There are a lot of parametric (as opposed to the nonparametric FFT) methods for tracking frequency. I'm not totally convinced they're applicable to this case, but I think they might be fun to try out. Maybe start here: http://en.wikipedia.org/wiki/Multiple_signal_classification

rfleck · on April 12, 2014

See a master at work making original rolls at QRS. http://www.youtube.com/watch?v=i3FTaGwfXPM

It was a fun place to see in the 70's after watching my father rebuild our player piano.

TazeTSchnitzel · on April 12, 2014

Interesting that they used computers to make them. It seems obvious in hindsight; player piano music is digital!

userbinator · on April 12, 2014

Also interesting that we had digital data storage, in the form of punched cards and tape, decades before digital computers.

vajrabum · on April 12, 2014

Longer than that. The Jacquard loom was invented in 1801 and the player piano was first demonstrated in 1876.

chillingeffect · on April 12, 2014

excellent to see the Apple //e [1] http://youtu.be/i3FTaGwfXPM?t=4m28s

msvan · on April 12, 2014

What a fascinating convergence of math, music and Python. Many people I meet who don't specialize in math but have taken university-level courses in it seem to remember the Fourier transform as a highlight, probably because of its many applications.

kbd · on April 11, 2014

I love the abundance of Python. For those unaware, even the youtube-dl command line utility he used to download the video is written in Python.

w1ntermute · on April 12, 2014

And in contrast to what its name suggests, youtube-dl supports 150+ different services: http://rg3.github.io/youtube-dl/supportedsites.html

misiti3780 · on April 12, 2014

I was thinking the same thing - I had never heard of that tool but I will def. use it in the future

stevetjoa · on April 12, 2014

Very cool!

Relevant: Zenph makes "re-performances" of old piano recordings. They take a recording, do music transcription magic to get the exact timings and velocities of each note event, and then feed that into a player piano. So it's as if you are listening to the ghost of Rachmaninov sitting at the piano, as shown here: https://www.youtube.com/watch?v=eevzbV6Hkkk&t=28 (music starts at 0:28)

(I just visited http://zenph.com for the first time in about a year, and it appears that they've pivoted into a music education company.)

nanidin · on April 11, 2014

Interesting question - is the author's transcription a derivative work of the video? And if so, is he actually allowed to release his transcription into the public domain (without the permission of the author of the video)?

shakethemonkey · on April 11, 2014

No, it's only derivative in the sense of process. The video lacks originality; for the musical notes it is merely a mechanical reproduction of the punched holes. Similarly, a photograph of a public domain painting is also in the public domain. See: Bridgeman Art Library v. Corel Corp., 36 F. Supp. 2d 191 (S.D.N.Y. 1999). At least this is the law in the United States, which is sensible; absurdity of other jurisdictions may vary.

nanidin · on April 12, 2014

It's nice to know our system accounts for cases like this. Thanks for the detailed info!

ntoshev · on April 12, 2014

What if you tried to transcribe the music solely from Fourier transform of the audio source? I expect the piano has an abundance of harmonics, but there should be some way to distinguish them from the keys. Hasn't someone done it already?

gtani · on April 12, 2014

i've seen NNLS/chroma referenced in a few places, like the chordify papers:

http://isophonics.net/nnls-chroma

Here's chordify: http://ismir2012.ismir.net/event/papers/295_ISMIR_2012.pdf

That conference has great references but unfortunately hasn't been repeated since 2012 http://www.ismir.net/proceedings/index.php

raverbashing · on April 12, 2014

It's certainly hard.

However, this case would be one of the best cases for it: it's a single instrument, and you could make a careful recording of it.

selmnoo · on April 12, 2014

That was a lovely read, thank you so much for writing and sharing it.

elwell · on April 11, 2014

Really fantastic hack. Now try transcribing with just the audio track.

anigbrowl · on April 11, 2014

That's a hard problem. If you have some material like that with a clear recording, the only good commercial solution that I know of is Melodyne, and he's not saying how he does it. In theory you just look for multiple peaks in the FFT, but this is much easier said than done.
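
To illustrate the "in theory" part only, here is a toy sketch that picks peaks from the FFT of two pure sine tones. Everything in it is an assumption for illustration (the sample rate, the two note frequencies, the Hann window, the 10% threshold). Real piano notes carry harmonics and decay over time, which is exactly where this naive approach starts to break down.

```python
import numpy as np

sample_rate = 8000                           # Hz (assumed)
t = np.arange(sample_rate) / sample_rate     # one second of audio
notes_hz = [440.0, 659.26]                   # roughly A4 and E5, pure sines only
audio = sum(np.sin(2 * np.pi * f * t) for f in notes_hz)

# A Hann window reduces leakage so each tone shows up as one clean lobe.
windowed = audio * np.hanning(len(audio))
magnitude = np.abs(np.fft.rfft(windowed))
freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)

# A bin is a peak if it beats its left neighbour, is at least as large as its
# right neighbour, and exceeds 10% of the largest magnitude (arbitrary threshold).
inner = magnitude[1:-1]
is_peak = (inner > magnitude[:-2]) & (inner >= magnitude[2:]) & (inner > 0.1 * magnitude.max())
detected_hz = freqs[1:-1][is_peak]
```

With one second of audio the bins are 1 Hz apart, so each detected frequency is only accurate to about the nearest hertz.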

d_loemax · on April 12, 2014

i built a plogue bidule patch before melodyne rolled out "dna" and it is extremely difficult to get the optimal fft parameters to get an accurate conversion. i can't imagine an algorithm that would get it right from analyzing the sample would be any less difficult. ableton's and cubase's options are pretty rough too. i am a drummer though, i am just trying to make up for my ears.

bede · on April 12, 2014

My favourite blog post of 2014. Thank you for sharing.

analog31 · on April 12, 2014

I think this is a nice solution because it takes care of the hardware side of things by making use of a garden variety video camera.

stavros · on April 12, 2014

This is beautiful, it's one good idea after another, good job!

peapicker · on April 11, 2014

This is really nice, thanks for sharing it with us.

cdelsolar · on April 13, 2014

So, so cool. I love posts like this.

evidencepi · on April 12, 2014

Nice post, thanks for sharing!

smortaz · on April 11, 2014

fantastic. with your permission, i'd love to use this to demo python!

zulko · on April 13, 2014

Yeah, sure.