from numpy import pi, cos, sin

def fourier_transform(signal, period, tt):
    """ See http://en.wikipedia.org/wiki/Fourier_transform
    How come Numpy and Scipy don't implement this ??? """
    f = lambda func: (signal * func(2 * pi * tt / period)).sum()
    return f(cos) + 1j * f(sin)
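For what it's worth, here is a self-contained sketch of how a function like this can be used to find a non-integer period: evaluate the transform over a grid of candidate periods and keep the one with the largest magnitude. The toy signal and the period grid are illustrative choices, not from the article.

```python
import numpy as np

# Self-contained restatement of the transform at a single period.
def fourier_transform(signal, period, tt):
    f = lambda func: (signal * func(2 * np.pi * tt / period)).sum()
    return f(np.cos) + 1j * f(np.sin)

tt = np.arange(300)
signal = np.cos(2 * np.pi * tt / 7.5)        # toy signal, true period 7.5 samples

# Scan a grid of candidate (non-integer) periods and pick the strongest response.
periods = np.linspace(5, 10, 1001)
responses = [abs(fourier_transform(signal, p, tt)) for p in periods]
best_period = periods[np.argmax(responses)]  # ~ 7.5
```

The grid spacing (here 0.005) sets the precision of the recovered period, independently of the pixel/frame resolution of the signal itself.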
A faster way to compute this is using the FFT.
What you want is the power spectral density in the discrete case, called the power spectrum. It can be calculated by multiplying the discrete Fourier transform (FFT) by its conjugate and shifting. NumPy can do it. Here is an example: http://stackoverflow.com/questions/15382076/plotting-power-s...
I knew I was going to get this remark :)
Now correct me if I'm wrong, but I think the FFT (which computes the discrete Fourier transform) cannot replace the continuous Fourier transform in my case, because the optimal periods I find are non-integer values. In the first case, the holes are separated by 7.5 pixels; the FFT could only have told me that they are separated by 7 or 8 pixels, which is not precise enough. Same thing for the tempo: a beat corresponds to 7.1 frames of the video, and an FFT would have told me 7.
If someone knows a way to use the FFT to get non-integer periods (apart from oversampling the signal) I'll gladly change the code.
The maximum frequency you can detect is limited by your sampling rate, but there's no limit on the precision with which you can break those frequencies up.
It's controlled by a parameter NFFT -- the PSD will compute (NFFT/2+1) values evenly spaced between 0 and the Nyquist frequency.
So say the frame rate is 15 Hz and you compute with NFFT=2048: the bins are then 15/2048 ≈ 0.0073 Hz apart, and PSD[970] contains the amplitude at about 7.10 Hz.
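A minimal sketch of that idea with NumPy's FFT, on a toy signal whose true period is 7.5 samples (the zero-padded length `nfft` plays the role of NFFT, giving bins far finer than the naive 1/N spacing):

```python
import numpy as np

# Toy signal: a sinusoid with a non-integer period of 7.5 samples.
n = 300
true_freq = 1.0 / 7.5                        # cycles per sample
signal = np.cos(2 * np.pi * true_freq * np.arange(n))

# Zero-pad the FFT to nfft points: bins are then 1/nfft apart,
# much finer than the 1/n spacing of the un-padded transform.
nfft = 2 ** 15
spectrum = np.abs(np.fft.rfft(signal, n=nfft))
freqs = np.fft.rfftfreq(nfft, d=1.0)

peak_freq = freqs[np.argmax(spectrum)]
peak_period = 1.0 / peak_freq                # ~ 7.5 samples
```

Zero padding interpolates the spectrum rather than adding new information, but for locating a single strong peak that interpolation is exactly what's needed.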
Also, it's not as widely known as the FFT, but if you know roughly the frequency of interest you can use the Goertzel algorithm: it computes a chosen number of bins around that specific frequency, and you pick the maximum of them, instead of running an FFT with a large NFFT just to get enough frequency resolution and then discarding 99% of the results. Going further, the Generalized Goertzel algorithm does the same thing as the original but lets you query non-integer multiples of the fundamental frequency: http://asp.eurasipjournals.com/content/2012/1/56
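For illustration, here is a sketch of the Generalized Goertzel recurrence from that paper, evaluated at a non-integer bin. The 64-sample frame and the fractional bin 7.3 are my own toy choices:

```python
import numpy as np

def generalized_goertzel(x, k):
    """DFT of x at a possibly non-integer frequency index k, i.e.
    X(k) = sum_n x[n] * exp(-2j*pi*k*n/N), per Sysel & Rajmic (2012)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    A = 2 * np.pi * k / N
    B = 2 * np.cos(A)
    s1 = s2 = 0.0
    for n in range(N - 1):
        s0 = x[n] + B * s1 - s2
        s2, s1 = s1, s0
    s0 = x[N - 1] + B * s1 - s2
    # Final step; the extra phase factor matters when k is non-integer.
    y = s0 - s1 * np.exp(-1j * A)
    return y * np.exp(-1j * A * (N - 1))

# Toy check: a tone sitting at fractional bin 7.3 of a 64-sample frame.
n = np.arange(64)
x = np.cos(2 * np.pi * 7.3 * n / 64)
X = generalized_goertzel(x, 7.3)
```

Unlike the FFT, this costs O(N) per queried bin, so it only wins when you need a handful of bins.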
There are a lot of parametric (as opposed to the nonparametric FFT) methods for tracking frequency, I'm not totally convinced they're applicable to this case, but I think they might be fun to try out. Maybe start here: http://en.wikipedia.org/wiki/Multiple_signal_classification
What a fascinating convergence of math, music and Python. Many people I meet who don't specialize in math but have taken university-level courses in it seem to remember the Fourier transform as a highlight, probably because of its many applications.
Relevant: Zenph makes "re-performances" of old piano recordings. They take a recording, do music transcription magic to get the exact timings and velocities of each note event, and then feed that into a player piano. So it's as if you are listening to the ghost of Rachmaninov sitting at the piano, as shown here: https://www.youtube.com/watch?v=eevzbV6Hkkk&t=28 (music starts at 0:28)
(I just visited http://zenph.com for the first time in about a year, and it appears that they've pivoted into a music education company.)
Interesting question - is the author's transcription a derivative work of the video? And if so, is he actually allowed to release his transcription into the public domain (without the permission of the author of the video)?
No, it's only derivative in the sense of process. The video lacks originality; for the musical notes it is merely a mechanical reproduction of the punched holes. Similarly, a photograph of a public domain painting is also in the public domain. See: Bridgeman Art Library v. Corel Corp., 36 F. Supp. 2d 191 (S.D.N.Y. 1999). At least this is the law in the United States, which is sensible; absurdity of other jurisdictions may vary.
What if you tried to transcribe the music solely from a Fourier transform of the audio source? I expect the piano has an abundance of harmonics, but there should be some way to distinguish them from the fundamentals of the keys actually played. Hasn't someone done it already?
That's a hard problem. If you have material like that with a clear recording, the only good commercial solution I know of is Melodyne, and its creator isn't saying how it works. In theory you just look for multiple peaks in the FFT, but that's much easier said than done.
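As a toy illustration of the peak-picking idea (synthetic tones standing in for a real recording; the sample rate, note choices, and threshold are all arbitrary assumptions):

```python
import numpy as np

fs = 8000                                   # assumed sample rate (Hz)
t = np.arange(0, 1.0, 1.0 / fs)

# Hypothetical two-note chord: A4 (440 Hz) and C#5 (554.37 Hz), each
# with a weaker second harmonic, as a very crude piano stand-in.
audio = (np.sin(2 * np.pi * 440.0 * t) + 0.3 * np.sin(2 * np.pi * 880.0 * t)
         + np.sin(2 * np.pi * 554.37 * t) + 0.3 * np.sin(2 * np.pi * 1108.74 * t))

mag = np.abs(np.fft.rfft(audio))
freqs = np.fft.rfftfreq(len(audio), d=1.0 / fs)

# Keep local maxima that stand out above a fraction of the global peak.
thresh = 0.2 * mag.max()
peaks = [i for i in range(1, len(mag) - 1)
         if mag[i] > mag[i - 1] and mag[i] > mag[i + 1] and mag[i] > thresh]
peak_freqs = freqs[peaks]
```

On clean synthetic data this recovers the fundamentals and harmonics; the hard part on real piano audio is deciding which peaks are harmonics of an already-detected note and which are new notes, especially when notes share partials.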
I built a Plogue Bidule patch before Melodyne rolled out "DNA", and it is extremely difficult to get the optimal FFT parameters for an accurate conversion. I can't imagine an algorithm that would get it right from analyzing the sample would be any less difficult. Ableton's and Cubase's options are pretty rough too. I am a drummer though; I am just trying to make up for my ears.