Extracting Audio from Pictures of Gramophone Recordings

guylhem · on April 5, 2013

Well, that's a good hack! That's why I read HN, not for some Arrington story.

I wouldn't have guessed there was enough data left in the picture of the gramophone track to recreate audio from it.

Which makes me think about Shannon's theorem: my guesses would have been wrong, so I wonder just how wrong.

What would the minimal resolution (dpi) be to store, say, 1 minute of audio at 8 kHz? Would there be an easy way to calculate that (just a back-of-the-envelope calculation, to get a rough idea)?

What was the resolution of the printing press ca. 1880, to know how much audio could have been "stored" in a single page for posterity? In the picture there is a lot of text, so obviously a lot of wasted space.

And I wonder how we could make 2D barcodes to store uncompressed wav audio, which might be more resistant to noise, but only because it would be like a "lower resolution", i.e. less easily disrupted. Which leads me back to Shannon.

I'm not into audio processing, but that raises a lot of questions I'll be pondering over some coffee tonight (obligatory xkcd reference: http://xkcd.com/356/)

That's the kind of article I love :-)

EDIT: downvotes, whatever. Haters gonna hate. I'll have some hours of fun thanks to this article anyway!

klodolph · on April 5, 2013

Twibright Optar stores 200 kB per A4 sheet, or 25 seconds at 64 kbit/s. 64 kbit/s is high quality stereo audio if compressed using Opus. Most music sounds great at this bit rate. 64 kbit/s is also 8 kHz µ-law PCM, which is the standard for telephone lines in North America, so terrible but intelligible. Opus can be turned down to 6 kbit/s, which gives you about 4.4 minutes of muddy but intelligible speech per page.
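The arithmetic behind those figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes 1 kB means 1,000 bytes, and `seconds_of_audio` is a made-up helper name, not part of Optar or Opus.

```python
def seconds_of_audio(capacity_bytes: int, bitrate_bits_per_s: int) -> float:
    """Playing time that fits in a given capacity at a constant bit rate."""
    return capacity_bytes * 8 / bitrate_bits_per_s


# Capacity quoted in the comment, assuming 1 kB = 1000 bytes.
OPTAR_BYTES_PER_A4 = 200_000

music_seconds = seconds_of_audio(OPTAR_BYTES_PER_A4, 64_000)
speech_minutes = seconds_of_audio(OPTAR_BYTES_PER_A4, 6_000) / 60

print(music_seconds)             # 25.0
print(round(speech_minutes, 1))  # 4.4
```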

By extrapolation, we can calculate density for microfiche. Assuming a resolution of 120 lp/mm, microfiche has about 5 times the resolution of the paper used for Optar, and the dimensions are exactly half. This means that microfiche has 6.3 times the density of A4 paper, measured in bytes per sheet. This gives us one high quality, durable recording of "Ring of Fire" on a single sheet of microfiche.
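The 6.3 figure follows from two scaling rules: data per unit area grows with the square of the linear resolution, and sheet area grows with the square of the linear dimensions. A small sketch, using only the ratios stated in the comment (the function name `relative_capacity` is illustrative):

```python
def relative_capacity(resolution_ratio: float, linear_size_ratio: float) -> float:
    """Capacity of one sheet relative to another.

    Areal density scales with resolution squared; sheet area scales
    with linear size squared.
    """
    return (resolution_ratio ** 2) * (linear_size_ratio ** 2)


# 5x the resolution, half the width and height (ratios from the comment).
ratio = relative_capacity(5.0, 0.5)

# 25 seconds per A4 sheet at 64 kbit/s is the Optar figure quoted earlier.
microfiche_seconds = ratio * 25.0

print(ratio)               # 6.25
print(microfiche_seconds)  # 156.25
```

That is a bit over two and a half minutes at 64 kbit/s, which is where the single-song estimate comes from.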

I wouldn't bother with uncompressed PCM. (You use the term "wav audio", but WAV is a file format.)

InclinedPlane · on April 6, 2013

Consider that the outer rim of a 45 rpm vinyl record moves at only about 16.5 inches per second. This means that the effective sampling frequency is equal to 16.5 * dpi, so even at a relatively modest (for film anyway) 300 dpi you'll get 5,000 Hz sampling, enough to reproduce many sounds moderately well.
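To make the estimate reproducible, here is a rough sketch of the same calculation. It assumes a 7-inch disc (the comment does not state the diameter) and treats one scanned dot along the groove as one sample; the function names are invented for this example.

```python
import math


def rim_speed_in_per_s(diameter_in: float, rpm: float) -> float:
    """Linear speed of the groove at a given diameter, in inches per second."""
    return math.pi * diameter_in * rpm / 60.0


def samples_per_second(speed_in_per_s: float, dpi: float) -> float:
    """Effective sampling rate if each dot along the groove is one sample."""
    return speed_in_per_s * dpi


def required_dpi(sample_rate_hz: float, speed_in_per_s: float) -> float:
    """Scan resolution needed to reach a target sampling rate."""
    return sample_rate_hz / speed_in_per_s


speed = rim_speed_in_per_s(7.0, 45.0)   # assumed 7-inch disc: roughly 16.5 in/s
rate = samples_per_second(speed, 300)   # roughly 4,950 samples per second
dpi_for_8k = required_dpi(8000.0, speed)  # roughly 485 dpi

print(round(speed, 2), round(rate), round(dpi_for_8k))
```

Two caveats apply to any such estimate: by the sampling theorem only frequencies up to about half the sampling rate can be reproduced, and grooves closer to the centre move more slowly, so they need proportionally more dpi for the same rate.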

kyrias · on April 5, 2013

  > "I wouldn't have guessed there was enough data left in the picture of the
  > gramophone track to recreate audio from it."

Ah, well, the thing is that the needle isn't straight; it is angled, i.e. as the tracks get deeper they also get wider, so you only need to be able to see how wide the track is. http://www.vinylrecorder.com/stereo.html

w1ntermute · on April 6, 2013

> not for some Arrington story

For those missing the reference: https://news.ycombinator.com/item?id=5501832

caseysoftware · on April 6, 2013

"I wouldn't have guessed there was enough data left in the picture of the gramophone track to recreate audio from it."

I'd bet it's almost the other way around: the information density is so low (because the information itself is relatively low resolution*) that it makes the analysis that much easier.

* "Resolution" in the detail sense of the word, not the visual.

auctiontheory · on April 5, 2013

Mind-blowing.

It does give me pause to wonder how much of the movie of our day-to-day lives future generations will be able to replay, via goodness knows what mechanism. (DNA testing of blood and hair is another example of this theme.)

tokenadult · on April 6, 2013

Following along while listening by looking at the linked German text of "Der Handschuh"

http://www.has.vcu.edu/for/schiller/handschuh_dual.html

helps immensely in sorting out the speech from the noise (at least for me, with my beginner's knowledge of German). It's amazing to hear the inventor's poetry reading diction from so long ago.

lcrs · on April 6, 2013

Related: render an audio file as a photoreal image of a vinyl record, then recover the audio from that image: http://www.renderman.org/RMR/Examples/srt2011/audiblyPlausib...

dmacedo · on April 6, 2013

This looks interesting

wazoox · on April 6, 2013

Someone at the French National Audiovisual Archive ( http://www.ina.fr/ ) showed me that they have been doing about the same thing for years to get the sound back from heaps of broken records of various origins, mostly radio recordings from the 30s and 40s -- nothing nearly as old as this. IIRC they even almost completely automated the process, including matching the groove across missing bits in the high resolution picture from the scanner.

kybernetikos · on April 5, 2013

Reminds me of 'Digital Needle' http://www.phys.huji.ac.il/~springer/DigitalNeedle/

_pferreir_ · on April 6, 2013

Great work!

It reminds me of an X-Files episode where Mulder and Scully find a first-century ceramic cup that supposedly contains a "phonograph" recording of Jesus Christ's words during the Last Supper.

ck2 · on April 6, 2013

Certainly that circular whumping noise can be digitally removed to allow the voice to be heard more clearly?

It seems to be a precisely repeating pattern, prime for filtering?
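One simple way to attack a noise that repeats once per revolution is to average the signal across revolutions and subtract that average. The sketch below is only an illustration of the idea, not what the article's author did: it assumes a constant rotation speed and a period that is an exact whole number of samples, and `remove_periodic` is a hypothetical helper.

```python
def remove_periodic(signal, period):
    """Subtract the average repeating pattern of length `period` samples.

    The pattern is estimated by averaging all samples that fall at the
    same position (phase) within each period.
    """
    sums = [0.0] * period
    counts = [0] * period
    for i, x in enumerate(signal):
        sums[i % period] += x
        counts[i % period] += 1
    template = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [x - template[i % period] for i, x in enumerate(signal)]


# Toy input: a pure repeating "whump" with nothing else in it.
whump = [1.0, -2.0, 0.5]
noisy = whump * 4
cleaned = remove_periodic(noisy, period=len(whump))
print(cleaned)  # all zeros: the repeating part is gone
```

On a real transfer the speed drifts and the voice also contributes to the average, so this would at best reduce the noise rather than remove it.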

petsounds · on April 5, 2013

Sadly, it seems that all of the audio samples are dead.

SeanDav · on April 5, 2013

Worked for me, it just took a long time to load the first time. Quality is really lousy, though still a minor technological miracle.

teeja · on April 6, 2013

Works for Berliner-style lateral-cut grooves, wouldn't work at all with vertical-cut records like Edison's!

justincormack · on April 5, 2013

Reminds me of Gödel, Escher, Bach.

keithpeter · on April 6, 2013

How? Serious question, I'm not challenging!

This work reminds me of what 'analogue' means. There is no coding of the sound: each stage in the analogue process represents the varying intensities of the various pitches as some variation in a property of the storage medium.

Any form of digital representation (CD, MP3, whatever) requires a convention to interpret the raw stream of numbers. If you don't know the convention then you just have a string of numbers.

klodolph · on April 6, 2013

Analogue signals do have encodings and conventions, and the information is incorrect or possibly mangled unless you decode in the same way.

So you're recording on a record. How fast does the record spin? Do you want a constant angular velocity or constant linear velocity? At what angle does the stylus move? Is there equalization that must be removed afterwards? Is there some kind of modulation?

The convention for LP is 33⅓ RPM constant angular velocity, with two channels each cut at 45° from vertical, the groove moving from the outside in, and the RIAA equalization curve. If you don't know that the audio modulates the groove position, then you just have an oversized coaster.

Digital conventions are just a little more demanding. However, PCM is quite simple and you can dig it out of unknown file formats with relative ease.
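As a rough illustration of how little is needed to dig linear PCM out of an unknown file, the sketch below reinterprets raw bytes as samples and uses a crude "roughness" score to tell a plausible alignment from a wrong one. The sample format (signed 16-bit little-endian mono), the 3-byte header, and the helper names `decode_pcm16le` and `roughness` are all assumptions made for the example.

```python
import struct


def decode_pcm16le(raw: bytes, offset: int = 0) -> list:
    """Interpret bytes from `offset` on as signed 16-bit little-endian samples."""
    usable = (len(raw) - offset) // 2 * 2  # drop a trailing odd byte, if any
    return [s for (s,) in struct.iter_unpack("<h", raw[offset:offset + usable])]


def roughness(samples: list) -> float:
    """Mean absolute jump between neighbouring samples.

    Real audio decoded with the right format and alignment tends to be
    much smoother than the same bytes decoded wrongly.
    """
    if len(samples) < 2:
        return 0.0
    jumps = [abs(b - a) for a, b in zip(samples, samples[1:])]
    return sum(jumps) / len(jumps)


# Toy "unknown file": a made-up 3-byte header followed by five samples.
blob = b"HDR" + struct.pack("<5h", 0, 100, 200, 100, 0)

aligned = decode_pcm16le(blob, offset=3)
misaligned = decode_pcm16le(blob, offset=4)

print(aligned)  # [0, 100, 200, 100, 0]
print(roughness(aligned) < roughness(misaligned))  # True
```

In practice you would sweep over offsets, sample widths, and byte orders, keep the smoothest candidates, and then confirm by listening.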

keithpeter · on April 6, 2013

"So you're recording on a record. How fast does the record spin? Do you want a constant angular velocity or constant linear velocity? At what angle does the stylus move? Is there equalization that must be removed afterwards? Is there some kind of modulation?"

Fair points. A possible elaboration is that analogue encoding is discoverable from the medium. If we know that a recording is of a voice, then we can recover the amplitude signal and speed up/slow down until it sounds right or until the spectrum matches what we know of the source. Digital encoding is not discoverable from the medium in quite the same way...

klodolph · on April 7, 2013

Well, no, it's not that easy. You can't always just speed things up or slow them down.

For example, try to extract audio from a CD-4 / Quadradisc record. Two of the channels are modulated with a 30 kHz carrier using something called FM-PM-SSBFM, and they won't be audible unless you demodulate them properly. You won't get anything resembling music unless you know exactly what you're doing.

By comparison, I've recovered audio from undocumented proprietary PCM-based formats from video games made in the 90s. It was no problem.

So I think the distinction between analogue and digital is not as important as the distinction between simple formats (mono/stereo records, linear PCM) and complex formats (FM-PM-SSBFM, MP3).

justincormack · on April 6, 2013

There is a character who enjoys his vinyl records by hanging them on the wall. He says they have the same information as playing them.

boomlinde · on April 8, 2013

I guess this guy isn't a fan of b-sides.