Did the Manhattan Transfer Use Auto-Tune? (cardinalpeak.com)
49 points by mr-howdy on Sept 2, 2010 | 32 comments



Pitch is complicated…

As you can see, the plot shows that Siegel didn’t hit a perfect “A”—that would have been at 220 Hz. Instead, she’s at 216 Hz, which would be noticeably flat.

I think that song is in F (175 Hz)[1], and the first note is an A, the third of the scale. The A on an equal-tempered piano would indeed be 220 Hz, but a pure (just) major third is 5/4 times the frequency of the F, about 218 Hz. I'd say she nailed a just third.
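
Back-of-the-envelope version of that arithmetic, if anyone wants to check it - my own numbers, assuming A440 tuning (so F3 = 174.61 Hz), nothing measured from the recording:

  import math

  f_tonic  = 174.61                # equal-tempered F3, assuming A440 tuning
  a_equal  = 220.0                 # equal-tempered A3
  a_just   = f_tonic * 5 / 4       # just major third above that F, ~218.3 Hz
  measured = 216.0                 # the peak the article reads off the Audacity plot

  def cents(ref, f):
      """Signed distance from ref to f in cents (100 cents = 1 semitone)."""
      return 1200 * math.log2(f / ref)

  print(round(a_just, 1))                   # 218.3
  print(round(cents(a_equal, measured)))    # -32, i.e. flat of the equal-tempered A
  print(round(cents(a_just, measured)))     # -18, much closer to the just third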

… insanely complicated.

[1] Sorry, on my way out the door, don't own a copy, can't listen to more than the first 8 notes.


What's worse, a 2048-point FFT only gives about 21.5 Hz of frequency resolution at a 44100 Hz sample rate (1024 bins equally spaced across the entire range from 0 Hz to 22050 Hz, each bin about 21.5 Hz wide). The closest bins should be centered at 215.332 Hz and 236.865 Hz, so maybe Audacity is using some kind of interpolation/extrapolation to identify a peak at 216 Hz, but I wouldn't trust it without higher resolution. It's not hard to switch to a 65536-point FFT, but then you're taking more time data into account, and the note in question probably wasn't 1.5 seconds long.
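
Just to make the numbers concrete (same figures as above, nothing measured from the track):

  sample_rate = 44100
  fft_size = 2048
  bin_width = sample_rate / fft_size            # ~21.53 Hz between FFT bins

  k = round(216.0 / bin_width)                  # nearest bin to the reported peak
  print(bin_width)                              # 21.533...
  print(k * bin_width, (k + 1) * bin_width)     # 215.33 Hz and 236.87 Hz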


You can look at how the phase changes over time to determine the dominant frequency within a bin to greater precision than the bin width would otherwise dictate. That's probably what they're doing.
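
Roughly, the trick looks like this - a standard phase-vocoder refinement, sketched here on a synthetic tone; I have no idea whether Audacity does exactly this:

  import numpy as np

  fs, N, hop = 44100, 2048, 256
  true_freq = 216.0
  t = np.arange(N + hop) / fs
  x = np.sin(2 * np.pi * true_freq * t)        # synthetic 216 Hz tone

  win = np.hanning(N)
  X1 = np.fft.rfft(win * x[:N])
  X2 = np.fft.rfft(win * x[hop:hop + N])

  k = int(np.argmax(np.abs(X1)))               # coarse peak bin, k*fs/N ~= 215.3 Hz

  # The phase advance between the two frames, minus what bin k's own frequency
  # would predict, wrapped into (-pi, pi], tells us where the true frequency
  # sits inside the bin.
  dphi = np.angle(X2[k]) - np.angle(X1[k]) - 2 * np.pi * k * hop / N
  dphi = (dphi + np.pi) % (2 * np.pi) - np.pi
  refined = (k + dphi * N / (2 * np.pi * hop)) * fs / N

  print(k * fs / N)    # ~215.3 Hz: bin-resolution estimate
  print(refined)       # ~216.0 Hz: phase-refined estimate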


Wow, that's fascinating. It sounds like you're saying that sometimes even a perfectly tuned piano isn't hitting the 'right' pitch for a song. Is that right? From a music theory perspective I'd love to know more. Can you provide any links for an interested amateur to learn more?


Pianos are tuned to an equal-tempered scale: essentially (and, of course, oversimplifying) you "spread out" the "out-of-tuneness" that results from the fact that the Western 12-tone chromatic scale doesn't fit inside a perfect system of whole-number frequency ratios. See here for lots of math that I don't understand at all:

http://en.wikipedia.org/wiki/Equal_temperament

As a side (historical) note, one of the earliest champions of equal temperament (or well temperament) was Bach, who wrote The Well-Tempered Clavier as a kind of advertisement for this tuning system. Prior to systematized temperament, it was impossible for an instrument to play in all 24 keys without being re-tuned; on a well-tempered instrument, you can play straight through all 24 keys without stopping to retune.

Singers (and brass players, and any other instrument that allows for fine, single-cent-level tuning) often use a different system called Just Intonation:

http://en.wikipedia.org/wiki/Just_intonation

Just Intonation is "more in tune" than equal temperament, but the difference is not noticeable to most people. I teach music for a living and have a degree in music theory, and I have a very difficult time telling the difference.
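
If you want to put a number on "more in tune", the classic figures are easy to compute (this is just the math, not a listening test):

  import math

  def cents(ratio):
      """Convert a frequency ratio to cents (100 cents = one equal-tempered semitone)."""
      return 1200 * math.log2(ratio)

  just_third = 5 / 4                       # pure major third
  just_fifth = 3 / 2                       # pure perfect fifth
  et_third   = 2 ** (4 / 12)               # four equal-tempered semitones
  et_fifth   = 2 ** (7 / 12)               # seven equal-tempered semitones

  print(cents(et_third / just_third))      # ~+13.7: ET major thirds are noticeably sharp
  print(cents(et_fifth / just_fifth))      # ~-2.0: ET fifths are only barely flat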

EDIT: Yes, I did oversimplify by conflating equal temperament and well-temperament, but in the modern debate, they do tend to get lumped together.


As I understand it, "well temperament" is something of a blanket term for many systems that attempted to find a universal tuning. We don't really know how Bach's clavier was tuned for his notion of well temperament.


As noted, pitch is insanely complicated. My knowledge is incomplete, but I'll try and clear up some things.

First off, an A is not always the same as another A. Sometime in the last century (Wikipedia has the exact date; in fact, if you want further reading, just search Wikipedia for some key phrases from this response) it was decided that concert A should be 440 Hz. This, as I understand it, was largely in search of consistency between manufacturers. Before that, A had been set at a number of different points, going down into the 410s.

That's only the beginning of the complications. As it turns out, even if you can get everyone to agree on a base frequency, different methods of calculating the proper frequencies yield different results. Pythagorean tuning uses whole-number ratios to calculate the frequencies, starting with 3:2 (a perfect fifth). [Quick note: if you've never seen Donald Duck in Mathmagic Land, YouTube it now.] But this method has the unfortunate outcome of giving slightly different values for the diminished fifth and the augmented fourth, which on a modern keyboard are the same note.

To ancient musicians, this was like getting a wooden train set and putting the curved pieces together in sequence, only to find that they didn't form a circle. The interval left over where the last piece won't quite fit is known as the wolf interval.

So people started to fudge it. This is called temperament. There are a bunch of different ways (mathematical and mechanical) to do it, but I don't really have the time or understanding to get into all of them, so I'll skip forward to the present day. If you walk into your local music shop and play a chromatic scale on a random keyboard, you'll likely hear what's called equal temperament. In short: divide your octave into equal parts (12, in this case), meaning every semitone is the same frequency ratio of 2^(1/12), and you have your 12 notes of the chromatic scale.
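
If you like seeing the arithmetic, here's the train-set problem and the equal-tempered fudge in a few lines (my own sketch):

  import math
  from fractions import Fraction

  # The "train set" problem: twelve pure 3:2 fifths overshoot seven octaves.
  comma = Fraction(3, 2) ** 12 / Fraction(2, 1) ** 7
  print(1200 * math.log2(comma))         # ~23.5 cents: the Pythagorean comma

  # Equal temperament's fudge: make every semitone the identical ratio 2**(1/12),
  # so the error is smeared evenly instead of dumped onto one wolf interval.
  a4 = 440.0
  chromatic = [a4 * 2 ** (i / 12) for i in range(13)]
  print(chromatic[7])                    # ~659.26 Hz: a hair flat of the pure fifth, 3/2 * 440 = 660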

And it's just that simple. Of course I skipped over tons, but I'm similarly on my way out the door, so deal with it. Or e-mail me if you really care.

If you're not a musician, producer, technician, or the like, [http://en.wikipedia.org/wiki/Music_and_mathematics] should be more than you'll ever need to know. In fact, I wish I had known about that page 10 years ago.

Not enough? This is the textbook I used in school on the topic. It's a little less historical and more sciencey, if that's your bag: http://www.amazon.com/Acoustical-Foundations-Music-John-Back...


Another complication to note is that pitch != frequency. Pitch is how we perceive the thing. The same frequency at different volumes can be perceived as different pitches, so you really can't ever define "perfect" without knowing the exact context.

See wikipedia entry on pitch perception for more examples:

  http://en.wikipedia.org/wiki/Pitch_(music)#Perception_of_pitch


It turns out that there is really no such thing as a "perfectly tuned piano". Something is always a little dissonant. Different tunings can move this dissonance around, but (on a standard 12-notes-per-octave keyboard) they cannot eliminate it.

Here is a nice article from Slate about all this:

http://www.slate.com/id/2250793/

(EDIT: GMTA, it seems.)


I'm a complete amateur, but I have come across different scales, different tuning intervals and such. This gives an almost understandable rendition of what's happening: http://www.midicode.com/tunings/temperament.shtml


For an in depth and also entertaining history of equal temperament (and some other tuning schemes) you should check out Temperament: How Music Became a Battleground for the Great Minds of Western Civilization by Stuart Isacoff.


I found this article quite interesting...

http://www.slate.com/id/2250793/pagenum/all

It covers the history of tuning systems and how many different definitions there are of 'in tune'.


Cool analysis here. I think it's a bit too superficial to serve as the last word, though.

If I were the sound engineer on this type of album, I definitely wouldn't worry about autotuning parts where only one vocalist is singing. With enough takes, professional singers can be pretty dead on.

More important are the sections where you're trying to perfect harmonies, where tweaking a note that's a few cents sharp can make a big difference to the end effect. Unfortunately I don't have the skills to separate the vocal parts during the harmonies to check for this sort of thing.

edit: To my ears, that intro doesn't sound autotuned, although the reverb effect does feel artificial. http://www.amazon.com/Chick-Corea-Songbook-Manhattan-Transfe...


The author of this article makes some naïve assumptions about modern auto-tune systems, including but not limited to Antares Auto-Tune. His primary mistaken assumption is that an auto-tune system will snap a note perfectly into alignment with an ideal pitch. This is not true. A sophisticated automated system (like Antares Auto-Tune, in typical usage, not for the Cher effect) uses heuristics to find a more natural, imperfect pitch to use as an alignment.

And this is just for "fire and forget" auto-tune. If you work an auto-tune by hand – and most skilled engineers will do this, when it's called for – you can use your own judgment as a musician to figure out what to do. Antares Auto-Tune lets you do this by presenting a sort of X-Y plot of pitch over time which you can manipulate. Melodyne does something similar, with a little more sophistication (it actually pioneered this technique.)

Another one of his incorrect assumptions is that auto-tune still behaves like that patent from 1998. While some of the algorithms involved are the same, the systems of today [1] use lookahead on the audio source, so that the pitch correction isn't constantly playing 'catch-up' (it takes at least a couple of cycles of an oscillating wave to determine its pitch with reasonable accuracy). So you would not see something start wrong and then deviate towards its ideal pitch; in fact, you might see it start too accurately.
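
You can see the 'couple of cycles' point with even the crudest pitch detector - a naive autocorrelation sketch, nothing like what commercial auto-tuners actually run, but the limitation is the same:

  import numpy as np

  fs, f0 = 44100, 216.0
  t = np.arange(int(0.05 * fs)) / fs
  x = np.sin(2 * np.pi * f0 * t)               # 50 ms of a pure 216 Hz tone

  def estimate_pitch(frame, fs, fmin=80.0, fmax=1000.0):
      """Crude pitch estimate: pick the autocorrelation peak in the plausible lag range."""
      ac = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
      lo, hi = int(fs / fmax), int(fs / fmin)
      lag = lo + int(np.argmax(ac[lo:hi]))
      return fs / lag

  print(estimate_pitch(x[:int(0.8 * fs / f0)], fs))   # under one cycle: way off
  print(estimate_pitch(x[:int(5 * fs / f0)], fs))     # ~5 cycles: ~216 Hz (within the lag quantization)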

The primary destruction auto-tune wreaks on vocal performances isn't just too-perfect pitch: it destroys the subtlety and intonation that good performers can impart. A real vocalist doesn't hit the ideal pitch at the start of each note, because it's hard to make the human voice do that; they use the imperfection to their advantage. There's a lot of subtle complexity that auto-tune can just wipe out when you use it.

Of course, if the vocalist isn't that good, then there's not any real loss.

As for his article, even assuming the methods he used to analyze the songs were not based on a misconception, the resolution is too low. Perhaps surprisingly, the human ear is capable of distinguishing pitch much more finely than the graphs he plotted.

Edited addendum: here are a couple of screenshots of the actual Antares Auto-Tune software that will illustrate what I'm talking about.

Here is a picture of the Auto-Tune software set in fire-and-forget 'auto' mode. http://www.antarestech.com/images/ATEvo_Auto_mode.jpg

Notice the prominent 'Retune Speed' adjustment knob. This is how you used it in the late 90s and early 00s, or today when you want the robot or Cher effect.

Now here is a picture of Auto-Tune in use in Graphical mode, where you adjust the pitches directly. This is how most engineers today will use Auto-Tune when it's not for a deliberate, noticeable effect. http://www.antarestech.com/images/ATEvo_Graphic_full.jpg

Notice there is no retune speed adjustment knob. Instead, you edit the pitches directly in a graph.

[1] In a non-live-performance setting, anyway. It introduces latency, so if you used this live, it would add 'lag' to the performer's input. Bad.


The Paul McCartney live release last year, Good Evening New York City, was the first use of auto-tune that really made me feel like things were getting bad. I think it got a lot of press because it was Paul McCartney, of course. I assume that for live auto-tune to work like that, there must have been someone playing keyboards or something in unison with his vocal lines. I don't think the auto-tune was added in post-production (though that could have been the case).

It's probably most noticeable when you're that familiar with someone's voice.


Auto-tune doesn't have to know the sheet music behind what the singer is singing; it's just trying to snap the pitch to the nearest in-tune note. It's the audio equivalent of "Snap to Grid".
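
The analogy translates almost literally into code - a toy version, assuming a plain 12-tone equal-tempered grid referenced to A440 (real auto-tuners let you restrict the grid to a scale and control how hard the snap is):

  import math

  def snap_to_semitone(freq, a4=440.0):
      """Round a frequency to the nearest equal-tempered semitone."""
      n = round(12 * math.log2(freq / a4))    # whole semitones above/below A4
      return a4 * 2 ** (n / 12)

  print(snap_to_semitone(216.0))              # 220.0 -- the flat note gets pulled up to A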


Some older pitch correction devices, like the Lexicon PCM-81, aren't that great at snapping the pitch on their own (or can't do it at all), and need a keyboard playing the correct notes.


A particularly well-known example of live use is the Billy Joel Super Bowl XLI national anthem: http://www.youtube.com/watch?v=G8smRRyoYGc

The main problem with auto-tuning live is that you're just trading everything else - intonation, vibrato, etc. - for pitch control. Exposing that "robot sound" ends up being another way to screw up your performance, so there's no way to cheat: you still need to be an excellent singer to get close to the studio-perfected sound.

(If you're in the studio, of course, you can do overdubs, and you can use the more sophisticated Auto-Tune competitor Melodyne, which uses batch processing to allow smoother and more detailed corrections including pitch, rhythm and timbre, and in the most recent versions, chord manipulations.)


More "live" records are post-produced than you'd think--recording over mistakes or looping choruses where the singer just left it for the audience to sing.


I vaguely recall a story about this that said they were using auto-tune at Paul McCartney's actual concerts. It would not surprise me if that is becoming standard, simply because it is so easy to do nowadays.


An additional subtlety not considered: a good sound engineer is also going to be paying very close attention to relative pitch when dealing with a professional ensemble, and less attention to absolute pitch.


I used the Antares plugin back in 2000, and I would usually do it by hand for a final take. Occasionally I would use it in automatic mode, but with limited parameters so it wouldn't sound unnatural.

It is surprising that this stuff has only recently(ish) come to mainstream attention - the technology has been around for a long time and was occasionally used to good effect.


> Notice there is no retune speed adjustment knob.

It's not as prominent as in the first one (and I'm prepared to believe it's not used hamhandedly by the skilled engineers), but it's there: bottom row, slightly right of centre.


From the article, regarding your first point:

But let’s assume that the Manhattan Transfer is trying to hide the use of Auto-Tune, in which case their recording engineer would presumably use a retune speed that approximates a “natural” value.


Retune speed is not a very relevant control when you aren't using auto-tune (proper name or generic) in fire-and-forget mode. If they're trying to cover up its use, they aren't using it in fire-and-forget mode. They're using it in graph mode, or they're using Melodyne, which has no retune speed adjustment.

Fire and forget mode: http://www.antarestech.com/images/ATEvo_Auto_mode.jpg

Notice the prominent 'retune speed' adjustment knob.

And here is the way everyone uses auto-tune these days, when you aren't going for the robot hiphop/Cher effect: http://www.antarestech.com/images/ATEvo_Graphic_full.jpg

Notice there is no retune speed knob in the main panel at all. You control the pitches by adjusting them directly in the graph.

[I've edited my original post above to include some of this information.]


Anyone got a clip? I kind of doubt auto-tune is being used, but I wouldn't be surprised if they were using something like a flanger or a phase-shift effect.

Both kinds of effects tend to get lumped in with the "auto-tune" sound.


Listen to the samples on Amazon. The first track's clip, in particular, sounds like what the negative reviewer is complaining about.

http://www.amazon.com/gp/dmusic/media/sample.m3u/ref=dm_mu_d...

Maybe it's being used for effect?


try the sample of "500 miles high" here: http://www.amazon.com/Chick-Corea-Songbook-Manhattan-Transfe...

specifically the first two notes of the female lead (some-day) in the audio clip - some suspicious processing going on there.


I'm pretty sure that the "weird sound" we hear is the reverb's early reflections. A reverb processor introduces a slightly delayed and damped set of reflections (10 to 250 ms late), so what we hear is the first note's attack, slightly lower than the sustained note (the singer starts a bit low and corrects herself along the way), and that's what gives that effect.
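
For anyone who hasn't played with a reverb: early reflections are just delayed, quieter copies of the dry signal piled on top of it. A minimal sketch, with made-up delays and gains in that 10-250 ms range:

  import numpy as np

  fs = 44100
  dry = np.random.randn(fs)                       # one second of stand-in "voice"

  # Each early reflection is a delayed, attenuated copy of the dry signal.
  reflections = [(0.013, 0.6), (0.041, 0.4), (0.087, 0.25)]   # (delay s, gain)

  wet = dry.copy()
  for delay, gain in reflections:
      d = int(delay * fs)
      wet[d:] += gain * dry[:-d]

  # The onset of a note therefore arrives "clean" first, with the slightly
  # late, quieter copies piling on afterwards.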

I was a sound engineer until 1996, and I know that auto-tune definitely uses some formant-aware phase shifting, so a slight pitch correction (up to a couple of commas) should be essentially inaudible if properly used. Heck, back in the 1980s, when we only had the Eventide H3000 to save a poor performance, we could make a convincing approximation of the auto-tune effect in post-production, and when the DigiTech Vocalist MV-5 became available it did a really decent job of real-time pitch correction - and that was back in 1995. Do you really think that a state-of-the-art FX module used by a competent sound engineer in 2010 would be audible? No way, sir.


There's a whole bunch of interesting comments on this story, but nobody seems to address the central issue: is it probable that the use of auto-tune on an album by 'vocal jazz icons' would be detectable by an acute, skilled, educated-in-the-subject listener, assuming it was intended to be used in a subtle way?


I know that on my amateur jazz band's album we used Auto-Tune. We're not ashamed of it, either. Do we have time to practice the tunes 100 times or try a dozen takes? No way!

It does feel sloppy if professionals are using Auto-Tune, especially in the genre of jazz, where talent is still very much front and center.


More grist from a review on Amazon: http://www.amazon.com/review/R15OHRG7XHGJK7/ref=cm_cr_dp_cmt...

Caveat emptor, YMMV, etc.



