Hacker News

Only slightly OT:

ELI5: Why is a typical phone call today less intelligible than an 8 kHz, 8-bit μ-law call with ADPCM from the '90s was?

[edit]

s/sound worse/less intelligible/




Depends on your call; u-law has poor frequency response and reasonable dynamic range. Not great for music, but OK enough for voice, and it's very consistent. 90s calls were almost all circuit switched in the last mile, and multiplexed per sample on digital lines (T1 and up). This means very low latency and zero jitter; there would be a measurable but practically imperceptible delay versus an end-to-end analog circuit-switched call, but digital sampling near the ends means there would be a lot less noise. Circuit switching also means you'd never get dropped samples --- the connection is made or it's not, although sometimes only one-way.
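For the curious: μ-law companding is essentially a logarithm applied before quantization, so quiet samples get a disproportionate share of the 8-bit code space. A toy sketch of the continuous form (real G.711 codecs use a segmented piecewise-linear approximation of this curve, not this exact math):

```python
import math

MU = 255  # standard North American mu-law constant

def mulaw_encode(x):
    """Compress a linear sample in [-1, 1] onto a log scale in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_decode(y):
    """Invert the companding curve back to a linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# A sample at 1% of full scale still lands around 23% of the way up the
# code range -- that's where the "reasonable dynamic range" comes from.
quiet = mulaw_encode(0.01)
```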

Modern calls typically use 20 ms frames over packet-switched networks, so you're adding packetization delay, plus jitter and jitter buffers. The codecs themselves have encode/decode delay, because they're doing more than an ADC/DAC with a logarithm. Most of the codecs use significantly fewer bits per sample than u-law, and that's not free either.
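As a back-of-the-envelope illustration of how those terms stack up (all numbers here are made-up-but-plausible, not from any specific deployment):

```python
# Rough one-way latency budget for a packet-switched voice call.
packetization_ms = 20   # one full frame must be captured before it can be sent
codec_lookahead_ms = 5  # algorithmic delay inside the codec (varies per codec)
jitter_buffer_ms = 40   # receiver-side buffer to absorb arrival-time variance
network_ms = 30         # transit time, varies widely with path and congestion

total_ms = packetization_ms + codec_lookahead_ms + jitter_buffer_ms + network_ms
print(total_ms)  # 95 ms one way, vs. sub-millisecond switching on a TDM circuit
```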

HD Voice (g.722.2 AMR-Wide Band) has a much larger frequency pass band, and sounds much better than GSM or OPUS or most of these other low bandwidth codecs. There's still delay though; even if people will tell you 20-100ms delay is imperceptible, give someone an a/b call with 0 and 20 ms delay and they'll tell you the 0 ms delay call is better.


> and multiplexed per sample on digital lines (T1 and up).

Technically even on ISDN because you had channel bonding there. Although it's all still circuit switched. The timeslot within the channel group is reserved entirely for your use at a fixed bandwidth and has full setup and tear down that is coordinated out of band.
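For reference, the arithmetic behind that fixed per-timeslot bandwidth, using the standard DS1/T1 numbers:

```python
# DS1/T1: each voice channel owns a fixed 64 kbps timeslot, no sharing.
channels = 24
bits_per_sample = 8
sample_rate = 8000  # Hz -- enough for the ~300-3400 Hz telephone voice band

per_channel_bps = bits_per_sample * sample_rate      # 64,000 bps per timeslot
payload_bps = channels * per_channel_bps             # 1,536,000 bps
framing_bps = 8000                                   # 1 framing bit per 193-bit frame
line_rate_bps = payload_bps + framing_bps            # 1,544,000 = the famous 1.544 Mbps
```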


> HD Voice (g.722.2 AMR-Wide Band) has a much larger frequency pass band, and sounds much better than GSM or OPUS or most of these other low bandwidth codecs.

At what bitrate, for the comparison to Opus?

And is this Opus using LACE/NoLACE as introduced in version 1.5?

...and is Meta using it in their comparison? It makes a huge difference.


That is what I was wondering. They don't share what version of Opus they are comparing against, and that new version was a huge step forward.


Yeah, I probably shouldn't have included Opus; I'm past the edit window or I'd remove it with a note. I haven't done enough comparison with Opus to really declare that part, and I don't think the circumstances were even. But I'm guessing the good HD Voice calls are at full bandwidth of ~24 kbps, and I'm comparing with a product that was said to be using Opus at 20 kbps. Opus at 32 kbps sounds pretty reasonable. And carrier-supported HD Voice probably has prioritization and other things going on that mean less loss and probably less jitter. Really the big issue my ear has with Opus is when there's loss.

I don't think I've been on calls with Opus 1.5 with LACE/NoLACE, released 3 months ago, so no, I haven't compared it with the HD Voice that my carrier deployed a decade ago. Seems a reasonable thing for Meta to test with, but it might be too new to be included in their comparison as well.


> Really the big issue my ear has with Opus is when there's loss.

That would definitely complicate things. Going by the test results that got cited on Wikipedia, Opus has an advantage at 20-24 kbps, but that's easy enough to overwhelm.

And the Opus encoder got some other major improvements up through 2018, so I'd be interested in updated charts.

Oh and 1.5 also adds a better packet loss mechanism.


Could you explain a little more, in a more ELI5 way, please?


Old way: two cups and string.

New way: Chinese whispers.


Phone ear speakers are quieter than they used to be, so if the other person isn't talking clearly into the mic, you can't crank it up. I switched from a flip phone to an iPhone in 2013, huge difference. I had to immediately switch to using earbuds or speakerphone. Was in my teens at the time.


Hearing ability deteriorates with age.


Yes, but it doesn't deteriorate in such a way as to cause someone speaking to sound like gibberish and/or random medium-frequency tones, which happens in nearly every single cell phone conversation I have that lasts more than 5 minutes.

My experience is that phone calls nowadays alternate between a much wider-band (and thus often better sounding) experience and "WTF was that just now?"


packet switching drops packets, circuit switching drops attempted calls (all circuits are busy)

most 90s calls didn't use adpcm, just pcm. assuming you got confused there

they also didn't use radio; solid wire from your microphone to my receiver. radio (cellphones, wifi, cordless phones) also is inherently unreliable

old phones had sidetone, but many voip apps don't

finally, speakerphone use is widespread now, and it is incompatible with sidetone and adds a lot of audio multipath fading
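Sidetone, for anyone unfamiliar, is just mixing a little of your own mic back into your earpiece so the line doesn't feel dead and you don't shout. A toy sketch (the gain value here is invented for illustration):

```python
def with_sidetone(mic_samples, incoming_samples, gain=0.2):
    """Mix an attenuated copy of the local mic into the earpiece feed."""
    return [inc + gain * mic
            for inc, mic in zip(incoming_samples, mic_samples)]

# With nothing incoming, you still faintly hear yourself speak.
earpiece = with_sidetone([1.0, 0.5], [0.0, 0.0])
```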


Does the decrease in intelligibility correlate with how many concerts people spent seated in front of the loudspeakers back in the aughts?



