Keep in mind there are 2 axes.

In the time domain, you're absolutely right, Nyquist–Shannon proves that we don't benefit from sampling higher than twice the upper bound of human hearing. Rates like 96kHz or 192kHz are useless for quality (although in specific live situations they have utility, in that the latency would be half or a quarter that of typical rates, for a given buffer size in samples).

In the amplitude domain, CD quality (16 bits/sample) is pretty damn great for typical mastered audio (~96dB undithered, ~120dB dithered). But if the audio is unusually dynamic (i.e., loudness is not maximized), as with a symphony orchestra, you might actually turn up your sound system to the point where quantization noise becomes evident in the quietest parts, at which point you absolutely would benefit from more bits per sample (say, 24, yielding ~144dB of range undithered).
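To put rough numbers on those figures: each bit doubles the number of quantization levels, which works out to about 6.02dB of dynamic range per bit. A quick sketch (the function name is mine, not from any library):

```python
import math

def dynamic_range_db(bits: int) -> float:
    # Each bit doubles the quantization levels, adding
    # 20 * log10(2) ≈ 6.02 dB of undithered dynamic range.
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16), 1))  # ~96.3 dB (CD quality)
print(round(dynamic_range_db(24), 1))  # ~144.5 dB
```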

So in practice, you're right, CDs are fine; but "humans can't detect beyond that" only truly applies in the frequency domain. We can detect latency (not applicable to playback of recordings) and we can detect quantization noise when cranking up the volume knob.




Right, but of course on most recordings (including Neil's own), that seems very unlikely to be an issue.

More significantly, though, the Pono format took about 30 times as much data as lossless CD quality, while what you're describing is more like 1.5 times.

https://www.mouser.com/applications/high-resolution-audio/


Meta: I have no idea why someone downvoted you, and gave you an upvote because your explanation is accurate, and unusually well-written, IMHO. Although I'm curious what you mean about "detecting latency". You mean between two sources, for positioning? I'm not sure what the relevance of that is here.


Thanks. I love the stuff!

> You mean between two sources, for positioning? I'm not sure the relevance of that here.

It's only really meaningful in scenarios like a recording studio where the performer is listening to themselves live, in headphones, with the signal undergoing A/D and D/A conversion and some processing in between. It could mean 2ms instead of 8ms, which can be the difference between distracting and not. Not relevant here (I admitted "not applicable"), but just an example of where some utility does exist; that utility is definitely not audible frequency response.


>latency would be half or quarter that of typical rates, for a given buffer size in samples

Would reducing the buffer size be another way to solve this?


Yes, but a buffer of some particular (possibly noticeable) size is a necessary evil to avoid underrun glitches.


Is it the size of the buffer in samples, or in milliseconds, that matters?


In my experience, samples, although I don't know exactly why. But it's why you can get lower latency glitch-free audio with higher sample rates.
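The arithmetic behind that: if the buffer size (in samples) is what the hardware/driver fixes, then the wall-clock time to fill it shrinks as the sample rate rises. A quick sketch (function name is mine; 256 samples is just an example buffer size):

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    # Time to fill a fixed-size buffer: samples / (samples per second),
    # converted to milliseconds.
    return buffer_samples / sample_rate_hz * 1000

for rate in (48_000, 96_000, 192_000):
    print(rate, round(buffer_latency_ms(256, rate), 2), "ms")
# A 256-sample buffer: ~5.33 ms at 48 kHz, ~2.67 ms at 96 kHz, ~1.33 ms at 192 kHz
```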



