In the time domain, you're absolutely right: Nyquist–Shannon proves that we don't benefit from sampling at more than twice the upper bound of human hearing. Rates like 96kHz or 192kHz are useless for quality (although in specific live situations they do have utility, in that the latency would be a half or a quarter that of typical rates for a given buffer size in samples).
In the amplitude domain, CD quality (16 bits/sample) is pretty damn great for typical mastered audio (~96dB of dynamic range undithered, ~120dB dithered). But if the audio is unusually dynamic (i.e., loudness is not maximized), like with a symphony orchestra, you might actually turn up your sound system to the point where quantization noise becomes evident in the quietest parts. At that point you absolutely would benefit from more bits per sample (say, 24, yielding ~144dB of range undithered).
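For anyone wondering where the ~96dB and ~144dB figures come from, here's a rough sketch in Python. It uses the usual rule of thumb that the range is the ratio between full scale and one quantization step, so about 6.02dB per bit. The function name `dynamic_range_db` is just something I made up for illustration, and the ~120dB dithered figure is a perceptual number that this simple formula does not cover.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, in dB.

    Assumption: ideal linear PCM with no dither or noise shaping.
    """
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit: ~{dynamic_range_db(bits):.1f} dB")
```

Each extra bit doubles the number of levels, which is why going from 16 to 24 bits adds roughly 48dB.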
So in practice, you're right, CDs are fine; but "humans can't detect beyond that" only truly applies in the frequency domain. We can detect latency (not applicable to playback of recordings) and we can detect quantization noise when cranking up the volume knob.
Meta: I have no idea why someone downvoted you, and I gave you an upvote because your explanation is accurate and unusually well-written, IMHO. Although I'm curious what you mean about "detecting latency". You mean between two sources, for positioning? I'm not sure of the relevance of that here.
> You mean between two sources, for positioning? I'm not sure of the relevance of that here.
It's only really meaningful in scenarios like a recording studio, where the performer is listening to themselves live in headphones and the signal is undergoing A/D and D/A conversion with some processing in between. It could mean 2ms instead of 8ms, which can be the difference between distracting and not. Not relevant here (I admitted "not applicable"), just an example of where some utility does exist. But that utility is definitely not audible frequency response.
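To put numbers on the 2ms vs 8ms thing, here's a back-of-the-envelope sketch in Python. The 384-sample buffer is a hypothetical size I picked because it works out to those figures, and `buffer_latency_ms` is an illustrative name. It only counts the time to fill one buffer; a real round trip adds converter delay and usually more buffers, so treat it as a lower bound.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Time needed to fill one buffer of the given size, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


BUFFER_SAMPLES = 384  # hypothetical buffer size, chosen for illustration

for rate_hz in (48_000, 96_000, 192_000):
    latency = buffer_latency_ms(BUFFER_SAMPLES, rate_hz)
    print(f"{rate_hz} Hz: {latency:.1f} ms per {BUFFER_SAMPLES}-sample buffer")
```

With the buffer size held fixed, doubling the sample rate halves the time per buffer, which is the whole point of the higher rates in a live monitoring setup.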