> I didn't really grok why we use sines and cosines.
As has been pointed out elsewhere in this thread, they have nice mathematical properties. But another important thing is that they typically work "well enough" for applications. Consider audio. Tones clearly have frequency, but they also have a position in time. Doing a sine-cosine decomposition of a whole song doesn't really make sense, since it has no way of saying that a tone on the piano is played at a given time.
So you would think that it would make sense to break the signal down into stuff with frequency and time. Some kind of wavelet probably. Maybe something that very accurately models what a human hears.
The thing is that chopping the audio stream up into windows, and decomposing those windows into sines and cosines, while a bit ad hoc, just works well enough.
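A minimal sketch of that windowing idea, essentially a short-time Fourier transform; the frame length, hop size, and Hann window below are illustrative choices, not anything canonical:

```python
import numpy as np

def stft_frames(signal, frame_len=1024, hop=512):
    """Chop a 1-D signal into overlapping windows and take the DFT
    of each one -- a sine/cosine decomposition per time slot."""
    window = np.hanning(frame_len)  # taper to reduce edge artifacts
    n_frames = 1 + (len(signal) - frame_len) // hop
    spectra = np.empty((n_frames, frame_len // 2 + 1), dtype=complex)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        spectra[i] = np.fft.rfft(frame * window)  # one spectrum per window
    return spectra

# A 440 Hz tone that starts halfway through shows up only in the
# later frames -- frequency *and* time, unlike a whole-song DFT.
sr = 16000
t = np.arange(sr) / sr
x = np.where(t < 0.5, 0.0, np.sin(2 * np.pi * 440 * t))
S = stft_frames(x)
print(np.abs(S).argmax(axis=1)[:5])   # early frames: silence
print(np.abs(S).argmax(axis=1)[-5:])  # late frames: peak near bin 28
```

Each row of the result is a spectrum for one time slot, which is exactly the "tone at a given time" information the whole-song decomposition can't express.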
> So you would think that it would make sense to break the signal down into stuff with frequency and time. Some kind of wavelet probably. Maybe something that very accurately models what a human hears.
Interestingly, wavelet-based compression went nowhere: although wavelets have nice mathematical properties, when applied in a lossy compression scheme they did not fit well with how humans perceive detail and quality, both psychoacoustically and psychovisually, i.e. PSNR and subjective quality diverged more than with other systems. Not surprisingly, none of the state-of-the-art lossy compression algorithms use wavelets.
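For context, PSNR is a purely numerical distortion score computed from mean squared error, which is exactly why it can diverge from what people actually perceive. A minimal sketch of the standard definition, assuming 8-bit samples (peak value 255):

```python
import numpy as np

def psnr(reference, degraded, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((reference.astype(float) - degraded.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * np.log10(peak ** 2 / mse)
```

Two distortions with the same MSE get the same PSNR regardless of whether the error lands somewhere the eye or ear is sensitive to, so a codec can win on PSNR and still look or sound worse.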
Yes, that didn't help, of course. JPEG 2000 was a bit more efficient than JPEG by virtue of being a newer, more computationally intensive format, not because of wavelets. A modern format like the H.265-derived HEIF still uses DCT/DST-based transforms. For video the situation is worse, as AFAIK no one has been able to come up with a decent wavelet-based motion estimation algorithm.
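As a rough illustration of the block-transform approach those formats are built on (real codecs add integer approximations, multiple block sizes, prediction, and entropy coding, none of which is shown here), a toy 8x8 DCT round trip with SciPy:

```python
import numpy as np
from scipy.fft import dctn, idctn

# Stand-in 8x8 pixel block: a smooth ramp, the kind of content
# where the DCT compacts energy into a few low-frequency coefficients.
block = np.arange(64, dtype=float).reshape(8, 8)

coeffs = dctn(block, norm="ortho")        # forward 2-D DCT
coeffs[np.abs(coeffs) < 1.0] = 0.0        # crude stand-in for quantization
approx = idctn(coeffs, norm="ortho")      # inverse transform

print(np.count_nonzero(coeffs))           # few coefficients survive
print(np.max(np.abs(approx - block)))     # yet the block is nearly intact
```

Dropping most of the coefficients while staying close to the original block is the whole point of the transform step; the lossy part is choosing which coefficients to keep.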