> No tricks like decoding a smaller image from a JPEG Given that most cameras ar...

jpap · on July 7, 2017

They would likely get another big speedup by doing this. The iDCT gets faster when you perform a "DCT downscaling" operation because it requires fewer adds and multiplies [1].
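To make the idea concrete, here is a minimal 1-D sketch of DCT downscaling in plain Python. The helper names (`dct`, `idct`, `dct_downscale`) are made up for illustration and this is not how libjpeg implements it: real decoders use fast factored integer transforms on 8x8 blocks, while this uses the direct O(N²) orthonormal formulas for clarity. It assumes you already have the 8 coefficients of a block and want 4 output samples.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence (direct formula, for illustration)."""
    n = len(x)
    out = []
    for k in range(n):
        a = math.sqrt((1 if k == 0 else 2) / n)
        out.append(a * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                           for i in range(n)))
    return out

def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
                       * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                       for k in range(n)))
    return out

def dct_downscale(coeffs, m):
    """Rebuild m samples from the m lowest-frequency coefficients of a block.

    The sqrt(m / N) factor keeps the average brightness unchanged.
    """
    scale = math.sqrt(m / len(coeffs))
    return [scale * v for v in idct(coeffs[:m])]

row = [10, 12, 14, 16, 18, 20, 22, 24]   # one row of an 8-sample block
half = dct_downscale(dct(row), 4)        # 4 samples, same mean as the input
```

In this naive form the 4-point inverse needs 16 multiplies per row instead of 64, which is where the savings come from. A 2-D block applies the same step along rows and then columns.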

You could probably go for another speedup, independently of DCT downscaling, by operating in YCbCr before a colorspace conversion to RGB. For example, for 4:2:0 encoded content (a majority of JPEG photographs), each chroma plane has only a quarter as many samples as the luma plane, so you end up processing 50% fewer samples overall than you would in RGB.
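A quick back-of-the-envelope check of the sample counts, as a sketch. The function names are hypothetical, and it ignores the padding to whole MCUs that a real JPEG decoder works with.

```python
def plane_sizes(width, height, h_sub=2, v_sub=2):
    """Return (luma_dims, chroma_dims); 4:2:0 is h_sub=2, v_sub=2.

    Chroma dimensions are rounded up, ignoring MCU padding.
    """
    chroma_w = -(-width // h_sub)    # ceiling division
    chroma_h = -(-height // v_sub)
    return (width, height), (chroma_w, chroma_h)

def sample_counts(width, height, h_sub=2, v_sub=2):
    """Return (samples in Y + Cb + Cr, samples in full-resolution RGB)."""
    (yw, yh), (cw, ch) = plane_sizes(width, height, h_sub, v_sub)
    ycbcr = yw * yh + 2 * cw * ch
    rgb = 3 * width * height
    return ycbcr, rgb

ycbcr, rgb = sample_counts(4000, 3000)   # a 12-megapixel photo
ratio = ycbcr / rgb                      # 0.5 for 4:2:0
```

For a 4000x3000 photo that is 18 million samples in YCbCr versus 36 million after conversion to RGB.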

When you combine both techniques, you can have your cake and eat it too: for example, to downsample 4:2:0 content by 50% in each dimension, you can do a DCT downscale on only the Y plane and keep the CbCr planes as they are, since they are already at half resolution, before colorspace conversion to RGB. No Lanczos required!
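Roughly, the final step would look like the sketch below: the Y plane has already been DCT-downscaled to half size, the chroma planes are used untouched, and the three planes now line up one-to-one for the RGB conversion. The function names are hypothetical; the sketch assumes planes stored as lists of rows with matching dimensions, uses the usual JFIF full-range conversion constants, and ignores chroma siting.

```python
def clamp8(v):
    """Round and clamp to the 0..255 range."""
    return max(0, min(255, int(round(v))))

def ycbcr_to_rgb(y, cb, cr):
    """JFIF full-range YCbCr -> RGB for a single pixel."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)

def merge_half_size(y_half, cb, cr):
    """Combine a half-size Y plane with the original 4:2:0 chroma planes.

    No chroma upsampling and no resampling filter: the planes already match.
    """
    if not (len(y_half) == len(cb) == len(cr)):
        raise ValueError("plane heights differ")
    rgb = []
    for y_row, cb_row, cr_row in zip(y_half, cb, cr):
        if not (len(y_row) == len(cb_row) == len(cr_row)):
            raise ValueError("plane widths differ")
        rgb.append([ycbcr_to_rgb(y, b, r)
                    for y, b, r in zip(y_row, cb_row, cr_row)])
    return rgb

# One row, two pixels: mid-gray and (nearly) pure red.
pixels = merge_half_size([[128, 76]], [[128, 85]], [[128, 255]])
```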

If you need a downsample other than {1/n; n = 2,4,8}, you can pick the largest such n that still leaves the decoded image at or above the target size, then perform a Lanczos resample to the final resolution: the resampling filter will be operating on a lot less data.
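Picking the factor could be as simple as the sketch below. `pick_dct_scale` is a hypothetical helper, and it assumes the decoder rounds scaled dimensions up, which may differ between libraries.

```python
def pick_dct_scale(src_w, src_h, dst_w, dst_h, factors=(8, 4, 2, 1)):
    """Return (n, decoded_size) for the largest 1/n decode that still
    leaves the image at or above the target size in both dimensions."""
    for n in factors:                 # try the most aggressive factor first
        w = -(-src_w // n)            # ceiling division
        h = -(-src_h // n)
        if w >= dst_w and h >= dst_h:
            return n, (w, h)
    return 1, (src_w, src_h)          # target is larger than the source

# 4000x3000 source, 600x450 target: 1/8 would give 500x375 (too small),
# so decode at 1/4 and let the resampling filter go from 1000x750 to 600x450.
n, decoded = pick_dct_scale(4000, 3000, 600, 450)
```

Here the Lanczos pass touches 0.75 million pixels instead of 12 million.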

On quality, I once saw a comparison roughly equating DCT downscaling to bilinear (if I can find the reference I'll update this comment). With the example above, it really depends on how you compare: if you compare to a 4:2:0 image decoded to RGB, where the chroma is first pixel-doubled or bicubic-upsampled before conversion to RGB and then downsampled, it might be that the above Lanczos-free technique will look just as good because it didn't modify the chroma at all. Ultimately it's best to try and compare.

Lastly you could leverage both SIMD and multicore by processing each of the Y, Cb, and/or Cr planes in parallel.

[1] http://jpegclub.org/djpeg/

jacobolus · on July 7, 2017

That’s a shortcut if you only ever have to downsample by powers of two and you don’t mind worse image quality, since your down-sampled picture won’t use any data from across block boundaries.

mark-r · on July 7, 2017

You can use it in a multi-step process. Use JPEG blocks to get slightly above the target size, then Lanczos to finish it off.