
So streaming services can save money on bandwidth



That's absurd. I think everybody is aware that it is far better to compress in the frequency domain than to downsample your image. If you don't believe me, just compare a JPEG-compressed image with the same image compressed to the same file size by downsampling. You will notice a night-and-day difference.

Downsampling is a bad way to do compression. It makes no sense to do NN reconstruction on a downsampled image when you could have compressed the image better and reconstructed from that data instead.


An image downscaled and then upscaled back to its original size is effectively low-pass filtered; the degree of edge preservation is dictated by the resampling kernel used in each step.
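
A quick way to see this for yourself, with Pillow (the filename and scale factor here are arbitrary):

    from PIL import Image

    # Downscale then upscale back to the original size; the resampling
    # kernel in each step controls how much edge detail survives.
    img = Image.open("frame.png")  # any source image
    w, h = img.size
    small = img.resize((w // 4, h // 4), Image.LANCZOS)
    roundtrip = small.resize((w, h), Image.LANCZOS)
    roundtrip.save("frame_lowpassed.png")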

Are you saying low-pass filtering is bad for compression?


The word is "blur." Low-pass filtering is blurring.

Is blurring good for compression? I don't know what that means. If the image size (not the file size) is held constant, a blurry image and a clear image take up exactly the same amount of space in memory.

Blurring is bad for quality. Our vision is sensitive to high-frequency stuff, and low-pass filtering is by definition the indiscriminate removal of high-frequency information. Most compression schemes are smarter about the information they filter.


> Is blurring good for compression? I don't know what that means.

Consider lossless RLE compression schemes. In this case, would data with low or high variance compress better?

Now consider RLE against sets of DCT coefficients. See where this is going?

In general, having lower variance in your data results in better compression.
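
A toy example in Python, using the number of runs as a stand-in for compressed size:

    from itertools import groupby

    def rle(data):
        # Encode as (value, run_length) pairs.
        return [(v, len(list(g))) for v, g in groupby(data)]

    low_var  = [5, 5, 5, 5, 6, 6, 6, 7, 7, 7, 7, 7]   # smooth/blurred
    high_var = [5, 9, 2, 8, 1, 7, 3, 9, 0, 6, 4, 8]   # noisy/detailed

    print(len(rle(low_var)))   # 3 runs
    print(len(rle(high_var)))  # 12 runs: RLE gains nothing here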

> Our vision is sensitive to high-frequency stuff

Which is exactly why we pick up HF noise so well! Post-processing houses are very often presented with the challenge of choosing just the right filter chain to maximize fidelity under size constraint(s).

> low-pass filtering is by definition the indiscriminate removal of high-frequency information

It's trivial to perform edge detection and build a mask to retain the most visually-meaningful high frequency data.
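
A sketch of that idea with OpenCV (the thresholds and kernel sizes are arbitrary choices):

    import cv2
    import numpy as np

    img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

    # Binary edge map, dilated so the mask covers regions around edges.
    edges = cv2.Canny(img, 100, 200)
    mask = cv2.dilate(edges, np.ones((5, 5), np.uint8)).astype(np.float32) / 255.0

    # Keep original pixels near edges, low-pass filtered pixels elsewhere.
    blurred = cv2.GaussianBlur(img, (9, 9), 0).astype(np.float32)
    out = mask * img.astype(np.float32) + (1.0 - mask) * blurred
    cv2.imwrite("frame_masked.png", out.astype(np.uint8))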


Do you seriously think downsampling is superior to JPEG?


No. I never made this claim. My argument is pedantic.


Are you saying that when Netflix streams a 480p version of a 4K movie to my TV, they do not perform downsampling?


Yes. Downsampling only makes sense if you store per-pixel data, which is obviously a dumb idea. What you get is a 480p stream containing frames that were compressed from the source files (or from the 4K version). At some point downsampling may have been involved, but you never actually receive any of that per-pixel data; you get the compressed version of it.


Not sure if I’m being dumb, or if it’s you not explaining it clearly: if Netflix produced low-resolution frames from high-resolution ones (4K to 480p), and these 480p frames are what my TV is receiving, are you saying it’s not downsampling, and my TV would not benefit from this new upsampling method?


Your TV never receives per-pixel data. Why would you use an NN to enhance the data your TV has decoded instead of enhancing the data it actually receives?


OK, I admit I don’t know much about video compression. So what does my TV receive from Netflix if it’s not pixels? And when my TV does “upsampling” (according to the marketing), what does it do exactly?


It receives information about the spatial frequency content of the image. If you're unfamiliar, it's definitely worth looking into the specifics of how this works, as it's quite impressive! Here are a few relevant Wikipedia articles, and a Computerphile video:

https://en.wikipedia.org/wiki/JPEG#JPEG_codec_example

https://en.wikipedia.org/wiki/Discrete_cosine_transform

https://www.youtube.com/watch?v=Q2aEzeMDHMA
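
If you want to poke at the transform step yourself, here's a minimal SciPy sketch (an 8x8 block of random values stands in for real pixel data):

    import numpy as np
    from scipy.fft import dctn

    block = np.random.randint(0, 256, (8, 8)).astype(np.float64) - 128  # level shift
    coeffs = dctn(block, norm="ortho")

    # The top-left coefficient is the DC (average) term; the rest encode
    # progressively higher spatial frequencies. JPEG gets most of its
    # savings by quantizing the high-frequency corner toward zero.
    print(np.round(coeffs).astype(int))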


I think you're missing the point of this paper: the precise thing it's showing is upscaling previously downscaled video with minimal perceptual difference from ground truth.

So you could downscale, then compress as usual, and then upscale on playback.

It would obviously be quite attractive to ship compressed 480p (or 720p, etc.) footage and blow it up to 4K at high quality on playback. Of course you will have higher quality if you just compress the 4K directly, but the file size will be an order of magnitude larger.
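
In code, the pipeline looks roughly like this (a Pillow sketch; the NN step is stubbed out with plain bicubic, since the actual model is the paper's contribution):

    from PIL import Image

    def nn_upscale(img, factor):
        # Stand-in for a learned super-resolution model.
        w, h = img.size
        return img.resize((w * factor, h * factor), Image.BICUBIC)

    def encode(src_path, out_path, factor=4, quality=80):
        img = Image.open(src_path).convert("RGB")
        w, h = img.size
        small = img.resize((w // factor, h // factor), Image.LANCZOS)
        small.save(out_path, "JPEG", quality=quality)  # ship the small file

    def decode(in_path, factor=4):
        return nn_upscale(Image.open(in_path), factor)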


Why would you not enhance the compressed data?


In our hypothetical example, the compressed 4K data or the compressed 480p data? You would enhance the compressed 480p; that's what the example is. You would probably not enhance the 4K, because there's very little benefit to increasing resolution beyond 4K.


Or low-connectivity scenarios that push more local processing.

I think it's a bit unimaginative to see no use cases for this.


There is no use case, because it is a stupid idea. Downscaling and then reconstructing is a stupid idea for exactly the same reasons that downscaling for compression is a bad idea.

The issue isn't NN reconstruction, but that you are reconstructing the wrong data.


If the NN is part of the codec, you can choose to only downscale the regions that get reconstructed correctly.
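
Something like this per-block decision, where the round-trip callable and PSNR threshold are assumptions:

    import numpy as np

    def psnr(a, b):
        mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
        return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

    def choose_blocks(frame, roundtrip, block=64, threshold_db=38.0):
        # roundtrip(patch) = downscale + NN-upscale back to patch size.
        decisions = {}
        h, w = frame.shape[:2]
        for y in range(0, h, block):
            for x in range(0, w, block):
                patch = frame[y:y + block, x:x + block]
                decisions[(y, x)] = psnr(patch, roundtrip(patch)) >= threshold_db
        return decisions  # True = safe to ship downscaled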


Why would you not let the NN work on the compressed data? That is actually where the information is.


That's like asking why you don't train an LLM on gzipped text. The compressed data is much harder to reason about.


Meh.

I think upscaling framerate would be more useful.


TVs already do this… and it's basically a bad thing



