Why go that way? I'm no digital signal processing expert, but images (and series thereof, i.e. videos) are 2D signals. What we see is the spatial domain, and analyzing pixel by pixel is naive and won't get you very far.

What you need is to go to the frequency domain. From my own experiments back in university, the most significant image information lies in the lowest frequencies. Cutting off everything above the lowest 10% of frequencies leaves a perfectly comprehensible image with only wavy artifacts around objects. That leaves you plenty of bandwidth to use even if you want to embed info in existing media.
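To make that concrete, here's a minimal sketch of that low-pass experiment, assuming a 2D DCT as the transform (the transform choice and the `keep` fraction are my own picks for illustration):

    import numpy as np
    from scipy.fft import dctn, idctn

    def lowpass_dct(img, keep=0.10):
        """Zero out all DCT coefficients above the lowest `keep` fraction."""
        coeffs = dctn(img, norm="ortho")
        h, w = coeffs.shape
        mask = np.zeros_like(coeffs)
        mask[:int(h * keep), :int(w * keep)] = 1.0  # keep the low-frequency corner
        return idctn(coeffs * mask, norm="ortho")

Run it on any grayscale frame as a float array: with keep=0.10 the result is still clearly recognizable, just softened, with ringing around sharp edges.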

Now you have the full bandwidth to use. Start in the frequency domain, decide on the lowest bandwidth you'll allow, and set the coefficients of the harmonic components. Convert to the spatial domain, upscale, and you've got your video to upload. This should leave you with data encoded in a way that survives compression and resizing; you'll just need to leave some headroom for that.
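A sketch of that encoder, again assuming a DCT; the coefficient band and the `strength` value are hypothetical knobs you'd tune against the target codec:

    from scipy.fft import dctn, idctn

    def embed_bits(frame, bits, strength=8.0):
        """Write bits as the signs of a band of low-frequency coefficients."""
        coeffs = dctn(frame, norm="ortho")
        # Hypothetical band: a diagonal just above DC, low enough to
        # survive compression, high enough not to wreck the image.
        positions = [(i, 10 - i) for i in range(1, 10)]
        for (r, c), bit in zip(positions, bits):
            coeffs[r, c] = strength if bit else -strength
        return idctn(coeffs, norm="ortho")

    def extract_bits(frame, n):
        """Read the bits back from a (possibly re-compressed) frame."""
        coeffs = dctn(frame, norm="ortho")
        positions = [(i, 10 - i) for i in range(1, 10)]
        return [1 if coeffs[r, c] > 0 else 0 for r, c in positions[:n]]

Using signs rather than exact values is deliberate: quantization during re-encoding perturbs coefficient magnitudes, but a coefficient written well away from zero tends to keep its sign.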

You could slap error-correcting codes on top.
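For example, a Reed-Solomon code over the extracted bytes (using the third-party reedsolo package here, just as one option):

    from reedsolo import RSCodec

    rsc = RSCodec(10)  # 10 parity bytes per chunk
    sent = rsc.encode(b"payload to survive re-encoding")
    # Up to 5 corrupted bytes per chunk are recoverable on decode.
    recovered = rsc.decode(sent)[0]  # reedsolo 1.x returns (msg, msg+ecc, errata)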

If you think about it, you should treat video as a channel, like copper wire or radio. We've come quite far transmitting over those media without ML.




We started with that approach: assume the compression is wavelet-based, then purposefully generate wavelets that we know survive the compression process.

For the sake of this discussion, wavelets are pretty much exactly that: a bunch of frequencies where the "least important" ones (according to the algorithm) are cut out.
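In code, that "cut out the least important" step looks something like this (using PyWavelets; the wavelet, level, and keep fraction are arbitrary picks for illustration, not what any real codec uses):

    import numpy as np
    import pywt

    def wavelet_compress(img, keep=0.05):
        """Keep only the largest `keep` fraction of wavelet coefficients."""
        coeffs = pywt.wavedec2(img, "haar", level=3)
        arr, slices = pywt.coeffs_to_array(coeffs)
        cutoff = np.quantile(np.abs(arr), 1.0 - keep)
        arr[np.abs(arr) < cutoff] = 0.0  # drop the "least important" ones
        rebuilt = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
        return pywt.waverec2(rebuilt, "haar")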

But that's pretty cool; it seems like you've re-invented JPEG without knowing it, so your understanding is solid!



