I have trained GANs on raw JPEG coefficients with moderate success as a pet project. The idea: read the raw DCT coefficients without decompressing and train in that space, then JPEG-decompress the output of the net to reconstruct an image in pixel space. There are a few papers doing similar things for supervised learning tasks, iirc.
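For readers unfamiliar with what "JPEG coefficients" are, here is a minimal sketch of the 8x8 block DCT that JPEG applies, plus its inverse (the "decompress" step). It is not the commenter's code: the function names are made up for illustration, it uses only the Python standard library, and it skips the quantization, zigzag ordering, and entropy coding that a real JPEG codec also performs.

```python
import math

N = 8  # JPEG transforms the image in 8x8 blocks


def _scale(k):
    # Orthonormal DCT-II scaling factor for frequency index k.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct_1d(v):
    # Forward DCT-II of a length-8 sequence.
    return [
        _scale(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
        for k in range(N)
    ]


def idct_1d(c):
    # Inverse transform (DCT-III with the same scaling).
    return [
        sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for k in range(N))
        for n in range(N)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct_2d(block):
    # Separable 2D DCT: transform rows, then columns.
    rows = [dct_1d(r) for r in block]
    return transpose([dct_1d(c) for c in transpose(rows)])


def idct_2d(coeffs):
    # Separable 2D inverse: back from coefficients to pixel values.
    rows = [idct_1d(r) for r in coeffs]
    return transpose([idct_1d(c) for c in transpose(rows)])


# A synthetic, level-shifted 8x8 block (hypothetical pixel values minus 128).
block = [[float((r * 8 + c) * 3 % 256 - 128) for c in range(N)] for r in range(N)]
coeffs = dct_2d(block)          # the space a network would train in
restored = idct_2d(coeffs)      # back to pixel space
roundtrip_error = max(
    abs(block[r][c] - restored[r][c]) for r in range(N) for c in range(N)
)

# A flat block puts all of its energy in the DC coefficient.
flat_coeffs = dct_2d([[10.0] * N for _ in range(N)])
```

Without quantization the round trip is exact up to floating-point error; the lossy part of JPEG comes from rounding the coefficients, which this sketch leaves out.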



Using a standard loss function like MSE?

Yeah, it kinda works when you feed JPEG coefficients into a typical spatial-domain CNN, but mathematically it seems that if you're using frequencies as inputs, your convolutions should become simple multiplications. Am I wrong?
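The intuition here is the convolution theorem: for the discrete Fourier transform, circular convolution in the signal domain equals element-wise multiplication in the frequency domain. A small stdlib-only sketch (1D, with made-up signal and kernel values) checks this numerically. Note the assumption: this uses a full-length DFT, whereas JPEG uses a blockwise 8x8 DCT, so the correspondence for actual JPEG coefficients is less direct than this toy case suggests.

```python
import cmath


def dft(x):
    # Naive O(n^2) discrete Fourier transform.
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(X):
    # Naive inverse DFT.
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def circular_convolve(x, h):
    # Direct circular convolution: indices wrap around modulo n.
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


def convolve_via_dft(x, h):
    # Multiply the spectra element-wise, then transform back.
    product = [a * b for a, b in zip(dft(x), dft(h))]
    return [v.real for v in idft(product)]


signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]
kernel = [0.25, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]  # zero-padded to signal length

direct = circular_convolve(signal, kernel)
spectral = convolve_via_dft(signal, kernel)
max_abs_diff = max(abs(a - b) for a, b in zip(direct, spectral))
```

The two results agree up to floating-point error, which is why a learned convolution could in principle be replaced by a learned per-frequency multiplication when the input is already a spectrum.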


Yep, you can successfully do it without convolutions. Here is a pointer if you want to dig deeper: https://eng.uber.com/neural-networks-jpeg/ (there's prior art to that, but this one is well written)



