Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: Resize images using the FFT (code.google.com)
63 points by eliteraspberrie on Nov 6, 2013 | hide | past | favorite | 35 comments



ImageMagick can do this too:

  convert -filter Sinc -resize 200% input.png output.png
However, as pointed out by other commenters, sinc by itself isn't the greatest support function for enlargement due to the ringing artifacts [0]. Which are pretty visible in your example image [1].

ImageMagick has way more documentation on the choice of resize filter than you'll ever need [2]. But it's interesting reading nonetheless; they settled on different default filters for shrinking (Lanczos) and enlargement (Mitchell).

[0] http://www.imagemagick.org/Usage/filter/#ringing

[1] http://i.imgur.com/JDPvHjf.png

[2] http://www.imagemagick.org/Usage/filter/


If anyone is curious, the algorithm is: 1) take the FFT of the image; 2) add zeros after the highest frequencies (remember to shift the data so the zero frequency is centred); and 3) inverse the shift and take the inverse FFT. This is called "zero-padding in the frequency domain."


You want to be using a DCT rather than a DFT most of the time with content like this, in order to avoid those boundary artifacts. Compare with the example on the page http://i.imgur.com/wOVMwxx.png

Source: I once maintained a small C library to do this, but eventually gave it up when it became clear how similar it was to the many existing sinc resampling implementations out there. That said, never stop experimenting.


Very neat. You may want to do something more to the edges (mirror, for instance?) to avoid high-frequency ringing artifacts there. Remember that the Fourier transform describes a repeating signal, so the borders (out into space) appear as impulses with lots of high frequency components. If you mirror, it should clean up the noise that's particularly prominent on the top and right.


Great idea. I'll definitely implement mirroring.


This approach will inherently have the ringing; FFT is really not meant for local features. STFT may be a more interesting approach.

Anyhow, I ran something like the following in Matlab for testing upsampling 4 pixels to 8 pixels:

  fft_mx = fft( eye(4) ); #forward FFT transform matrix
  fft_mx = padarray( fft_mx, [2 0], 0, 'both' ); #pads result vector with zeroes
  fft_iv = ifft( eye(8) ); #inverse FFT matrix
  abs( fft_iv * fft_mx ) * 2
The result is:

    1.0000         0         0         0
    0.6533    0.6533    0.2706    0.2706
         0    1.0000         0         0
    0.2706    0.6533    0.6533    0.2706
         0         0    1.0000         0
    0.2706    0.2706    0.6533    0.6533
         0         0         0    1.0000
    0.6533    0.2706    0.2706    0.6533
Basically, your idea (which boils down to an FIR scaler) samples the entire image when resampling. That isn't a very good behavior, IMHO.

EDIT: fixed math


is this the same as sinc-interpolation? (effectively; that happens in "real" space, while this seems to be in frequency space)


As far as I can tell from the code, this is exactly the same. The approach, bandlimited interpolation, is executed in the frequency domain here and in the space domain in sinc interpolation.

Bandlimited interpolation is "correct" in some technical sense for sampled signals and images, making some assumptions about the sampling procedure. That doesn't imply it's perceptually optimal, though. If your goal is "best approximation of what I sampled given the available data", then bandlimited interpolation is good. If your goal is "image that looks good", it probably isn't.

I'm not crapping on this contribution, though, it's always cool to see more signals stuff open sourced.


To be more clear: a sinc reconstruction is correct in the sense that if the original signal was bandlimited before being sampled, the sinc reconstruction will perfectly reproduce the bandlimited input signal. All other reconstructions can produce aliasing (how much and whether it is objectionable obviously depends on the input signal). In cases where the original signal was not bandlimited concepts like "perceptually optimal" come into play and a different reconstruction filter may be preferable.


Sinc interpolation is usually also windowed with a dampening factor on the tails of your filter. I think the ringing on the left edge of the image shows the lack of windowing.


Yes, I think so, although it's been a couple years since I recall the proof.



Interesting, it looks like the same algorithm, but does it work with 2D data (images)?

Edit: It looks like that function only works with one-dimensional data (along a single axis): https://github.com/scipy/scipy/blob/v0.13.0/scipy/signal/sig...


I love that you can see the edge effects from iFFT(FFT(X)), its always interesting when you can see visual evidence of some underlying math. Anyways, I wonder if you could use a wavelet transform to reduce this.


Nice, but why not add a readme saying how to install the dependencies? Or maybe they are easy to install, I'll give it a try.

Also it would be nice to include all the docs (hah, minimal as they are) in the repo.


Oops, I'll add documentation and upload it all to PyPI today.

The dependencies are: Python 2, NumPy and matplotlib. The script can't really be installed at the moment, use it like so:

    git clone https://code.google.com/p/fftresize
    cd fftresize
    python fftresize.py /path/to/image.jpg 2.0
Edit: It's on PyPI now! https://pypi.python.org/pypi/FFTresize


Interesting. But if I understand correctly, this will not retain images sharpness? It's not easy to see with just a 2x increase.


Yes, the sharpness does decrease with the interpolation. I think the phenomenon is called "overshoot" and I don't know how to mitigate that.

Here is the image interpolated 4X: http://imgur.com/a/8YrxG


It's actually called "ringing" and can easily be explained by the fact that you convolved with a box-function in the frequency space (ideal low pass), which will transform into a sinc function in the local space.

http://www.thefouriertransform.com/pairs/box.php


I think you've convoluted the terminology. It's "convolve".


what? :)


You can actually see it if you look closely -- there is a kind of striping of light/dark horizontal lines, easy to spot at the top & bottom of the image.


also known as the gibbs phenomenon.


Thanks. I'll add a window function and see if that helps.


Try it with a checkerboard pattern--somewhat the pathological case.

Gray cats are somewhat smooth from a signal standpoint.


Good idea. Here's the result: http://imgur.com/a/EHNH0

Edit: I'll be adding a window function soon to attenuate the ringing, it'll be interesting to see the difference that makes.


you can also try to apply a smoothening function: Usually multiplying by a gaussian in phase space will do, but this will cause your image to be blurred. What would be really impressive is if you could interpolate a pattern in the high frequency signal, and use it to "infer" values for your high frequency padding that "naturally cancel out" ringing to below the resolution of the imag.e


A beautiful example of ringing. :)


I'm surprised that the ringing isn't that bad even without a window function.


Google doesn't let me access this without cookies enabled....


Oops. You can clone the git repository:

    git clone https://code.google.com/p/fftresize/


enhance?


No, no no, it's all in the confidence. The algorithm only works if you firmly declare, "Zoom! Enhance!"


Uncrop.


enhance




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: