Rotating Images (datagenetics.com)
355 points by deletes on Aug 17, 2013 | 44 comments



The unmentioned publication behind this method is "A Fast Algorithm for General Raster Rotation" by Alan W. Paeth. A copy can be obtained here:

http://www.cipprs.org/papers/VI/VI1986/pp077-081-Paeth-1986....

Most explanations of the algorithm follow this blog post and present it as a three-pass algorithm, but a careful reading of Paeth's paper shows that it can be implemented as a one-pass algorithm.


Another very good overview of rendering algorithms was given by Steven Wittens in his Making WebGL Dance talk [1, 2]

[1] slides: http://acko.net/files/fullfrontal/fullfrontal/webglmath/onli...

[2] video: http://www.youtube.com/watch?v=GNO_CYUjMK8


This is a fantastic talk, and the slides are great. Thanks for the link!


Can you expand on the one pass algorithm, or provide a link? There was a two pass variant at the end of the paper, but I didn't see any explicit one pass variant.


The solution to making a one pass algorithm from the paper is to realize that each pixel moves a fixed amount. First it moves horizontally, then vertically, then horizontally, to arrive at its final location.

The way the algorithm works, for each pixel, the amount of shear horizontally and vertically can be calculated up front, before "moving" the pixel.

So to make a one pass algorithm, allocate a second image of sufficient size to hold the rotated original (the size of this second image can also be calculated before any pixel "movement"). Then, for each pixel, calculate its horizontal shift, followed by its vertical shift, followed by its final horizontal shift. You now have its destination coordinates, so store it into the second image buffer at the destination coordinates.

The algorithm becomes essentially:

    foreach pixel source_coord do
        dest_coord = transform(source_coord)
        dest_image(dest_coord) = source_image(source_coord)
    done
where

    transform()
performs the Horiz/Vert/Horiz transformation of the input coordinate.
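A minimal Python sketch of this one-pass approach (the function names, rounding convention, and sign conventions are my own assumptions, not Paeth's exact code): the only trigonometry happens once, up front, when computing the two shear factors; after that, each pixel's destination is three integer shears.

```python
import math

def paeth_transform(x, y, angle):
    # Three shears: horizontal, vertical, horizontal.
    # alpha/beta are the only trigonometry -- computed once per rotation.
    alpha = -math.tan(angle / 2.0)
    beta = math.sin(angle)
    x = x + round(alpha * y)   # first horizontal shear
    y = y + round(beta * x)    # vertical shear (uses the sheared x)
    x = x + round(alpha * y)   # second horizontal shear
    return x, y

def rotate_one_pass(src, angle):
    """src: dict mapping (x, y) -> pixel value. One pass over the source,
    writing each pixel straight to its final destination coordinate."""
    return {paeth_transform(x, y, angle): p for (x, y), p in src.items()}
```

For a 90° rotation, for example, `paeth_transform(1, 0, math.pi / 2)` lands on `(0, 1)`, exactly where a counter-clockwise rotation would put it.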


The single-pass version is the first algorithm mentioned in the article, the one that produces "holes". You have to split the operations to avoid them, and thus make more passes.


The first algorithm mentioned in the article is not a one-pass Paeth algorithm. The first algorithm in the article is a pure trigonometry algorithm.

A one pass Paeth algorithm produces identical results to the typical three pass implementation, because there is only one round of trigonometry up front, so there is only one round of round-off. Once the shear amounts are calculated, the rest is just integer pixel shears.


The post reminds me of a BYTE article from 1981, in the Smalltalk special issue. It was the first time I had ever seen recursion illustrated graphically, on a bitmap of Mickey Mouse's face [1].

Lots of good articles in that issue, and the whole thing is available on-line at the Internet Archive [2].

[1] Daniel H. H. Ingalls. The Smalltalk Graphics Kernel. BYTE 6(8), August, 1981, pp. 168--94.

[2] http://archive.org/stream/byte-magazine-1981-08/1981_08_BYTE...


Wonderful link! Off-topic, but I was particularly enthralled with the ads. Many of them are great, but my favorite was for "The Last One"[1]. "Say goodbye to the frustrations of computer programming. The Last One will be available very soon...The Last One is a computer program that writes computer programs. Programs that work the first time. Every time."

And for only $600.

I was surprised to see that this software actually shipped, according to Wikipedia. [2] Alas, it was not the last piece of software that people ever needed to use, apparently.

[1] http://archive.org/stream/byte-magazine-1981-08/1981_08_BYTE...

[2] https://en.wikipedia.org/wiki/The_Last_One_(software)


Nice article, this is HN material. My only "complaint" is that he isn't using Lena as the test image. ;-)


Guilty, but I make up for it in another article on JPEG :) http://www.datagenetics.com/blog/november32012/index.html


Wow, your hybrid optical illusion at the end of the article was fascinating, quite spooky/surprising.


Thanks for kind comments.


Thanks, I've followed your twitter for a while and I love reading your posts! Keep up the good work!


While the result isn't great, it's a neat algorithm, and it will actually do gamma correction correctly. Sampling and weighting the four pixels will not, unless you specifically code for it. And looking at the history of gamma correction in software, most developers don't care about that.


That's not fair - you can look at this algorithm as doing nearest neighbor interpolation. No matter how you look at it, (bi-)linear interpolation will be more accurate, whether gamma corrected or not.


It is my understanding that gamma correction occurs in the display driver (or possibly the display itself, in some cases). The colors that your application manipulates are the "raw" colors. Linearly interpolating in this space is correct. Gamma correction will be applied automatically later in the display pipeline.


Wouldn't the 3 shears method take a preposterous¹ amount of memory if you rotate by something like 180 degrees? The first shear will take a w.h image and output a w.h.h one!

Why not simply use Bresenham's line drawing algorithm to "draw" a reverse-rotated rectangle over your image, but rather than drawing, you'd read pixels and write them to the destination image. You'll need to only rotate the corners of the image once and everything else is simple integer math.

¹ preposterous for the kind of device where doing it any other way is too slow

Edit: replaced stars with dots for multiplication because stars make italic text


> Wouldn't the 3 shears method take a preposterous¹ amount of memory if you rotate by something like 180 degrees? The first shear will take a w.h image and output a w.h.h one!

If implemented as a three pass algorithm, yes.

If implemented as a one pass algorithm, no.


> Why not simply use Bresenham's line drawing algorithm to "draw" a reverse-rotated rectangle over your image, but rather than drawing, you'd read pixels and write them to the destination image. You'll need to only rotate the corners of the image once and everything else is simple integer math.

Trivia: this is how Wing Commander worked. Nobody was concerned about filtering or gamma correction in those days, needless to say!

Bresenham point-sampling is an example of a 'forward transform', where you look at each pixel in the source image and decide where it goes in the destination image. You can also use conventional texture mapping to rotate an image, which is a case of an 'inverse transform' where you traverse each pixel in the destination image and figure out what texel(s) contribute to it.

Inverse transforms are much simpler -- Bresenham sampling has some ugly complexities, especially if you want to scale the image as well as rotate it -- but they aren't as efficient if the image has a lot of transparency.


Clearly there are better ways to rotate by 180 degrees!

If you need to rotate by 179, first rotate by 180, then by -1.


Another popular method of doing this is a variant on the area mapping technique, except that instead of taking the weighted average of the four nearest pixels, it creates a 2-D spline across the R/G/B "heights" of the image and interpolates with the spline.

The reason the triple-sheared images look crappy is that the three shears themselves are subject to pixel aliasing. Still pretty cool.


I would still rather use simple texture mapping: map the image to a rectangle and then just rotate the 4 corner points; the rest is done naturally by your texture mapper. You can implement a texture mapper easily: find the top-most rotated corner, take the two edge lines coming out of that corner, linearly interpolate texture coordinates on these lines for each horizontal line they cross, and then walk linearly between those two texture positions across the horizontal line range. Replace an edge line with the next line whenever you reach its endpoint, until you reach the bottom-most corner. You can add surrounding texels to the mix to get bilinear or other filtering. Bresenham's line algorithm can be used for each of the line interpolations (though my assembly language experience showed that fixed-point ints were actually faster). This is also embarrassingly parallel.


This is really neat, but seems to neglect an alternative (and just as simple) method. Instead of traversing each pixel of the source buffer, traverse each pixel of the destination buffer, sampling the correct pixels from the source. Different sampling methods yield different quality, CPU cost, and memory-access patterns.

This method is just as simple to code, doesn't suffer from the missing-pixel aliasing problem of the article's simple method, and is also capable of higher-quality results than the shearing method.
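A sketch of this destination-traversal (inverse-mapping) method in Python, with nearest-neighbour sampling; the function and parameter names are illustrative assumptions, and for simplicity the destination is the same size as the source rather than the enlarged bounding box.

```python
import math

def rotate_inverse(src, angle, background=0):
    """Rotate a 2-D list of pixel rows by `angle` (radians) about the
    image centre, traversing the *destination* and sampling the source
    with nearest-neighbour lookup. Illustrative sketch only."""
    h, w = len(src), len(src[0])
    c, s = math.cos(angle), math.sin(angle)
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    dest = [[background] * w for _ in range(h)]
    for dy in range(h):
        for dx in range(w):
            # Inverse-rotate the destination coordinate into source space.
            u = (dx - cx) * c + (dy - cy) * s + cx
            v = -(dx - cx) * s + (dy - cy) * c + cy
            sx, sy = int(round(u)), int(round(v))
            if 0 <= sx < w and 0 <= sy < h:
                dest[dy][dx] = src[sy][sx]
    return dest
```

Because every destination pixel is written exactly once, no holes can appear; swapping the nearest-neighbour lookup for a weighted read of the four surrounding texels gives bilinear filtering.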


You have to traverse up to twice as many pixels as in the original method (the bounding box grows by up to √2 per side) unless you somehow know where the borders of the destination square are (additional computation).


This is true only if the source image is smaller than the destination image. In the article, the source image is shown as the same size as the destination, and includes a large white border that can be clipped without the inner image being affected.

In the event that the destination buffer is much larger than the source, that additional computation is trivial (it's the same calculation that's already being done for each and every pixel). As it only needs to be additionally done on the corners, not per-pixel, the additional time spent should be quite minimal.

The shearing method in the article is genuinely clever and totally cool, but I just can't shake the feeling that even on 1980s hardware, this method would be better. On modern hardware, there's no question, it's still used to this day. Nowhere near as cool, though.


Could you post some links explaining your method?


I wonder if the onset of Retina-like displays would remove some of the issues with antialiasing when rotating bitmap images. I'm tired of looking at the digital world straight on.


So when you do a shear, the pixels are only being moved horizontally or vertically in each stage, so therefore no gaps appear. Is that it?


But they still need interpolation, because pixels in general won't exactly overlap.

As far as I can see, it's replacing the interpolation in 2D with multiple interpolations in 1D. But perhaps I am missing something.

[edit] OK, after skimming the paper, it seems that they approximate the interpolation (using nearest neighbour), which makes it very fast. So really it's a neat way of working out what nearest neighbour is in 2D by doing it in 1D multiple times.


Note that one could also do "area mapping" with nearest neighbor instead of 4-pixel interpolation (if speed is more important than smoothness) to avoid the white dots. This is the most obvious "naive" approach to me -- basically the same cost as the white dots approach (loop over every destination pixel just once, grab just one pixel from source) but with some repeated neighboring pixels instead of missing information.


Um, these results look pretty horrible. Where are the pictures of the area mapping he mentions (where the missing holes are filled by averaging four source pixels)? Does anyone know what that looks like by comparison?


Area mapping certainly makes better images, but it's slower. These "horrible results" actually remind me of older games and some PC demos.


Rotation clockwise by 27°

http://i.imgur.com/itniCWR.png

Left is my area mapping algorithm and right is Adobe Photoshop's bilinear rotation.


As I understand his description of the area mapping algorithm, this is actually a bilinear interpolation from the resulting, rotated, image over the original image.


There's a Java applet comparing various high quality techniques here: http://bigwww.epfl.ch/demo/jrotation/index.html


A little anti-aliasing would go a long way in these rotated images.


OK, but 126% is not a good rounding of 1.2956...


It is an obvious typo if you do the math.

Angle = 27°

result = 1/(cos 27°)^2 = 1.25961

which rounds to 126%
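The arithmetic checks out; a quick Python sanity check:

```python
import math

# Scale factor from the article's formula, 1 / cos^2(27 degrees)
scale = 1 / math.cos(math.radians(27)) ** 2  # 1.2596...
print(f"{scale:.0%}")  # prints "126%"
```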


Yup. Anyhow, nice entry :)


Oooops, yes, a typo. I'll correct it later. :)


Apart from the aliasing issue, isn't there also a way to use complex numbers to do the rotation?


Yeah, you can view complex numbers as 2d vectors. In this view, you can choose a complex number with a magnitude of 1, at some angle θ to the x-axis (the real line). If you multiply this number by another complex number, it will rotate that number by the angle θ. We can express such a complex number as cosθ + i sinθ.

If you express such a complex number as a 2x2 matrix (there is a unique 2x2 matrix for each complex number, such that the complex number a + ib becomes the matrix

    [ a, b
     -b, a ]

), you'll notice that the matrix form of our unit-magnitude complex number is just a rotation matrix like the one discussed in the article.


Very cool!



