Accurate Image Alignment and Registration Using OpenCV (magamig.github.io)
166 points by magamig on March 9, 2022 | 34 comments



Image alignment allows for some fun image manipulations like these. I think one of the coolest novel applications of image alignment in recent years must be multi-frame super-resolution. For example, the Pixel phones use it to improve low-light and zoomed-in photos[0].

[0] https://sites.google.com/view/handheld-super-res/


This is called drizzle (with dithering, i.e. the raw photos are taken with slight offsets) in the astrophotography sphere, and it's very effective when the images are undersampled. That said, astronomy images have the advantage of being filled with stars that are relatively easy to align very accurately.


'When undersampled' -- time to get a more modern sensor?

3.76 µm pixels + 800 mm focal length is about 1 arcsecond per pixel, so that will end up being seeing-limited much of the time.


Many undersample on purpose and get better results that way. You can work around seeing a little with lucky imaging and patience, by being very selective.

Also, drizzling was developed originally for the Hubble Deep Field, which isn't limited by seeing :).


I imagine the minuscule vibrations of a shutter make it a natural fit for the already existing image stacking pipeline, correct?


I'm not sure what you mean. Mechanical shutters have no upsides I'm aware of :). Astro cameras use electronic shutters.

EDIT: The dithering, or offsetting, is done by moving the telescope slightly between exposures. During exposures you want it to track the object you're photographing as accurately as possible. I imagine vibrations are a detriment even in the Pixel phone processing. It would be better if the phone could move instantly in slightly different directions and take steady raw photos in each.


Ah, I was thinking of amateur astrophotography - although these days you can just set the mechanical shutter in many consumer DSLRs to stay open too, I think.


I think maybe you were considering the vibrations a source of noise for compressed sensing? For CS to work well you generally want to know the convolved noise. So a super accurate accelerometer might be able to measure random vibrations and do even better when they are present than absent. Not sure if phone accelerometers are that good.


Yes, I believe Google's team publicly says it helps add just enough random translation.


Do camera phones have mechanical shutters? I'm almost certain they don't.


On camera phones, the random movements come from the user's own instability when taking pictures.


I wonder if you could use alignment of multiple samples for better vectorization of the letters in old books. You could create very nice vector-only PDFs.


Yes! For a two-fer, take a pair of cell phone photos of the open book, compute a 3D model, and flatten it as if the book had been sliced and scanned. Now match up all the letter E's in the entire book for image enhancement.

Vectorization is nontrivial. Most academic journal articles are now poorly scanned, and it would be nice to simply improve the scans.


Cool! I'm planning on using image alignment for enhancing the spatial resolution of hyperspectral (HS) images, through the fusion of RGB and HS images.


You get something similar to a panchromatic image?


You can use a high-resolution panchromatic image to increase the spatial resolution of an RGB image. Here the objective is to apply the same idea but using an RGB and an HS image, increasing the spatial resolution of the latter.
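A minimal sketch of that detail-injection idea (not the actual fusion method; it assumes the RGB and HS images are already co-registered, and the file names and blur scale are placeholders):

    import cv2
    import numpy as np

    def fuse_rgb_hs(rgb_bgr, hs_cube, blur_sigma=3.0):
        # Toy pansharpening-style fusion: inject high-frequency spatial detail
        # from a high-resolution RGB image into a low-resolution HS cube.
        H, W = rgb_bgr.shape[:2]
        # High-resolution "pan" channel from the RGB luminance.
        pan = cv2.cvtColor(rgb_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
        # Its low-pass version approximates what the HS sensor resolves.
        detail = pan - cv2.GaussianBlur(pan, (0, 0), blur_sigma)
        fused = np.empty((H, W, hs_cube.shape[2]), np.float32)
        for b in range(hs_cube.shape[2]):
            # Upsample each band to the RGB grid, then add the spatial detail.
            band = cv2.resize(hs_cube[:, :, b].astype(np.float32), (W, H),
                              interpolation=cv2.INTER_CUBIC)
            fused[:, :, b] = band + detail
        return fused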


Does this mean that objects that are on a trajectory out of the view have less resolution?


I'm currently working on an optical respiratory monitor, and one of the challenges was image alignment/registration for an optical and a thermal image with wildly different resolutions and FOVs.

We ended up doing this by manually calibrating the images in MATLAB using a custom script and logging out a transformation matrix that we could then multiply the optical image by to get matching pixels in the thermal image.
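Not the actual MATLAB script, but the runtime side of that approach could look roughly like this in OpenCV, assuming the calibration exported a 3x3 homography H (the matrix values and file names below are placeholders):

    import cv2
    import numpy as np

    # Hypothetical 3x3 matrix logged from the one-time manual calibration,
    # mapping optical-image coordinates into thermal-image coordinates.
    H = np.array([[0.21, 0.00, 12.5],
                  [0.00, 0.21,  8.0],
                  [0.00, 0.00,  1.0]], dtype=np.float64)

    optical = cv2.imread("optical.png")
    thermal = cv2.imread("thermal.png", cv2.IMREAD_GRAYSCALE)

    # Warp the optical frame onto the thermal grid so pixel (x, y) refers to
    # the same scene point in both images.
    optical_in_thermal = cv2.warpPerspective(
        optical, H, (thermal.shape[1], thermal.shape[0]))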

Really fun project, but definitely a tedious thing to do, especially when it's only a small part of the overall project. The project in the link also looks very cool, just not quite right for us, as we probably wouldn't consistently have enough landmarks to map.


Direct methods could be useful in your case. https://pages.cs.wisc.edu/~dyer/ai-qual/irani-visalg00.pdf


At work I had to make a custom image registration pipeline that uses only 2 degrees of freedom, so just x/y translation. OpenCV did not have anything that did this, but a Python library called Kornia does it well.

https://kornia-tutorials.readthedocs.io/en/latest/image_regi...
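For reference, a rough sketch of how that tutorial's API is used; the "translation" model name and exact call details are my reading of the Kornia docs, so double-check them against the linked tutorial:

    import cv2
    import torch
    from kornia.geometry import ImageRegistrator

    def load_tensor(path):
        # Grayscale image as a (1, 1, H, W) float tensor in [0, 1].
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE).astype("float32") / 255.0
        return torch.from_numpy(img)[None, None]

    img_src = load_tensor("frame_a.png")
    img_dst = load_tensor("frame_b.png")

    # Restrict the optimized model to an x/y shift only.
    registrator = ImageRegistrator("translation")
    model = registrator.register(img_src, img_dst)     # estimated transform
    warped = registrator.warp_src_into_dst(img_src)    # src resampled onto dst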


Actually base-OpenCV has a great function for this: `cv2.findTransformECC()`: https://learnopencv.com/image-alignment-ecc-in-opencv-c-pyth...

It can do dense translation, translation + rotation, affine, and homography alignment; I've used it in the past to do sub-pixel ArUco/AprilTag alignment (and I'd probably also use it for astrophotography).
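For the translation-only case, the call looks roughly like this (a sketch; optional arguments vary a bit between OpenCV versions, and file names are placeholders):

    import cv2
    import numpy as np

    ref = cv2.imread("ref.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
    mov = cv2.imread("mov.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

    # 2x3 identity as the initial guess; with MOTION_TRANSLATION only the
    # last column (tx, ty) is optimized.
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 500, 1e-6)
    cc, warp = cv2.findTransformECC(ref, mov, warp,
                                    cv2.MOTION_TRANSLATION, criteria)

    # Resample the moving image onto the reference grid.
    aligned = cv2.warpAffine(mov, warp, (ref.shape[1], ref.shape[0]),
                             flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)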


The opencv_contrib repo does have a module, called "reg", for direct alignment.


Maybe you could use this for when there are "enough" features, and keep using the manual method for the remaining situations. I'd have to take a look at the images to have a better idea of what you are dealing with.


The main issue is that the thermal camera we're using is only a few hundred pixels tall/wide, so using it for feature detection was rough when we did try that for something. For our purposes, with cameras on a solid mount, the hardcoded transformation matrix is good enough, but I think if you had better cameras and more time/more compute power, something like this could be better.

A bigger issue is that the project is running on a Pi, so just getting the image alignment running alongside our face detection would've also been a pretty tall task if we wanted to stay at around ~5 fps.


Note that this is exactly what panorama software like Hugin [0] does - it even comes with a convenient command-line tool for this simple use case (aligning a stack of images that are mostly overlapping): https://wiki.panotools.org/Align_image_stack

[0] http://hugin.sourceforge.net/



Would this work with images which are out of focus?


Probably, yes; it works on two-photon (2p) neuro images, which are of course not in focus.


The biggest issue in real-life use cases is lens distortion - these alignment algorithms usually don't correct barrel, pincushion, or more complex lens distortions, making the alignment imperfect.


Any of the standard panorama software can solve for a standard simple lens distortion model, it just needs a few more alignment points to do so.


OpenCV, which is what the OP is using, has higher-order camera calibration. There are functions to calibrate based on checkerboards or dot grids and undistort images using those calibrations.

Sometimes it's better to match first, then undistort, though.
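A condensed sketch of that checkerboard workflow (board size and file names are placeholders):

    import cv2
    import glob
    import numpy as np

    pattern = (9, 6)  # inner-corner count of the printed checkerboard
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

    obj_pts, img_pts = [], []
    for path in glob.glob("calib_*.png"):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)

    # Fit the camera matrix and distortion coefficients (k1, k2, p1, p2, k3).
    ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_pts, img_pts, gray.shape[::-1], None, None)

    # Undistort any image taken with the same lens before (or after) matching.
    undistorted = cv2.undistort(cv2.imread("photo.png"), K, dist)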


I use a similar algorithm at work to detect whether pages match a template and then align the image so we can OCR from mapped fields. Not perfect, but pretty effective.
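Not necessarily that exact pipeline, but the usual feature-based version of this kind of template alignment (ORB matches plus a RANSAC homography) looks roughly like this, with placeholder file names:

    import cv2
    import numpy as np

    template = cv2.imread("template_page.png", cv2.IMREAD_GRAYSCALE)
    scan = cv2.imread("scanned_page.png", cv2.IMREAD_GRAYSCALE)

    # Detect and match ORB features between the scan and the blank template.
    orb = cv2.ORB_create(5000)
    kp1, des1 = orb.detectAndCompute(scan, None)
    kp2, des2 = orb.detectAndCompute(template, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Robust homography, then warp the scan onto the template so the OCR
    # fields sit at fixed, known coordinates.
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    aligned = cv2.warpPerspective(scan, H,
                                  (template.shape[1], template.shape[0]))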


Affine transformations are very useful for algorithmic image rotation/translation/reflection [0].

[0] https://en.wikipedia.org/wiki/Affine_transformation
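A small illustration in OpenCV: a rotation about the image center and a horizontal reflection, each expressed as a 2x3 affine matrix (the file name is a placeholder):

    import cv2
    import numpy as np

    img = cv2.imread("input.png")
    h, w = img.shape[:2]

    # Rotation (with optional scaling) about the image center.
    M_rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle=30, scale=1.0)
    rotated = cv2.warpAffine(img, M_rot, (w, h))

    # Horizontal reflection: x -> (w - 1) - x, y unchanged.
    M_flip = np.float32([[-1, 0, w - 1],
                         [ 0, 1, 0]])
    reflected = cv2.warpAffine(img, M_flip, (w, h))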


This is the kind of post I love to find on HN: an interesting topic and code. Does anyone know what theme the blog uses? I'm interested in how the references are created. I suppose it's based on something like RST.



