How L1-minimization can recover highly degraded images (wired.com)
83 points by TrevorBurnham on Feb 23, 2010 | 44 comments



Only in the world of cookie-cutter journalism could this be described as a "chance discovery"-- a professor intentionally tests an algorithm, hoping to solve exactly the problem described, and discovers it works better than expected.

From the article: "Compressed sensing was discovered by chance... Candes... was experimenting with a badly corrupted version of the phantom meant to simulate the noisy, fuzzy images you get when an MRI isn’t given enough time to complete a scan. Candes thought a mathematical technique called l1 minimization might help clean up the streaks a bit... Candes expected the phantom on his screen to get slightly cleaner. But then suddenly he saw it sharply defined and perfect in every detail..."


Also, this (again, from the article) is just bizarre:

"At present, most of our audiovisual info is stored in sophisticated compression formats. If, or when, the format becomes obsolete, you’ve got a painful conversion project on your hands. But in the CS future, Candès believes, we’ll record just 20 percent of the pixels in certain images, like expensive-to-capture infrared shots of astronomical phenomena. Because we’re recording so much less data to begin with, there will be no need to compress. And instead of steadily improving compression algorithms, we’ll have steadily improving decompression algorithms that reconstruct the original image more and more faithfully from the stored data."

Someone, Candès or the author, doesn't seem to understand what they're talking about.


Why do you say that? Seems like a reasonable application of compressed sensing.


Because what they are describing is compression; I mean, it's even called compressed sensing. I'm guessing that somewhere, lost in the journalism, is the suggestion that it's better to place more of the computation on the decompression side, which might be true in some applications.

It seems like a great bit of technology; I don't think the author needed to sex it up with mention of compressing images. The applications where it's actually hard to do sensing at all (like the MRI example in the article) are much more compelling. Its main use seems to be turning sparse data into its most human-friendly form, but I do wonder about making clinical decisions based on such an aggressive interpolation rather than actual data.


I think Candes (and/or the journalist) is contrasting sampling in an incoherent basis with the conventional route of compression (sample the full signal/image/whatever, then try to reduce its size). The former is a form of compression, just not the kind we usually do.


You can't generate information from nothing. Compression based on sparsity seems to be making assumptions about the level of detail in a picture[1]. If you stored only 20% of the pixels in Obama's face, you might miss a spot of acne.

I can't imagine trusting this technique for pictures of the galaxy. After spending a bazillion dollars on a space telescope, we aren't going to "throw away 80% of the pixels".

[1] - I'm not certain of this deduction. Can somebody with more background confirm or deny?


You're not exactly making assumptions about the level of detail. You're making assumptions about the compressibility and entropy of the data. A circle requires the same amount of data to represent whether its radius is R or 2R[1]. L1-minimization tries to find the smallest number of circles regardless of size, whereas traditional compression methods look for the largest circles regardless of number. It attempts to minimize the entropy of the resulting image (roughly), and all real pictures that aren't literally static have low entropy.

I saw a presentation on this technique in college. Some of the details escaped me, but one thing I remember is that the L1 decompression is optimal when it is "orthogonal" to the compression that it is reversing. Fourier-based compression schemes (including JPEG and MPEG, and even MP3) are "orthogonal" in the correct sense because they select based on size, preserving large structures better. Random sampling is "orthogonal" to basically all compression schemes, and tends to select more information from large structures as well.

With a high-res bazillion-dollar space telescope, you're not going to throw away 80% of the pixels and reconstruct an approximation. Rather, you'll take all of the recorded pixels, and then reconstruct even more impressive astronomer porn at five times the resolution!

[1] Well, close enough. If you store the circle as the (x,y) of the center and an integer radius R, then 2R takes 1 more bit. Storing R as floating-point means no difference. I don't know about other representations. Anyway, the point is, the image size of the feature is not strongly coupled to the data size of the feature.
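
To make the "number of circles, regardless of size" point concrete, here's a toy numpy sketch (the image size and circle positions are made up for illustration): an image built from a few circles is huge as a pixel array but tiny as a list of parameters, and that gap is the kind of sparsity these methods exploit.

    import numpy as np

    # A hypothetical 512x512 image made of a few white circles on black.
    size = 512
    yy, xx = np.mgrid[0:size, 0:size]

    # Each circle is just (center_x, center_y, radius) -- 3 numbers,
    # whether the radius is R or 2R.
    circles = [(100, 120, 30), (300, 200, 60), (400, 420, 15)]

    img = np.zeros((size, size))
    for cx, cy, r in circles:
        img[(xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2] = 1.0

    pixels = img.size                  # 262,144 numbers in pixel space
    params = 3 * len(circles)          # 9 numbers in "circle space"
    print(pixels, params)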


"more impressive astronomer porn at five times the resolution!"

As long as we keep in mind that the filled-in regions of space or background match the assumed entropy, I think it's worth doing. The applications of L1 appear to be limited to cases where the structure and assumptions match the dataset.

Maybe there's an underlying pattern we're recreating? Something inherent to physical laws, fractal replication, etc.

Your comment helped me to see that L1 isn't just a matter of interpolation but an assumed underlying pattern that is used to fill in or estimate the unknown data. Thanks.


The article focuses on "missing pixels", which doesn't have much to do with the original Candes paper. The usual setup for compressed sensing is that you're sampling the entire image one coefficient at a time in some "incoherent" basis (i.e., noiselets). Thus, every little spot of Obama's acne may very well be reconstructed -- though more generally you're right: if the image is too sharp, then something gets lost.


Isn't this a case where entropy shows it isn't possible? Once you degrade the quality/quantity of the information, you can't get it back. You can find a clever way to hide what you did, but that doesn't add information; you're just interpolating. This isn't bad, however, as we normally don't use all the data we could (do you look at your pictures at 400% zoom?), and a rescaled image can highlight some detail you would have missed.

By the way, those bazillion dollars aren't wasted, because if you take the image, don't throw away any pixels, but treat it as if it were a bigger image that had lost pixels, you can get something useful out of it.


> Isn't this a case where entropy shows it isn't possible?

Most real-world pictures are not random, so you can guess what a region looks like from the areas near it.


More precisely, the entropy of real-world pictures is considerably lower than the entropy of all possible pictures (the vast majority of which are just static).


This is more of a case where you aren't actually using all the information-bearing capacity of your pixels but are instead sampling from a much lower-dimensional manifold. This isn't just interpolation, and information need not be lost.


For a much better explanation of this stuff from Terry Tao (mentioned in the article as being responsible for much of the theoretical basis of the technique; Fields medallist; very very smart chap) see http://terrytao.wordpress.com/2007/04/13/compressed-sensing-... .

One important way in which I think the Wired piece is Just Plain Wrong: you don't sample a small fraction of the individual image pixels. You take a small number of samples, each of which is a kinda-random combination of image pixels.

Some examples and further quite-friendly explanation here: http://www.acm.caltech.edu/l1magic/examples.html .

More from Tao: http://terrytao.wordpress.com/2008/12/05/the-uniform-uncerta... and http://terrytao.wordpress.com/2009/05/25/reflections-on-comp... both of which are mostly links to more detailed things, well worth reading if you don't mind mathematics.
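
To make that distinction concrete, here's a rough numpy sketch (the sizes are made up, and the Gaussian matrix Phi is just a stand-in for whatever incoherent measurement scheme the hardware actually implements) contrasting "keep 20% of the pixels" with "take 20% as many measurements, each a random combination of all the pixels":

    import numpy as np

    rng = np.random.default_rng(0)
    n = 64 * 64                       # pretend this is a flattened 64x64 image
    x = rng.random(n)                 # stand-in for the true image
    m = n // 5                        # roughly 20% as many measurements

    # What the Wired piece describes: keep a random 20% of the pixels.
    mask = rng.choice(n, size=m, replace=False)
    y_pixels = x[mask]

    # What compressed sensing actually does: each measurement is a
    # (pseudo)random combination of *all* the pixels.
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)
    y_cs = Phi @ x

    print(y_pixels.shape, y_cs.shape)  # both m-dimensional, very different content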


Thanks for the link; that made a lot more sense. The way Wired described it made it sound like the L1 minimization was almost irrelevant. The closest to pseudocode I could get out of their article was something like: 1) apply an L1 minimization to do a first-cut noise reduction on some random noisy data; 2) apply an iterative algorithm to the result until convergence. That made it sound like all the magic was in the invention of some crazy algorithm for #2, which isn't really the case.


Here is a more detailed explanation of how this works.

Imagine you have a pixel image, with missing data, and with noise. Imagine you know a-priori that the image was generated simply by placing a few white circles against a black background. Then data was lost/discarded, and white noise was added to the remaining data.

(The white circles are cross sections of the patient's blood vessels or some such feature (bile ducts, in this case), and contain a contrast agent. The contrast agent appears white on the MRI; everything else appears black.)

Now that you know the pixel image has such a simple form, you don't need to do fancy noise removal to clean it up. All you need to do is figure out where the white circles are. Once you know where they are, you can just redraw the image based on your calculated locations of the circles instead of reprocessing the old one.

You can play the same trick in k-space, which gives you a potent MRI reconstruction algorithm.

In practice, this doesn't work as well for more complicated images such as brains or abdomens. There is just much more to draw, and the images cease to be sparse.
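
Here's a toy 1-D analogue of that k-space trick, just to fix ideas (not a real MRI pipeline; the signal, sizes, and sampling pattern are all invented): the "image" is a few bright spots, and the scanner only records a random subset of its Fourier coefficients.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 128
    x = np.zeros(n)
    x[[12, 40, 41, 90]] = [1.0, 0.8, 0.8, 1.2]    # a few "blood vessel" spots

    # Full k-space would be the whole DFT; an undersampled scan keeps
    # only a random subset of the rows.
    F = np.fft.fft(np.eye(n)) / np.sqrt(n)        # unitary DFT matrix
    rows = rng.choice(n, size=40, replace=False)  # ~30% of k-space
    A = F[rows]
    y = A @ x                                     # what the scanner records

    # Recovering x from (A, y) is then exactly the sparse / L1 problem
    # discussed elsewhere in this thread.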


>In practice, this doesn't work as well for more complicated images such as brains or abdomens. There is just much more to draw, and the images cease to be sparse.

Compressed sensing works just fine for brains, abdomens, etc., because the images need to be sparse in some basis (e.g., wavelets), not in the original space.


I haven't seen any good compressed sensing reconstruction for brains, though I'm not 100% up to date on the literature. Can you give me a cite?

The best I've seen is Candes/Donoho's reconstruction based on curvelets, but they don't offer any real improvement over regular reconstruction.


Check out Lustig's thesis. He finds that you can keep the same brain image quality while accelerating the sampling considerably.

http://yonit.stanford.edu/cgi-bin/axs/ax.pl?http://www-mrsrl...


Has someone been able to improve the (spatial or temporal) resolution of fMRI scans using this technique?


Thanks.


Seems really interesting, but I don't understand why you'd get rid of 90% of the image -- wouldn't it be much more useful to simply enhance the current images to 10x higher resolution?

That seems to be what the scientists are doing for the MRI, but the author seems to think this would be useful in that it would allow people to throw away a lot of data. I think it would be much better for extrapolating even farther from the data we have.


Here's a blog post by Terence Tao on the subject: http://terrytao.wordpress.com/2007/04/13/compressed-sensing-...

One application he mentions is sensor networks that can withstand a loss of a large number of nodes.

As soon as I saw the article my first thought was: "real-time raytracing?"


In this case, you don't "get rid" of 90% of the image--you simply never acquire it in the first place.

The reason is that MRI sampling is prohibitively time-expensive: you need to keep the sample still for a long time to sample many points. In the article's case, that would require the patient to end up dead. ;) Sparsely sampling the volume takes much less time and allows for a reasonable reconstruction.

I'm not intimately familiar with MRI processing, but there may also be some intricacy of the imaging process (possibly related to the time-to-spatial-frequency conversion?) that makes the lower-res sampling especially suited for this algorithm.


It's more that you only need to collect 10% of the data to begin with, not necessarily throw away 90% of what you do have.


It turns out that the gains from only computing the paths of a subset of the rays are offset by the cost of recovering the compressively sensed data.

I could be wrong, though -- I only did this back-of-the-envelope a few months ago. I never really tried that hard to make it work.


(I'm assuming this is a reply to my raytracing comment)

This is most likely true. Additionally, raytracing just might produce samples that are too regular for this to work well. I haven't really thought about it :)


Regularity helps, not hinders, but it's surely not going to be real-time. It might be faster than ray tracing every pixel, but that depends on many factors that I don't know/understand.


(I think you meant to reply to a different comment.)


There's some confusion between "compressed sensing" and "L1-minimization" in the comments and the article.

L1 minimization here refers to trying to represent the image as a linear combination of basis functions a1 f1 + a2 f2 + ... by minimizing the objective function fit(a1, a2, ...) + sum_i |a_i|, where "fit" measures how closely the reconstruction matches the image (for instance, squared distance). The second term is the L1 norm of the parameter vector; it has the special property of being non-differentiable at 0, which often causes individual components a_i to become exactly 0 during minimization, resulting in a "sparse" solution vector.

Compressed sensing refers to the idea that when the image is sparse in the chosen basis {fi}, it's sufficient to take a small number of random measurements of the image. We can then compute "fit" approximately, which gives us a different objective function to minimize. The result will be close to what we'd get if we used the original (full) "fit" function.
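
For the curious, here's a minimal numpy sketch of one standard way to minimize that objective, iterative soft-thresholding (ISTA). It's not the solver from the original papers, and the problem sizes are invented, but it shows how the non-differentiable L1 term drives most coefficients to exactly 0:

    import numpy as np

    def ista(A, y, lam=0.05, n_iter=500):
        """Minimize 0.5*||y - A a||^2 + lam*sum_i |a_i| by
        iterative soft-thresholding (ISTA)."""
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the "fit" gradient
        a = np.zeros(A.shape[1])
        for _ in range(n_iter):
            z = a + step * A.T @ (y - A @ a)     # gradient step on "fit"
            a = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft-threshold
        return a

    # Toy problem: 200 basis functions, only 60 measurements,
    # and a true coefficient vector with just 8 nonzeros.
    rng = np.random.default_rng(2)
    A = rng.standard_normal((60, 200)) / np.sqrt(60)
    a_true = np.zeros(200)
    a_true[rng.choice(200, 8, replace=False)] = rng.standard_normal(8)
    y = A @ a_true

    a_hat = ista(A, y)
    print(np.count_nonzero(np.abs(a_hat) > 1e-3))   # a handful of nonzeros survive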




Cool. Didn't expect so much interest on HN for a highly mathematical concept.

If you're interested in a blog solely focussed on CS, here's one: http://nuit-blanche.blogspot.com/


gane5h, thanks for the plug :)

For compressed sensing to work, one needs to know in advance that the signal (an image or anything else, really) is sparse or compressible. Once this is known, one relies on two findings that are just plain surprising and are the reason there is so much excitement in this area:

- the signal needs to be acquired "incoherently", and there is much work going on in that area. Most hardware is currently not designed to do this incoherent acquisition (it can be done cheaply in MRI, which is why you see some applications there), and much work is currently going into designing new sensors that can perform this new kind of acquisition. Many of them are listed here: http://igorcarron.googlepages.com/compressedsensinghardware

- once the signal is "incoherently" acquired, one can reconstruct the original signal using a linear programming technique that solves an l1 problem. After four years of intense competition between different groups, we now have faster solvers that approximately solve the l1 problem without linear programming, as well as solvers for lp (p between 0 and 1) and even l0 problems.

For more information, you may want to check the blog or the big picture site: http://igorcarron.googlepages.com/cs


There used to be more of them on the HN front page earlier on.


I might be mistaken (I'm not an expert), but isn't the article mixing up compressed sensing and low-rank matrix completion?

edit: Also, they present all of compressed sensing as if it were just that one algorithm for image reconstruction (which samples in the original space, rather than an incoherent basis). Tech media fail?


Compressed sensing and matrix completion are almost the same (take that with a large set of caveats, but the basic L1 recovery process is the same.) Tech media fail, anyway.


Why hasn't this made it into Photoshop or GIMP yet? What would it take to implement this to enhance images?


GIMP doesn't have this because, in addition to the image, you need explicit knowledge of where the image came from.

Suppose you have an image which you already know is a cartoon. Compressed sensing can clean it up, but uses explicit knowledge of what cartoons look like. Similarly, if you know the image is scanned text, a different compressed sensing algorithm based on scanned text can clean it up.

But if you don't know where the image comes from to begin with, compressed sensing won't work.


This certainly could be done. Here are some MATLAB routines that implement it:

http://www.acm.caltech.edu/l1magic/

Seven examples using compressive sensing:

http://www.acm.caltech.edu/l1magic/downloads/l1magic.pdf


I'm curious about the consequences of the discovery.

What effects will this have on the storage of photos?

Can this principle be extrapolated to other forms of data?


Despite the article's somewhat confusing presentation, compressed sensing isn't primarily about photos. It's a principle that can be used to reconstruct any sort of signal from a small number of measurements. The only requirement is that we can find a representation for that signal that is sparse (every signal maps to some vector of mostly 0s). Luckily, most natural signals have this property (whereas random noise is not sparse in any representation).

The fundamental breakthrough here is that we can acquire signals (photos, recordings, MRI volumetric images, etc...) using a much smaller number of samples than previously thought possible. A useful application: let radiologists accurately image the body in less time. A weird application: Single-pixel cameras (http://physicsbuzz.physicscentral.com/2006/10/single-pixel-c...)
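
If you want to play with this, here's a small self-contained sketch (scipy's generic LP solver standing in for the specialized solvers practitioners actually use, and the dimensions are arbitrary): it recovers a sparse signal essentially exactly from far fewer random measurements than unknowns, by minimizing the L1 norm subject to matching the measurements.

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(3)
    n, m, k = 60, 30, 4                      # unknowns, measurements, nonzeros
    x_true = np.zeros(n)
    x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    A = rng.standard_normal((m, n))
    y = A @ x_true                           # m << n measurements

    # Basis pursuit: minimize ||x||_1 subject to A x = y,
    # written as an LP over variables [x, t] with -t <= x <= t.
    c = np.concatenate([np.zeros(n), np.ones(n)])
    I = np.eye(n)
    A_ub = np.block([[I, -I], [-I, -I]])
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([A, np.zeros((m, n))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    x_hat = res.x[:n]
    print(np.max(np.abs(x_hat - x_true)))    # typically tiny: essentially exact recovery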


Thank you for the more in-depth explanation. The single-pixel camera is an interesting application.

I would imagine one of the most powerful applications would be compression of files. It might make things easier for big websites like Facebook. Or allow zip to make files smaller. Or a flash card that holds more information.

I'd have to learn more about the technology, but from an EE standpoint there are two big questions:

How fast is the process? I'd imagine it works fast enough for images and voice, but not fast enough for video. I'd imagine the speed will increase as people start playing with it.

And does the photo have to be compressed using the algorithm? The single-pixel camera idea and many other signal-capture operations really open up if you can just take a swab of the signal at random intervals. Perhaps a cool little program could be written to hunt for the necessary data points and capture them.

This technology will definitely bring improvement to many fields over time.




