
...I guess the enhance jokes were just rendered void!

https://www.youtube.com/watch?v=Vxq9yj2pVWk




Absolutely not. If there's not enough information available, there's not enough information, full stop.

Plausible (i.e. "good looking" or "believable") results are not the same as actual data, which is why enhance wouldn't work on vehicle licence plates or faces for example.

Sure, the result might be a plausible looking face or text, but it's still not a valid representation of what was originally captured. That's the danger with using such methods for extracting actual information - it looks fine and is suitable for decorative purposes, but nothing else.


No, there certainly is a chance for ML to improve here.

Let’s take the classic example of enhancing a blurry photo to get a license plate.

Humans may not be able to see much in the blur, but an AI trained on many heavily down-res’d images could at least give you plausible outcomes from far less data than a human brain would need to say anything with confidence.

You wouldn’t hold it up as the absolute truth, but you’d run the potential plate and see if it matched some other data you have.

So yes, it wouldn’t magically add any more information to the image, but it could be far better at taking low information and giving plausible outcomes that then need to be verified.


> Let’s take the classic example of enhancing a blurry photo to get a license plate.

That's not the same as fabricating information, though. A blurry image still contains a whole bunch of information and correlation data that just isn't present in a handful of pixels.

This is not super-resolution, but something different entirely. Super-resolution would mean to produce a readable license plate from just a handful of pixels. That is an impossible task, since the pixels alone would necessarily match more than one plate.

The algorithm would therefore have to "guess", and the result will match something that it has been trained on (read: plausible), but by no means the correct one, no matter how many checks you run on a database.
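The many-to-one point can be sketched in a few lines. This is a minimal illustration, assuming a plain box-filter (block average) downscale and two made-up 4x4 patches; `downscale`, `patch_a` and `patch_b` are hypothetical names and have nothing to do with the linked images or with any real super-resolution tool.

```python
def downscale(image, factor):
    """Box-filter downscale: average each factor x factor block into one pixel."""
    height, width = len(image), len(image[0])
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Two different made-up "high-res" patches (0 = dark, 255 = light).
patch_a = [
    [255, 0, 0, 255],
    [0, 255, 255, 0],
    [255, 0, 0, 255],
    [0, 255, 255, 0],
]
patch_b = [
    [0, 255, 255, 0],
    [255, 0, 0, 255],
    [0, 255, 255, 0],
    [255, 0, 0, 255],
]

low_a = downscale(patch_a, 2)
low_b = downscale(patch_b, 2)

print("high-res patches differ:", patch_a != patch_b)
print("low-res versions identical:", low_a == low_b)
```

Both patches collapse to the same 2x2 grid of mid-gray pixels, so nothing working from the low-res version alone can tell which one was photographed. It can only pick the one it considers more likely.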

To illustrate the point, I took an image of a random license plate, and scaled it down to 12x6 pixels. 4x super-resolution would bring it to 48x24 pixels and should produce perfectly readable results.

Here's how it looks (original, down-scaled to 48x24, and down-scaled to 12x6 pixels): https://pasteboard.co/JSu3WDU.png

The 48x24 pixel version could easily be upscaled to even make the state perfectly readable. A 4x super-resolution upscale of the 12x6 version, however, would be doomed to fail no matter what.
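The pixel arithmetic for the 12x6 example works out as follows. This only counts samples under the stated 4x factor; it is not a measure of information content, and the variable names are illustrative.

```python
low_w, low_h = 12, 6          # size of the down-scaled input
scale = 4                     # 4x super-resolution per side

high_w, high_h = low_w * scale, low_h * scale
measured = low_w * low_h      # pixels that were actually captured
output = high_w * high_h      # pixels the upscaler has to emit
invented = output - measured  # pixels with no measured counterpart

print(f"{low_w}x{low_h} -> {high_w}x{high_h}")
print(f"measured={measured}, output={output}, invented={invented}")
print(f"fraction invented: {invented / output:.4f}")
```

Fifteen of every sixteen output pixels have no measured counterpart in the 12x6 input.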

That's what I'm getting at.

Just for shits and giggles, here's the AI 4x super-resolution result: https://pasteboard.co/JSu7jkP.png

Edit: while I'm having fun with super-resolution, here's the upscaled result from the 48x24 pixel version: https://pasteboard.co/JSu9Qh6.png


I was simply pointing out that it is very possible for AI enhancement to find useful, accurate details that humans otherwise couldn’t, and I don’t think you refuted that.


I also never denied that. But there's a difference between finding details that would otherwise be missed (i.e. like a microscope revealing features invisible to the unaided eye) and reproducing data that simply isn't there to begin with (as is the core of the "enhance" trope).


Actually the opposite. These algorithms are more susceptible to noise; they may generate sharp, perfect license plate numbers (that are totally fabricated and completely wrong) from a blurry image. By no means should you consider the results to have even hints of truth.

A GAN produces totally different results if you slightly change the input.

So, as others are also saying, these "enhances" are great for decoration and absolutely should be ignored as facts or truth (especially when it comes to faces, license plates and other things used by law enforcement).


>These algorithms are more susceptible to noise, they may generate sharp perfect license plate numbers (that are totally fabricated and completely wrong) from a blurry image.

This is not really an issue that is new or limited to things that are called AI.

https://www.theregister.com/2013/08/06/xerox_copier_flaw_mea...


No. If their training set isn’t too far off what you use it for, it is valid. Just because it’s not guaranteed, doesn’t mean it’s not more accurate than hitting sharpen and squinting.

You’re fighting against “would it be reliable” but that isn’t the claim.

The claim is could it be better than human, and the answer is yes, it just depends on how well trained it is and the dataset.

But this is also entirely testable. I guarantee much like Go, if we set up a “human vs AI guess the blurry image” competition that AI will blow us out of the water. It’s simply a data * training issue, and humans don’t spend hours on end practicing enhancing images like they do playing Chess.

Again - it won’t be perfect, obviously. It will have false positives, of course.

Doesn’t mean it can’t be better than human.

Also GANs are pretty irrelevant, the model structure has nothing to do with the theory.


That's a fair point. Police use artist sketches from witness descriptions to help identify a suspect. It's a similar idea.

The difficulty will be making sure people treat it the same way, because it looks like a normal image.


> The difficulty will be making sure people treat it the same way, because it looks like a normal image.

Hmm. Mandating the use of style transfer to make it look like an artist's rendering would probably have the intended effect.


Well, just because that would be a bad, dangerous idea does not mean that police will not do it. After all, police use lie detectors, fingerprinting and DNA evidence without much care for the error rate.

I hope that some day there will be an episode of a crime show where, by chance, two teams of detectives independently work on the same case without noticing each other and, using standard police methods, come to completely different, incompatible conclusions and detain two different suspects, who of course both confess after going through standard police interrogation. Screenwriters, use your powers for good!

(Actually, Czech writer Karel Čapek (the same one who introduced the word robot) did practically the same thing in one of the short stories in Stories from Another Pocket. Everybody should read it together with Stories from a Pocket.)


By the way here is an enhanced version of the same supercut from the same person:

https://www.youtube.com/watch?v=LhF_56SxrGk


CSI and the like were just ahead of their time

edit: I was joking, but people pointing out that you still can't create something out of nothing etc. might not be thinking big enough. I think this technology absolutely has the potential to help. Police are literally still using artists' impressions (photofits) to find perpetrators.


I think the artist's impression has a lot more value than a highly realistic generated face. If you see an artist's impression, you will see the facial features that were noticeable, such as a mole, the shape of the nose, or the thickness of the eyebrows. Then you have a template that your brain uses to match those features with any face that you see. However, if I show you a highly realistic face, your brain will take a different impression. Your brain has been trained on faces for thousands of years. It will try to match the face perfectly.

An artist impression tells the audience that it is inaccurate. A realistic photo tells the audience that this is _exactly_ who we are looking for.


Yep. To be useful for exploring potential "true" values, a system would probably need some way of showing you the distribution of its guesses, so you can get an idea of whether there is any significant information there.
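One way such a "distribution of guesses" could be shown is sketched below. It is a toy under loud assumptions: the 4x4 glyph bitmaps are invented (not a real font), the prior over candidates is uniform, the downscale is a box filter, and the noise is assumed Gaussian with a made-up `sigma`. `guess_distribution` is a hypothetical helper, not part of any library.

```python
import math


def downscale(image, factor):
    """Box-filter downscale: average each factor x factor block into one pixel."""
    out = []
    for top in range(0, len(image), factor):
        row = []
        for left in range(0, len(image[0]), factor):
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def guess_distribution(observed, glyphs, factor, sigma):
    """Normalized likelihood of each candidate glyph given the low-res pixels.

    Assumes a uniform prior and independent Gaussian pixel noise.
    """
    weights = {}
    for label, glyph in glyphs.items():
        predicted = downscale(glyph, factor)
        err = sum((p - o) ** 2
                  for p_row, o_row in zip(predicted, observed)
                  for p, o in zip(p_row, o_row))
        weights[label] = math.exp(-err / (2 * sigma ** 2))
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


# Invented toy glyphs (1 = ink, 0 = background); not a real font.
glyphs = {
    "0": [[1, 1, 1, 1], [1, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 1]],
    "8": [[1, 1, 1, 1], [0, 1, 1, 0], [0, 1, 1, 0], [1, 1, 1, 1]],
    "1": [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]],
    "7": [[1, 1, 1, 1], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 0]],
}

observed = [[0.75, 0.75], [0.75, 0.75]]  # the 2x2 pixels we actually have
posterior = guess_distribution(observed, glyphs, factor=2, sigma=0.1)

for label, prob in sorted(posterior.items(), key=lambda item: -item[1]):
    print(f"{label}: {prob:.3f}")
```

With these toy glyphs, "0" and "8" downscale to the same pixels, so the mass splits evenly between them while "1" and "7" are effectively ruled out. A single sharp output image would hide exactly that ambiguity.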

That aside, you'd still probably need an ML PhD to have a chance of correctly interpreting the results, given the myriad potential issues with current systems.


This demo is a fun way to see the ML take different guesses at how to inpaint the missing data: http://www.chuanxiaz.com/project/pluralistic/


I’m not sure if that’s the case with this tech. I could see in the near future a scenario in which many, many individuals (thousands) are photographing the same things in the same area and you can intelligently superimpose things to “enhance”, tho.
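The part of that idea that is easy to demonstrate is plain stacking: averaging many independent noisy captures of the same scene. The sketch below assumes perfectly aligned frames, a made-up 8-pixel scene and Gaussian sensor noise. It only shows the noise-averaging effect, not sub-pixel multi-frame super-resolution, and all names are illustrative.

```python
import math
import random


def noisy_frame(signal, sigma, rng):
    """One simulated capture: the true signal plus Gaussian noise per pixel."""
    return [value + rng.gauss(0.0, sigma) for value in signal]


def stack(frames):
    """Average aligned frames pixel by pixel."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def rms_error(estimate, truth):
    squared = [(e - t) ** 2 for e, t in zip(estimate, truth)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)  # fixed seed so the run is repeatable
truth = [10.0, 200.0, 30.0, 180.0, 50.0, 160.0, 70.0, 140.0]  # made-up scene
sigma = 20.0            # assumed noise level per capture

frames = [noisy_frame(truth, sigma, rng) for _ in range(400)]

single_error = rms_error(frames[0], truth)
stacked_error = rms_error(stack(frames), truth)

print(f"RMS error, one frame:   {single_error:.2f}")
print(f"RMS error, 400 stacked: {stacked_error:.2f}")
```

The difference from single-image "enhance" is that every extra frame here is a real measurement, so the stacked result contains more captured information rather than more guessed detail.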



