As a metrologist (and photographer), I find the difficulty with these techniques is that they can over-represent the information contained within an image; they present an image of what was "probably" there, rather than representing what was. These aren't so different from our own brains, which remember what we thought we saw, rather than the light that reached our retinas.
These methods are already in extensive use (most smartphone images use extensive noise-reduction techniques), but we must be ever-cognizant that image-processing techniques can add yet another layer of nuance and uncertainty when we try to understand an image.
I think this point is worth pushing on a bit harder, which is to say that the "additional details" in the picture are guesses by the software, not actual additional details. The data present in the picture is fixed; the software uses that data to build educated guesses about what was actually there. If the photo doesn't contain enough data to actually determine what a given piece of text in the image says, the software can provide a guess, but it's just that, a guess. Similarly, if the photo doesn't provide enough detail to positively identify a person, the "super resolution" version cannot be used to positively identify them either, as it's a guess made from incomplete data, not genuinely new data.
The point is worth belaboring because people have a tendency to take the output from these systems as Truth, and while they can be interesting and useful, they should not be used for things for which the truth has consequences without understanding their limitations.
You're right to compare this to how our brains reconstruct our own memories, and the implications that has for eyewitness testimony should inform how we consider the outputs from these systems.
This “guessing” is nice for the sake of artistry, but we’ve got to be careful when knowing what actually was there is important—like when photos are submitted as evidence in court cases, or when determining the identity of a person from a photo as part of an investigation.
I hope such photos are submitted as the camera takes them. With or without this new feature, photoshopping a photo before presenting it in court must be illegal.
If you consider photos taken by cell phones, it's hard to really say what "as the camera takes them" means - a lot of ML-driven retouching happens "automagically" with most modern cell phones already and I'd expect more in the future.
It goes even further than that. Image sensors don't capture images. They record electricity that can be interpreted as an image.
This might seem like a quibble, but once you dive a little deeper into it, you realise that there's enormous latitude and subjectivity in the way you do that interpretation.
What's even crazier is that this didn't come with digital photography. Analogue film photography has the same problem. The silver on the film doesn't become an image until it's interpreted by someone in the darkroom.
There is no such thing as an objective photograph. It's always a subjective interpretation of an ambiguous record.
There is a difference in the degree of subjectivity. When interpreting the sensor's electrical signal, the subjectivity is highly localized, and probably doesn't affect the macro structure of the image.
With ML-enhanced photos, you might have a distant face that is "enhanced" by the model into a face that wasn't there. Or a fingerprint, a birthmark, a mole, etc.
With analog photography you could at least use E-6. Processing was tightly controlled and standardized, and once processed, you had an image.
The nice thing about this was that you could hand the E-6 off to a magazine and end up with a photograph printed in the magazine that was very close to the original film. Any color shifts or changes in contrast you could see just with your eyes. You could drop the film in a scanner and visually confirm that the scan looks identical to the original. (You cannot do this with C-41.)
This was not used for forensic photography, though. The point of using E-6 was for the photographer to make artistic decisions and capture them on film, so they can get back to taking photos. My understanding is that crime scene photography was largely C-41, once it was relatively cheap.
1. However good the guess is, it's still just that: a guess. Taking the standard of "evidence in a murder case", the OCR can and probably should be used to point investigators in the right direction so they can go and collect more data, but it should not be considered sufficient as evidence itself.
2. OCR is a relatively constrained solution space - success in those conditions doesn't mean the same level of accuracy can or will be reached outside of that constrained space.
To be clear, though - I'm making a primarily epistemic argument, not one based on utility. There are a lot of areas for which these kinds of machine-guessing systems are of enormous utility, we just shouldn't confuse what they're doing with actual data collection.
I'm not sure about the OCR example, but there are information / sampling theory limits on what can be discerned in an image, based on sampling rate (pixels, basically) and optics. Any extrapolation outside these limits is provably guessing.
Edit - re OCR, do you mean e.g. that from a picture of a blurred license plate we could rule in or out a subset of possible numbers, depending on how blurred it is, like a B could be an 8 but not an L? (And sorry if your example is unrelated.) This is valid, and unrelated to super resolution; you can do this analysis with Nyquist limits and point spread functions.
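To make that concrete, here's a toy sketch of my own (nothing in it comes from the comment above; the "glyphs", kernel width, and downsample factor are all made-up assumptions): two genuinely different fine-detail signals become much harder to tell apart after blurring with a point spread function and sampling well below the Nyquist rate, which is exactly why any "recovered" detail past that point is a guess.

    import numpy as np

    # Two distinct 1-D "glyphs" (think: cross-sections of a B and an 8).
    x = np.linspace(0, 1, 256)
    glyph_a = (np.sin(40 * np.pi * x) > 0).astype(float)  # fine stripes
    glyph_b = (np.sin(44 * np.pi * x) > 0).astype(float)  # slightly different stripes

    def blur_and_downsample(signal, kernel_width=32, factor=16):
        """Convolve with a box point spread function, then sample coarsely."""
        kernel = np.ones(kernel_width) / kernel_width
        blurred = np.convolve(signal, kernel, mode="same")
        return blurred[::factor]

    low_a = blur_and_downsample(glyph_a)
    low_b = blur_and_downsample(glyph_b)

    # The originals differ a lot; the blurred, downsampled versions differ far less.
    print("original mean difference:   ", np.abs(glyph_a - glyph_b).mean())
    print("downsampled mean difference:", np.abs(low_a - low_b).mean())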
> I think this point is worth pushing on a bit harder, which is to say that the "additional details" in the picture are guesses by the software, not actual additional details.
I don't. Everyone knows this already and it seems like a lot of people are just saying it over and over to look clever.
What worries me is that COTS photo equipment increasingly comes with these algorithmic retouches that "over-represent" the data - or, put another way, bake their own interpretation into the image, in a way that cannot be distinguished from source data.
It's nice for a casual Instagrammer, but then a lot of science and engineering also gets done using COTS equipment. I worry that at some point, a lot of money will be burned, a lot of time wasted, or even lives lost, because someone didn't notice they've based the conclusions of their scientific experiment/engineering analysis on such "computer best guesses". As a researcher, you'll see a weird pattern on some of the photos and will be left wondering, is that a real phenomenon, or is it just one of the black box, trade secret neural networks in the camera choking on input data it wasn't trained for?
A compression algorithm knows which data it is discarding and can choose what to throw away for a good ratio of lost data to saved space. Data "lost" by a low-resolution sensor most definitely does not fit this description. Imagine saving a Full HD PNG instead of a 4K JPEG - the former is most likely far worse.
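As a rough sketch of that comparison (my own illustration, not from the comment; the synthetic pattern, image size, and JPEG quality setting are arbitrary assumptions, and the exact numbers will vary with content), here is one way to weigh "throw away resolution, store losslessly" against "keep resolution, store lossy":

    import io
    import numpy as np
    from PIL import Image

    # Build a synthetic grayscale image with fine, regular detail.
    w, h = 1600, 900
    yy, xx = np.mgrid[0:h, 0:w]
    pattern = ((np.sin(xx * 0.9) * np.cos(yy * 0.7) + 1) * 127).astype(np.uint8)
    original = Image.fromarray(pattern)

    def roundtrip(image, fmt, **kwargs):
        """Save to an in-memory buffer, reload, and report the encoded size."""
        buf = io.BytesIO()
        image.save(buf, format=fmt, **kwargs)
        size = buf.getbuffer().nbytes
        buf.seek(0)
        return size, Image.open(buf).convert("L")

    ref = np.asarray(original, dtype=float)

    # Option A: throw away resolution, then store losslessly (downscale + PNG).
    half = original.resize((w // 2, h // 2), Image.BICUBIC)
    png_size, png_img = roundtrip(half, "PNG")
    png_back = np.asarray(png_img.resize((w, h), Image.BICUBIC), dtype=float)
    png_err = np.sqrt(np.mean((ref - png_back) ** 2))

    # Option B: keep full resolution, store lossy (JPEG decides what to discard).
    jpg_size, jpg_img = roundtrip(original, "JPEG", quality=85)
    jpg_err = np.sqrt(np.mean((ref - np.asarray(jpg_img, dtype=float)) ** 2))

    print(f"half-res PNG : {png_size:>8} bytes, RMS error {png_err:5.1f}")
    print(f"full-res JPEG: {jpg_size:>8} bytes, RMS error {jpg_err:5.1f}")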
It's not too dissimilar, I agree, but there are differences.
I did a web search for "cots" and learned that a cot is...
> a small usually collapsible bed often of fabric stretched on a frame
But in this case, COTS is apparently...
> commercial, off-the-shelf
In other words "photo equipment" or "consumer/retail photo equipment."
Upon further reading[0] it seems odd to use the term here, but maybe I'm misunderstanding something. It's often used for software and has a key phrase...
> packaged solutions which are then adapted to satisfy the needs of the purchasing organization
But it's possible the term has been co-opted to mean something else now.
I use it in the way it's used in disciplines that also work with specialty-built, or even custom-built, equipment - such as science, the military, and some types of engineering (e.g. aerospace). The first sentence of the linked article describes it:
"Commercial off-the-shelf or commercially available off-the-shelf[1] (COTS) products are packaged solutions[buzzword] which are then adapted to satisfy the needs of the purchasing organization, rather than the commissioning of custom-made, or bespoke, solutions."
So for example, a research team may decide not to spend money on expensive scientific cameras for monitoring an experiment, and instead opt to buy an expensive - but still much cheaper - DSLR sold to photographers, or strap on a couple of iPhone 15s they found in a drawer (it's the future, everyone's using the iPhone 17, so the 15 is two generations behind the newest one). That's using COTS equipment. COTS is typically sold to less sophisticated users, but is often useful for the less sophisticated needs of more sophisticated users too. But if COTS cameras start to accrue built-in algorithms that literally fake data, it may be a while before such researchers realize they're looking at photos where most of the pixels don't correspond to observable reality, in a complicated way they didn't expect.
In the novel, quantum computers (rather than ML per-se) are tasked with interpolating more and more detailed data from astronomical observations, to the point that tracking individual members of an alien species on a distant world, underground, is possible. Eventually it is noticed that cutting off the astronomical data entirely doesn't interrupt the interpolated data. Then things get weird.
I won't go into further plot details, as that would be spoilery, but it is a pretty good book, reminiscent to me of Greg Egan's oeuvre (the novel is actually by Robert Charles Wilson).
It’s a common acronym in the tech world. I’ve usually used it in the context of a “buy-or-build” conversation about software (e.g. “most businesses are best off using COTS applications rather than doing custom development” - that sort of thing). But the acronym means what it means, so when OP talks about COTS camera gear, it makes sense to me.
As an aside, the term of art is "make-or-buy" if you want to be able to Google it.
The discussion we are having is interesting because COTS products are notorious for their hidden costs and how difficult they are to properly budget. Having to find a way to disable or reverse advanced post-processing in a camera would be a fairly typical example of that. In this specific case it might mean having to commission a custom firmware from the camera manufacturer - something which is very much doable but might end up costing you as much as buying bespoke equipment, for inferior results in the end.
Interesting, in software world I’ve always heard/used build/buy rather than make/buy, and I’m guessing that comes from construction industry as a lot of traditional software PM methodology was inspired by that world. If you Google ‘build vs buy’[0] (no quotes) all your top results are software discussions. If you Google ‘build or buy’, it’s all about housing. [1]
Make-or-buy seems more a term for manufacturing industry/SCM. TIL
I did the same web search (although in all caps) and was immediately pointed to “Commercial off-the-shelf”, and I redid it now in incognito mode over a VPN, and the first answer is still “Commercial-off-the-shelf” (with an added hyphen probably due to the language where the VPN endpoint is located).
>It's nice for a casual Instagrammer, but then a lot of science and engineering also gets done using COTS equipment. I worry that at some point, a lot of money will be burned, a lot of time wasted, or even lives lost, because someone didn't notice they've based the conclusions of their scientific experiment/engineering analysis on such "computer best guesses".
Most research papers are crap anyway - in a much more fundamental way, for much worse reasons/bad incentives, and with far more impact than "computational imaging".
This is probably the last thing I'd worry about when thinking about "millions/time/wasted" for some research.
I think this is probably good for what people use photos for; it lets them show a crop without the image looking pixelated. That means if they just want a photo to draw you in to their blog post, they don't have to take a perfect photograph with the right lens and right composition at the right time. And I think that's fine. No new information is created by ML upscaling, but it will look just good enough to fade into the background.
I personally take a lot of high resolution art photos. One that is deeply in my memory is a picture I took of the Manhattan bridge from the Brooklyn side with a 4x5 camera. I can get out the negative and view it under magnification and read the street signs across the river. (I would link you, but Google downrez'd all my photos, so the negatives are all I have.) ML upscaling probably won't let you do that, but on the other hand, it's probably pointless. It's not something that has a commercial use, it's just neat. If you want to know what the street signs on the FDR say, you can just look at Google Street View.
(OK, maybe it does have some value. I used to work in an office that had pictures blown up to room-size used as wallpaper in conference rooms. It looked great, and satisfied my desire to get close and see every detail. But, you know you're taking that kind of picture in advance, and you use the right tools. You can rent a digital medium format camera. You can use film and get it drum scanned. But, for people that just need a picture for an article, fake upscaling is probably good enough. The picture isn't an art exhibit, or an attempt to collect visual data. It's just something to draw you into the article in the 3 milliseconds before you see a wall of text and bounce.)
> Google downrez'd all my photos, so the negatives are all I have
Wow, Google ate your one digital copy? That's tragic.
What's the approximate resolution you could get out of a scan from these labs?
I was interested in getting into film cameras at one point, and I was disappointed with how low the scanning resolution is from most labs. For example mpix only advertises 18MB, which they say is good enough for a 12in by 18in print. North Coast Photo (what Ken Rockwell recommends) is even worse! What if you want something to put on the wall?
Granted, if the original film you shoot is perfect you can have a print done from the negatives, but that kind of defeats the point of having a high quality scan as a backup.
Yeah, paying people to scan your photos doesn't yield good results. I did that early on and found it expensive and low quality. Honestly, the process they use doesn't scale well, and I think they offer film scans to be nice, rather than because it's a viable business.
With my home setup, I can easily do 50-80 megapixels on a 4x5 negative. I use a flatbed photo scanner (the Epson V800) and wet-mount the film. It is Quite The Process involving a lot of parts (liquids, optical film to place on top of the mount, careful calibration of the focus point, etc.) but the results are excellent and relatively repeatable. But, all in, I'd estimate that it's probably a half hour of labor per photo, so you can see why labs charge so much. (Dry mounting doesn't save that much time, because of the amount of time you spend avoiding dust and optical artifacts intrinsic in using two extra sheets of glass.)
The real professionals use drum scanners. They are quite expensive, but offer incredibly high resolution and decent throughput for the operator. Looking around at prices, for $100 you can get a 320MP scan of a 4x5 negative yielding a 1.7GB file. http://www.drumscanning.com/rates.html For fine grained black and white films, you can certainly extract information that actually exists. As someone who mostly uses T-Max 400, though, that would be overkill. I can't imagine getting much more information out of my photos than I get with a flatbed.
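For a rough sanity check on those numbers, here's my own back-of-the-envelope arithmetic (the dpi figures are assumptions I picked to land near the megapixel counts quoted above, not specs from the comment):

    # Back-of-the-envelope arithmetic for 4x5 inch film scans.
    def scan_megapixels(width_in, height_in, dpi):
        """Pixel count of a scan at a given resolution, in megapixels."""
        return width_in * dpi * height_in * dpi / 1e6

    print(scan_megapixels(4, 5, 2000))  # 80.0  -> roughly the flatbed range above
    print(scan_megapixels(4, 5, 4000))  # 320.0 -> the drum-scan figure

    # 320 MP, 3 color channels, 16 bits (2 bytes) per channel, uncompressed:
    print(320e6 * 3 * 2 / 1e9, "GB")    # 1.92 GB, in the ballpark of the 1.7 GB file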
In summary, you can see why even pixel peepers are content with their Sony A7R. Press button, get 50 megapixels. And no toxic chemicals being absorbed through your skin.
Wow, I didn't realize you did your own scans. That's very interesting and cool, thanks for the information about it.
Looks like you can get those scanners used for pretty reasonable prices. Maybe if I've got a house one day and I think the odds of having to move within a few years are low I'll get into it and try setting up a lab.
> In summary, you can see why even pixel peepers are content with their Sony A7R. Press button, get 50 megapixels. And no toxic chemicals being absorbed through your skin.
Yep, and on top of that we're not limited by the sRGB gamut or bit depth issues of early digital cameras. Recent ones produce raw files that are extremely easy to develop and manipulate into something very nice looking.
The thing is, even on top of the enjoyment some people get out of working with film, if you're after a particular film-like look you might be able to save yourself a significant amount of post-processing time by just going with film. I've seen no one-click filter that can approximate it.
If you're using this to try and enhance super-grainy CCTV footage to get a face or license plate, I'd agree. Purely in the context of this article, though, the author is just upscaling an already high-definition image 2x. There's very little artifice that can really be added at this level that a human could perceive, IMO.
> Never mind memories; there are parts of our eyes that aren’t responsive to light at all. We’re always hallucinating.
Are you referring to the blind spot, or something else?
Interestingly, the blind spot turns out not to be a design requirement, it is a contingent feature that cephalopods like octopuses (whose eyes evolved independently from vertebrates') don't have.
I take a lot of pictures of mountain scenery. I blew up one of these that had an interesting composition consisting of rocks/fields/mountain peaks. On inspection the resulting image had substantially changed the composition by increasing the size of a field relative to all other objects.
It’s much easier for the model to blow up uninteresting pieces of the photograph than interesting pieces.
An example I saw getting traction on Twitter a few months ago was a photo of Melania Trump that was purported to be a body double. Since the original image was blurry, someone used an AI upscaler to "enhance" the photograph and increase the resolution. Then the comments started to roll in: the teeth are different! The tip of her nose doesn't match! It's not her!
Technically, they were correct -- it wasn't her. It was an algorithm's best-guess reconstruction based on training data of other people's faces. Unfortunately, neither the original poster nor anyone else in the thread seemed to grasp this concept.
I have been using neural-enhance (gh:alexj) and Topaz tools to upscale PAL/NTSC artworks for the last three years and would not be so judicious in describing these tools. They are hallucinating what the model assumes an upscaled image should look like, not enhancing in any way as the word is understood. The original image ceases to exist. A more honest term might be “Render As Upscaled” or “Generate Higher Resolution Image” (likewise “ML”, not “AI”).
When playing around, funny things happen too: with recursive upscale/sharpen, analogue artifacts begin resembling topography, molten metal, etc.
Now imagine this being fitted into military drones, which it almost certainly is.
Would it be right to say it is a synthesis on top of an analysis? It isn't what was observed. For some things it might not matter, but “it looks shopped” isn't really a positive in my book. Although the use case in the article - printing stuff a lot larger - is pretty handy.
No - the 000000 is not based on a statistical model of what's most likely to be right of the decimal place. In natural images, the statistical structure allows for this image upscaling, but without revealing any previously hidden detail - it's just using known statistics of the world to show what might be there.
That's not how floating point math works? At least not for standard floats (IEEE 754), and except for very large integers (near 2^m, where m is the number of mantissa bits in the FP type). Floats have an exact representation for integers within their mantissa range -- i.e., '18' is exactly the same as '18.0000'.
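A quick sketch of that point in Python (my own illustration; float64 has a 53-bit significand):

    # Integers within the 53-bit significand of a float64 are represented exactly,
    # so writing trailing zeros after the decimal point changes nothing.
    print(18.0 == 18.000000)                 # True: same value, same bit pattern
    print(float(18) == 18.0000)              # True

    # Beyond 2**53 the spacing between adjacent floats exceeds 1, so distinct
    # integers start collapsing onto the same representable value.
    print(float(2**53) == float(2**53 + 1))  # True: precision loss begins here
    print(float(2**53) == float(2**53 + 2))  # False: 2**53 + 2 is representable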
They're correct when it comes to scientific fields - the number of significant figures is important, so 18.0 and 18.000 really do mean different things.
I don't think you mean for floating point, but for mechanical tolerances. Many times, you don't want to pay an extra $50,000 for the 5 digits of precision... but sometimes you do. Shitty system if it automatically messed up all your part tolerances.
I would say that it's like the pixel's RGB at address 1x1 being 0-0-0 and the pixel at address 1x2 being 0-0-2, and squeezing between them a pixel with color 0-0-1 (averaging the two values near it) - assuming we're doing this on an image that is 1 pixel high and e.g. 2 pixels wide, so that the new image would be 0-0-0, 0-0-1, 0-0-2.
What you're describing is relatively straightforward (bi)linear interpolation. It is worth noting that even at this relatively simple level, going with bicubic interpolation instead will usually give you nicer results, except in cases where the hard edges in the image are only horizontal or vertical.
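Here's a minimal sketch of that interpolation (my own illustration, reusing the tiny two-pixel row from the example above):

    import numpy as np

    # The 1-pixel-high, 2-pixel-wide image from above: blue values 0 and 2.
    row = np.array([0.0, 2.0])

    # Linear interpolation: resample the 2-pixel row onto 3 evenly spaced positions.
    old_x = np.array([0.0, 1.0])
    new_x = np.linspace(0.0, 1.0, 3)
    print(np.interp(new_x, old_x, row))  # [0. 1. 2.] -- the middle pixel is the average

    # Real libraries do the 2-D version; e.g. Pillow's Image.resize with
    # Image.BILINEAR or Image.BICUBIC, where bicubic fits a smoother curve
    # through the neighbouring pixels instead of a straight line.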