This is somewhat bad. Trezor's dictionary is only 2048 words, so brute-forcing the 8 remaining words would only take 2048^8 attempts. Since 2048 = 2^11, that's 2^88, i.e. 88 bits of security.
With a redaction as bad as a Gaussian blur at such a high resolution (and a dictionary of only 2048 candidates), you don't need multiple angles to decipher those words.
It'd be great if they replaced the key with bogus words first before blurring to troll people, but somehow I doubt it.
> It'd be great if they replaced the key with bogus words first before blurring to troll people, but somehow I doubt it.
This is what I do 99% of the time I use blur to censor information; I just replace the text with, say, colorful words of the same length before blurring. It would be neat if there were an automated tool that could do something similar.
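Something like that could probably be scripted. A rough sketch with Pillow (the bounding box, filler string and blur radius below are all stand-ins you'd have to supply yourself):

    # Not a polished tool, just the idea: overwrite the region with same-length
    # decoy text first, then blur, so the blur never sees the real characters.
    from PIL import Image, ImageDraw, ImageFilter, ImageFont

    img = Image.open("screenshot.png").convert("RGB")
    box = (100, 200, 420, 230)             # hypothetical bounding box of the secret text
    filler = "hunter2 hunter2 hunter2"     # roughly the same length as the original

    draw = ImageDraw.Draw(img)
    draw.rectangle(box, fill="white")                      # wipe the real text
    draw.text((box[0], box[1]), filler, fill="black",
              font=ImageFont.load_default())               # paint the decoy text

    region = img.crop(box).filter(ImageFilter.GaussianBlur(radius=6))
    img.paste(region, box)                                 # the blur only ever covers the decoy
    img.save("censored.png")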
I'm not sure why PULSE is being called out here. This is extremely common for any super-resolution model trained on a heavily biased dataset like FFHQ. In fact, this discussion is part of what led to the dataset being changed. The authors actually extended their research to discuss the bias[0]. I'll also link to the Reddit discussion[1]. IMO the authors responded to this correctly. It's important to remember that algorithms are only good on in-distribution data. Kudos to the authors for doing more experiments and including work on a less racially and sexually biased dataset.
Indeed, but one has to note that this article is mainly about pixelated text, which is a bit different, especially since it (usually) has a known alphabet and you often even know what the individual font glyphs look like from context.
I basically only saw "Barry O'Bama" at first, and thought "oh good, another rightwingey troll/rant". I managed to pick up on the PULSE GAN before my mouse made it over to the downvote button, tho, so it wasn't me. I think you'd be better off just adding half a sentence of tl;dr like "... Barry O'Bama, where PULSE GAN incorrectly reconstructed a pixelated image of Barack Obama, resulting in a white guy that didn't look anything like the original" or something like that.
I recently made use of my right to access personal information from someone who had sent me unsolicited marketing material.
The person who fulfilled my request sent me a PDF copy of their full customer list, where all entries had been blacked out except mine.
As you may anticipate, that blacking out was simply a black box drawn on top of the actual data. It took me all of 3 seconds to select all, copy, then paste in a word processor to confirm that the file contained personal data from hundreds of other individuals.
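For anyone wondering how little effort that takes: the text stream is still in the file, and the rectangle is just another drawing object layered on top. A minimal sketch, assuming pdfminer.six and a stand-in filename:

    # The black boxes live in the PDF's graphics layer; the text objects underneath
    # are untouched and come out with a plain text extraction.
    from pdfminer.high_level import extract_text

    text = extract_text("redacted.pdf")   # hypothetical file with rectangles drawn over entries
    print(text)                           # the "hidden" entries print along with everything else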
More generally, if you want to redact pixels, don't replace them with information that depends on those pixel values. That doesn't necessarily mean a black rectangle, but a black rectangle is certainly simple and it works.
I'm slightly curious how, in these cases, someone decided that blurring was the right way to do it. It seems unlikely that they never thought of simply blacking/blanking out the text; if they did, blurring must have seemed preferable.
My best guess is that, having seen blurring of faces (which is arguably OK when one merely wants to avoid casual attempts at identification, while retaining a ‘natural’ look), they assumed this was the proper way to do it in all cases.
You could use filler text like lorem ipsum to keep a natural appearance without exposing any information. Of course that is a bit more work than just dropping a blur effect on an existing document and exporting it as PDF.
Have you ever hand-written a word and then wanted to hide it? One cross-out line doesn't do the trick. In fact, a full minute of trying to hide it with an ugly darkened-in box usually doesn't do it either. But if you just write a couple of random letters over each existing one, people have basically no ability to recognize your original word.
That is my one insight. Take it for what it's worth.
AI upscaling? Yes. It seems some fans used AI upscaling for the older Stargate SG-1 seasons, since those were direct-to-VHS and hence shot on low-quality media (by now there are official Blu-ray releases, which are also somehow upscaled, but I was told those are of inferior quality compared to the fan effort). I'm not sure if those efforts worked on a frame-by-frame basis or used information across frames.
The remastered edition of Command & Conquer also used AI upscaling for the cutscenes. They lost the original recordings, and the videos from the PlayStation release were the best they could track down. The result is far from perfect, but probably the best one could hope for: https://www.youtube.com/watch?v=ikJLYYTrIxs&t=689s
I previously replied to the wrong comment by accident.
As university libraries have moved online, one reads many poorly scanned journal articles. I often wonder about taking the time to clean them up. What replaces temporal information here is the same characters appearing over and over.
So of course I read this article hoping to learn about an off-the-shelf tool that would do a great job of scanned text reconstruction. Alas, the best candidates were "no code available."
Math typesetting is too messy for current OCR tools. It would be nice to reverse-engineer the LaTeX source for a math paper, but not likely soon. OCR for the language would help in mind-mapping a web connecting my saved papers, but I wouldn't use it for reading.
I want everything to look like a 600dpi scan mixed down, as I would make, rather than what the libraries thought would be acceptable. For the pure joy of reading.
The easiest approach that might work would be language agnostic, understanding only what clean scans of characters look like. Can we back-solve a clean scan from a lower resolution mess, matching up similar characters in the text without identifying the characters?
Somehow I imagine this is a giant singular value problem. I'm ok if it takes a day to run per paper, I have spare machines.
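In case someone wants to play with the idea, here's a crude sketch of "match similar glyphs without identifying them": segment ink blobs, cluster them as fixed-size patches, and replace each blob with its cluster mean so repeated characters get averaged together. Every threshold and size below is a guess, and a real attempt would need alignment and sub-pixel registration on top:

    import numpy as np
    from scipy import ndimage
    from sklearn.cluster import KMeans
    from skimage.io import imread
    from skimage.transform import resize

    page = imread("scan.png", as_gray=True)     # hypothetical low-quality scan
    ink = page < 0.5                            # crude binarization threshold
    labels, _ = ndimage.label(ink)              # connected components ~ glyph blobs
    boxes = ndimage.find_objects(labels)

    SIZE = 24                                   # normalized glyph patch size
    patches = [resize(page[sl], (SIZE, SIZE), anti_aliasing=True).ravel() for sl in boxes]

    k = max(1, len(patches) // 20)              # assume ~20 occurrences per glyph class
    km = KMeans(n_clusters=k, n_init=10).fit(np.array(patches))

    # Replace every blob with the mean of its cluster: repeated characters get
    # averaged, which is what stands in for temporal information here.
    clean = page.copy()
    for sl, c in zip(boxes, km.labels_):
        h, w = page[sl].shape
        clean[sl] = resize(km.cluster_centers_[c].reshape(SIZE, SIZE), (h, w),
                           anti_aliasing=True)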
You should be able to do better than just aligning and averaging frames. (Edit: looks like MauranKilom knows what they're talking about here, and expresses it in their comment more clearly than I could.)
Imagine you were computing running averages on successive windows of a 1D array: when the average changes, that tells you the difference between the value that entered your window and the one that just left. That's information about a sliver of data much smaller than the overall window. It's weirder with 2D and random-ish movement, but if your averaging (pixelation) filter is moving across text due to camera wobble or such, when the average goes up and down tells you something about where the edges are in the content underneath.
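To make the 1D version concrete, a toy sketch: the change between neighboring window averages depends only on the sample that entered and the one that left, so edges in the hidden signal show up directly in those differences.

    import numpy as np

    x = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0], dtype=float)   # hidden 1D "image" with two edges
    w = 4                                                     # averaging window ("mosaic cell")
    avg = np.convolve(x, np.ones(w) / w, mode="valid")        # what the pixelation exposes

    # avg[i+1] - avg[i] == (x[i+w] - x[i]) / w: only the entering/leaving samples matter
    print(np.diff(avg) * w)   # non-zero exactly where an edge slid into or out of the window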
I'm butchering the words because this isn't my thing, but this feels like it might be related to some actual signal-processing task (i.e. undoing some kind of signal-mangling that happens in the wild) which increases the chance that there's some good or at least well-studied solution.
The brute-force-ish approach for text reconstruction would also probably be more effective if it checked against a few shifted-around blurred copies of the text rather than just one.
Funny how the whole article talks about this approach, and then at the end shows the approach failing in the real world. I don't know about you, but I can't conclusively come up with a license plate in that final video.
Sure, but the technique used was also very trivial. Just aligning and averaging all the video frames basically leaves a mosaic-pixel-sized blur on everything (assuming the camera movement is uncorrelated with the mosaic grid).
You can get much further by applying deconvolutions and using more math. I've been meaning to put some time into this myself but never got it off the ground.
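If anyone wants a starting point, here's a minimal sketch of the "more math" direction: model the mosaic as a uniform box blur and run an off-the-shelf deconvolution from scikit-image. The cell size and filenames are placeholders, and the real problem still has the per-frame grid offsets to deal with:

    import numpy as np
    from skimage import io, restoration

    avg = io.imread("averaged_frames.png", as_gray=True)   # stand-in for the aligned-and-averaged frames
    cell = 8                                                # assumed mosaic cell size in pixels
    psf = np.ones((cell, cell)) / cell**2                   # uniform box point spread function

    # Richardson-Lucy deconvolution under the (rough) assumption that the averaged
    # result is the true image convolved with that box PSF.
    deblurred = restoration.richardson_lucy(avg, psf, num_iter=60)
    io.imsave("deblurred.png", (np.clip(deblurred, 0, 1) * 255).astype(np.uint8))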
I wonder if the author would be open to making e.g. the car data available?
First challenge is going to be figuring out the grid alignment in the stabilized frames. But I have a decent idea how to tackle that, which I'll hopefully get to tomorrow!
Yes, I've also always felt there must be ways to extract more data from a moving clip, precisely because of the effect he explains, but then it seems that just superimposing the images doesn't actually extract that information, at least not all of it.
But I wonder how to actually do it; do you have concrete ideas for a simple algorithm?
If you can figure out, for each frame, which sets of (pre-aligned) pixels have been averaged, you can create a large system of equations that captures those relations and solve it to find the unblurred pixel values.
Depending on camera movement (and whether you might get "ground truth" information from pixels entering and leaving the areas near the borders) the system will be more or less well-conditioned. I'm going to try this for the data the author graciously provided and report back!
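A minimal sketch of what that system could look like, assuming the per-frame grid offsets are already known (which is the hard part) and a hypothetical load_aligned_frames() helper that yields (dy, dx, mosaic) tuples on a common pixel grid:

    import numpy as np
    from scipy.sparse import coo_matrix
    from scipy.sparse.linalg import lsqr

    H, W, cell = 64, 256, 8                 # assumed size of the unknown image and of the mosaic cell
    frames = load_aligned_frames()          # hypothetical helper: list of (dy, dx, mosaic) per stabilized frame

    rows, cols, data, b = [], [], [], []
    eq = 0
    for dy, dx, mosaic in frames:
        for y in range(dy, H - cell + 1, cell):
            for x in range(dx, W - cell + 1, cell):
                # one equation: the mean of this cell's source pixels equals the observed value
                for yy in range(y, y + cell):
                    for xx in range(x, x + cell):
                        rows.append(eq)
                        cols.append(yy * W + xx)
                        data.append(1.0 / cell**2)
                b.append(mosaic[y, x])      # the mosaic is constant within a cell, so any sample works
                eq += 1

    A = coo_matrix((data, (rows, cols)), shape=(eq, H * W))
    solution = lsqr(A, np.array(b))[0]      # least-squares solve; conditioning depends on the offsets
    reconstruction = solution.reshape(H, W)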
I seem to recall that there used to be a video showing this approach in action. As input it took a video panning across a shelf full of books where the resolution was so low that the titles were illegible. And as output it produced a video with higher resolution and all the titles easily readable. Unfortunately I can't find that video any longer.
Yes, it all boils down to point spread functions. In the mosaic case, the PSF varies locally (per pixel) and temporally (in different video frames). The paper you link similarly details how they figure out the PSF. You can theoretically also do the entire thing without knowing the PSF, which is called blind deconvolution: https://en.wikipedia.org/wiki/Blind_deconvolution
> I wonder if the author would be open to making e.g. the car data available?
Interesting - this is the same incorrect use of "e.g." that the author made in a couple of places. Contrary to (apparently popular) belief, "i.e." and "e.g." can't simply be used as direct replacements for their English equivalents.
"e.g." is used to introduce one or more examples that satisfy a previously provided general form, for example:
"I prefer fruit, e.g. apples or pears, over vegetables." Apples and pears are examples of fruit; not that an example is needed in this case, but it keeps things simple.
In the quoted comment above, "the car data" is not an example of "making".
"i.e." follows a similar rule. If there are exceptions for either, I'd be interested to know of them.
Interesting, thanks for bringing this to my attention! Do you have any reference that explains this rule? I noticed on Wikipedia that introducing multiple examples used to actually have a different abbreviation (ee.g. or ee.gg.), so clearly something is already lost in translation here...
In any case, what I wrote is really just a shorthand for more cumbersome formulations (like "...open to making your data, e.g. the car [data], available?" - that would hopefully be correct?), and reducing text is the whole point of using an abbreviation in the first place. But I'm open to striving for more consistent usage, so if you can refer me to some kind of authority on how to mix Latin abbreviations with English text, I'd be curious about it!
Hmm. I've been wondering about the motion-deblur approach for a while, for use in cleaning up VHS / youtube quality videos. Might even be able to get a head start given that h264 contains a certain amount of inferred motion information anyway.
It seems that we could increase camera resolution by putting the sensor on a vibrating platform, capturing a stream of frames, and processing them into a single image. The paper mentions Google camera software doing this with hand tremor. Is there any instance of intentionally shaking a camera to increase resolution like this?
I predict that future super high-resolution camera rigs will be whirling contraptions, spinning in 3 dimensions to improve 3D resolution. And the best still camera will be a wand (linear sensor array) on an articulated head that moves like a chicken's head, capturing during movement. The sound of a camera will be whoosh instead of click.
OCR algorithms typically aren’t targeting heavily blurred text. It’s more about handling all the ways letters can look, including ligatures, determining paragraph breaks, detecting tables, ignoring staples and coffee stains, etc. than about correcting for bad scans/photos.
> Side note: The potentially most extensive research on the problem of programmatically unblurring mosaic'ed regions from videos was done by Japanese Adult Video enthusiasts. Javplayer automatically detects blurred regions and performs upscaling via TecoGAN, and another person spent months improving their custom GAN that was trained with leaked videos (search for "De-Mosaic JAV with AI, Deep Learning and Adversarial Networks").
"I hacked a hardware crypto wallet and recovered $2M [video]" https://news.ycombinator.com/item?id=30067340
Showing a blurry 16 out of 24 trezor wallet seed words https://youtu.be/dT9y-KQbqi4?t=1720