
Am I the only one who redacts info, prints it out, then scans it back in? Or redacts, then takes a screenshot before sending out?

For some reason I just never trust the PDF tool (or human error on my end) actually redacting the info, even if I were to do a print to PDF.




Nope. That's called rebroadcast. It's also used to try to "launder" photo manipulations, like compositing. I helped work on some algorithms which could pick up artifacts even after rebroadcast.

I would absolutely not trust a PDF not to leak metadata. Although now you risk a metadata leak from the printer or scanner, which may or may not matter for your threat model.


Careful what printer you use to print it out: some of them add patterns of dots that can uniquely identify the print: https://en.wikipedia.org/wiki/Machine_Identification_Code


When a coworker asked me for my recommended method of creating and publicly sharing redacted copies of documents which (in their unredacted forms) contained PII for children, I told them to do this, in no uncertain terms.


> Am I the only one who redacts info, prints it out, then scans it back in?

If you have the source document, redact from the source (by actually removing the content and replacing it with an appropriate placeholder, not obscuring it) and regenerate the static (e.g., PDF) version.
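A minimal sketch of redaction by replacement, assuming the source is available as plain text before it is rendered. The function name `redact_source`, the placeholder string, and the sample record are made up for illustration:

```python
import re

# Fixed-width placeholder, so the redacted output does not reveal
# how long the removed value was.
PLACEHOLDER = "[REDACTED]"

def redact_source(text: str, secrets: list[str]) -> str:
    """Replace every occurrence of each secret with the placeholder."""
    # Longest first, so a secret that contains a shorter one is replaced whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if not secret:
            continue  # an empty pattern would match between every character
        text = re.sub(re.escape(secret), PLACEHOLDER, text, flags=re.IGNORECASE)
    return text

source = "Student: Jane Doe, DOB 2014-03-09. Contact Jane Doe's guardian."
redacted = redact_source(source, ["Jane Doe", "2014-03-09"])
print(redacted)
```

The static version is then generated from `redacted`, so the sensitive text never reaches the PDF at all. This only catches the exact strings listed; misspellings or other identifying details still need a human pass.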

If you are working from print, I think scan and redact by digital replacement (not overlay or otherwise obscure) would be sufficient. Redact->print->scan probably helps somewhat (especially if the scan is low quality) if you are using a bad redaction method to start with, but why do that?


I do the same, except for the scanning. Why not just print it to PDF?


Because some tools might still put a text layer under the printed page, so you can select the text and copy it.
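One rough way to sanity-check an output file for this, as a heuristic sketch only: an image-only PDF usually has no font objects, so finding `/Font` (or `/EmbeddedFiles`, which marks attachments) hints that something other than pixels is in there. The function name `find_markers` and the choice of markers are assumptions for illustration. It inflates Flate-compressed streams with the standard library, does not handle encrypted files or other stream filters, and an empty result is not proof that the file is clean.

```python
import re
import zlib

# Names whose presence suggests a text layer or an embedded attachment.
SUSPICIOUS_MARKERS = (b"/Font", b"/EmbeddedFiles")

# Crude match for the body of each PDF stream object.
_STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

def find_markers(pdf_bytes: bytes) -> set[str]:
    """Return the suspicious markers seen in the raw file or in inflated streams."""
    chunks = [pdf_bytes]
    for match in _STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates trailing bytes after the compressed data.
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not a Flate stream (e.g. raw or JPEG image data)
    return {
        marker.decode()
        for marker in SUSPICIOUS_MARKERS
        if any(marker in chunk for chunk in chunks)
    }
```

Usage would be something like `find_markers(open("redacted.pdf", "rb").read())`, where the file name is a placeholder; a non-empty result means the file deserves a closer look with a proper PDF tool.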


Not if there is a rasterization step in the process. That's essentially what printing and scanning achieves, rasterization, and we can do that without the printer and scanner.

Of course, the artifacts introduced by printing and scanning (especially with contrast turned way up) give it an air of legitimacy, although these can also be simulated.


If you print to paper and scan, you are mostly safe, but if you do a software print to a PDF document, you might use a tool that saves the actual content as invisible text, or the whole Word document as an attachment to the PDF. I would print and scan physically if it was something important. Or just edit the Word document to remove the stuff and then print and scan, to avoid carrying along the edit history, since I don't know if that will be saved somewhere.

Usually I'm in full control of the software myself so I just output X instead of the secret data.


> I would print and scan physically

This degrades quality and wastes paper and toner. There are software tools to convert PDF to raster graphics.
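As one example of such a tool, here is a sketch that shells out to Poppler's `pdftoppm` to render each page to a PNG. It assumes `pdftoppm` is installed and on the PATH; the helper names, the default of 150 DPI, and the paths in the usage note are arbitrary choices for illustration.

```python
import shutil
import subprocess
from pathlib import Path

def rasterize_command(pdf_path: str, out_prefix: str, dpi: int = 150) -> list[str]:
    """Build the pdftoppm argument list: -r sets the resolution, -png the output format."""
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, out_prefix]

def rasterize(pdf_path: str, out_prefix: str, dpi: int = 150) -> list[Path]:
    """Render every page of pdf_path to '<out_prefix>-<page>.png' and list the results."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm not found; install poppler-utils first")
    subprocess.run(rasterize_command(pdf_path, out_prefix, dpi), check=True)
    prefix = Path(out_prefix)
    return sorted(prefix.parent.glob(prefix.name + "-*.png"))
```

Calling `rasterize("redacted.pdf", "out/page")` would leave only pixels in the PNGs, so there is no text layer or attachment to recover. The redaction itself still has to be correct before this step, because whatever is visible on the page is rasterized as-is.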



