Hacker News

Don't do this. If it's a clean print-out it's trivial to OCR at a five nines accuracy rate.



Depending on the font he used, I believe some characters might be difficult for OCR:

o01|IO

And even one wrong character makes the SSL key wrong.
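To put rough numbers on that: if each character is read independently with accuracy p, the chance of an error-free transcription of n characters is p^n. A quick sketch, assuming a key of about 1,700 characters (roughly the size of a 2048-bit RSA key in PEM form; that length is my assumption, not a figure from the thread), and with `p_clean` being just an illustrative name:

```python
def p_clean(per_char_accuracy, n_chars):
    """Probability that every one of n_chars is read correctly,
    assuming independent per-character errors."""
    return per_char_accuracy ** n_chars

KEY_LENGTH = 1700  # assumed key length in characters (hypothetical)

for label, acc in [("four nines", 0.9999), ("five nines", 0.99999)]:
    chance = p_clean(acc, KEY_LENGTH)
    expected_errors = KEY_LENGTH * (1 - acc)
    print(f"{label}: P(no errors) = {chance:.3f}, expected errors = {expected_errors:.3f}")
```

Under those assumptions the expected number of misread characters is well below one either way, which is why the question is whether a stray error or two can be recovered from.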


There are OCR systems that work on the basis of internal font consistency. They break the page into a series of single-character images, and because repeated instances of the same character are close to identical, it's trivial to match them up, so you can easily build a map of characters.

You then just need a human to label each character once. With a pixel image comparison 0 looks completely different from o.

If they're using a standard font then a regular OCR would be fine (you'd only need four nines accuracy to get it 100% correct). Even with a weird font it would still be easy to get that level of accuracy.
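A minimal sketch of the font-consistency approach, assuming the page has already been cut into equal-sized binary glyph bitmaps. The tiny 3x5 bitmaps, the names `hamming`, `cluster_glyphs` and `transcribe`, and the `max_diff` threshold are all made up for illustration; this is not how any particular OCR package does it.

```python
def hamming(a, b):
    """Count differing pixels between two equal-sized bitmaps (tuples of strings)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def cluster_glyphs(glyphs, max_diff=1):
    """Greedy clustering: a glyph joins the first cluster whose exemplar is
    within max_diff pixels, otherwise it starts a new cluster."""
    exemplars, assignment = [], []
    for g in glyphs:
        for idx, ex in enumerate(exemplars):
            if hamming(g, ex) <= max_diff:
                assignment.append(idx)
                break
        else:
            exemplars.append(g)
            assignment.append(len(exemplars) - 1)
    return exemplars, assignment

def transcribe(assignment, labels):
    """Apply the human-supplied label of each cluster to every glyph in it."""
    return "".join(labels[i] for i in assignment)

# Toy glyphs: a tall zero, a short lowercase o, and a zero with one noisy pixel.
ZERO       = ("###", "#.#", "#.#", "#.#", "###")
ZERO_NOISY = ("###", "#.#", "#.#", "#.#", "##.")
O_SMALL    = ("...", "...", "###", "#.#", "###")

page = [ZERO, O_SMALL, ZERO_NOISY, O_SMALL, ZERO]
exemplars, assignment = cluster_glyphs(page)

# A human looks at each exemplar once and labels it.
labels = ["0", "o"]
print(transcribe(assignment, labels))
```

The point is that the human labels two exemplars, not five glyphs; on a real page the ratio would be a few dozen labels for thousands of characters.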


The obvious solution to this would be to cycle randomly between fonts every few characters (or keep a running total of the font used for each particular symbol, and ensure it stays below some threshold). For bonus points you could convert the key from Base64/ASCII to Unicode or similar.
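Something like the following could implement the running-total variant. It is only a sketch: the font names are placeholders, `assign_fonts` and `max_per_pair` are names invented here, and it only produces a (character, font) plan rather than rendering anything.

```python
import random
from collections import Counter

def assign_fonts(text, fonts, max_per_pair, seed=None):
    """Pick a random font for each character, never using the same
    (symbol, font) pair more than max_per_pair times."""
    rng = random.Random(seed)
    used = Counter()  # running total per (symbol, font) pair
    plan = []
    for ch in text:
        candidates = [f for f in fonts if used[(ch, f)] < max_per_pair]
        if not candidates:
            raise ValueError(f"not enough fonts to keep {ch!r} under the cap")
        font = rng.choice(candidates)
        used[(ch, font)] += 1
        plan.append((ch, font))
    return plan

# Placeholder font names, purely illustrative.
fonts = ["font-a", "font-b", "font-c"]
plan = assign_fonts("AAAABBBB0O0O", fonts, max_per_pair=2, seed=1)
for ch, font in plan:
    print(ch, font)
```

Note the obvious limit: a symbol that appears more than `len(fonts) * max_per_pair` times cannot be kept under the cap, so a long key needs either many fonts or a generous threshold.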


> The obvious solution to this would be to cycle randomly between fonts every few characters (or keep a running total of the font used for each particular symbol, and ensure it stays below some threshold).

This sounds like a useful defence in general against OCR re-use of particular things you might publish. I wonder if it could be done in a manner unobtrusive to the eye, but progressively more expensive to algorithms, either in terms of memory or time. This is really a neat idea you have.


In some fonts a big i and a small L look the same.


Doesn't matter. A few wrong characters can easily be brute-forced. Once you have enough of the characters, you can just write a program to try modifying a few of them until you get a key that works.
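A rough sketch of such a program, assuming you have some way to test a candidate. In practice that might be checking the private key against the site's public certificate; here a SHA-256 digest of a made-up string stands in for that check. The confusable groups come from the "o01|IO" example upthread, and `repair`, `alternatives` and `max_errors` are illustrative names.

```python
import hashlib
from itertools import combinations, product

# Groups of characters OCR is likely to mix up (from the example upthread).
CONFUSABLE = [set("0oO"), set("1lI|")]

def alternatives(ch):
    """Other characters that ch could have been misread from."""
    for group in CONFUSABLE:
        if ch in group:
            return sorted(group - {ch})
    return []

def repair(text, is_valid, max_errors=2):
    """Try swapping up to max_errors ambiguous characters until is_valid accepts."""
    if is_valid(text):
        return text
    suspects = [i for i, ch in enumerate(text) if alternatives(ch)]
    for k in range(1, max_errors + 1):
        for positions in combinations(suspects, k):
            for subs in product(*(alternatives(text[i]) for i in positions)):
                chars = list(text)
                for i, s in zip(positions, subs):
                    chars[i] = s
                candidate = "".join(chars)
                if is_valid(candidate):
                    return candidate
    return None  # nothing within max_errors substitutions worked

# Stand-in for the real check: compare against a known digest.
true_key = "MIIEow1BAAKCAQEAl0O"   # made-up string, not a real key
digest = hashlib.sha256(true_key.encode()).hexdigest()

def is_valid(candidate):
    return hashlib.sha256(candidate.encode()).hexdigest() == digest

ocr_output = "MIIEowlBAAKCAQEAl00"  # two misreads: 1 -> l and O -> 0
print(repair(ocr_output, is_valid))
```

The search only touches positions holding an ambiguous character, so the number of candidates stays small as long as the error count is low.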


Somehow I think the founder of an ultra-secure email provider knows this.


You would need 100% accuracy though. One char difference would render the key useless.


Not really, you've leaked so much of the key that doing exhaustive search of possible OCR errors becomes trivial.


Even at 4 point font (as mentioned elsewhere)?


4pt on a 600dpi printer is about 33 dots of character height (4/72 inch × 600 dpi), which should be more than enough to get a clean read.
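The arithmetic, for anyone who wants to plug in other sizes. This treats the point size as the full glyph height, which is a simplification (individual letters are usually shorter than the nominal size), and `dots_for_point_size` is just a helper name made up here.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch

def dots_for_point_size(point_size, dpi):
    """Printer dots spanned by the nominal font height."""
    return point_size / POINTS_PER_INCH * dpi

print(round(dots_for_point_size(4, 600), 1))  # 33.3
```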



