
I understand that hosted OCR, just like SaaS in general, is not suitable for every use case.

On the other hand, the OCR.space OCR API has a very strict privacy policy:

https://ocr.space/privacypolicy - All uploaded images and the extracted text are deleted immediately after processing.




> All uploaded images and the extracted text are deleted immediately

Until they are served with a subpoena for a particular client, or a sweeping subpoena to store everything forever, or the company is sold and the new parent has different values, or the company decides to mine customer data for advertising uses, or there's a bug in the software, or there's a long-lived cache of the data, or it gets into their backups accidentally or deliberately, or they don't keep the data but keep "just" the meta-data, or they do statistics or analytics before deleting the data, or they are hacked, or they simply change their minds.

In terms of privacy, even a non-free non-open-source local app with DRM or license management is better than a server app with a "strict privacy policy". With a good firewall setup, you can be pretty sure that the local app won't betray you.


It doesn't seem reasonable to blame them for an arbitrary potential future when they're currently doing the right thing.


"The best way to avoid privacy breaches is not to formulate a detailed privacy policy; it's to reduce your capabilities so that you're unable to violate anyone's privacy."

http://www.daemonology.net/blog/2012-01-19-playing-chicken-w...


No, but the description of the plugin should make it clear that data will be uploaded to a third-party server for recognition, so the user can make a choice about that.


It more or less does.

`For developers: Copyfish is published under the GPL open-source license. As OCR software, it uses the free OCR API from https://ocr.space/ .`


I don't find that clear at all. And this is also important to non-developers.

Also, for nearly all documents I ever need to scan, if they're important enough to require scanning, they're important enough that a third party should have nothing to do with them.

The main exceptions to the above are, ironically, documents without text: sketches, doodles, etc.


> when they're currently doing the right thing

You mean that we have to place some trust that they are. Some users cannot afford that kind of trust.


I suggest adding a big notification dialog that explains this when you first try to do an OCR request.
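
To make that concrete, here is a rough sketch of a first-use consent gate. It is written in Python only for illustration and is not Copyfish's actual code: the names `CONSENT_KEY`, `NOTICE`, `run_ocr_with_consent`, `ask_user`, and `upload` are all hypothetical, and a real extension would use its own storage and dialog APIs.

```python
from typing import Callable, MutableMapping, Optional

# Hypothetical storage key and notice text, made up for this sketch.
CONSENT_KEY = "ocr_upload_consent"
NOTICE = (
    "The selected image will be uploaded to a third-party server "
    "for text recognition. Continue?"
)


def run_ocr_with_consent(
    image: bytes,
    store: MutableMapping[str, bool],
    ask_user: Callable[[str], bool],
    upload: Callable[[bytes], str],
) -> Optional[str]:
    """Upload `image` only after the user has agreed at least once.

    `store` stands in for persistent settings, `ask_user` for the
    notification dialog, and `upload` for the call to the OCR service.
    Returns the recognized text, or None if the user declined.
    """
    if not store.get(CONSENT_KEY, False):
        # First use (or never accepted): show the notice before any upload.
        if not ask_user(NOTICE):
            return None  # nothing leaves the machine
        store[CONSENT_KEY] = True  # remember the choice for later requests
    return upload(image)
```

The point of passing `ask_user` and `upload` in as functions is that the upload cannot happen on any path that skips the dialog, and the gate can be tested without a network or a UI.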


Why did you end up going with a .space domain? We blocked that whole TLD because we were getting massive amounts of spam from it when it first came out.


Oh dear. My main domain and email are in the .space TLD. I hope your practice is not widespread.

Personally, I chose .space simply because it's cool, cheap, and not overcrowded. It also seems to lend itself well to being part of a name.

I know spam is a hard problem, but I wish you wouldn't label me a spammer simply because of the TLD I chose.


That's one of the problems with cheap domains in the sub-$5 range. Some gTLD registries (.space included) thought it was a good idea to offer them really cheap, but what they got were mostly spammers, which puts you in a bad neighborhood.

There are a few others which you may want to avoid according to this report: https://securityintelligence.com/enticing-clicks-with-spam/


The author's full comment:

> Why did you end up going with a .space domain? We blocked that whole TLD because we were getting massive amounts of spam from it when it first came out.

From your comment:

> I know spam is a hard problem, but I wish you wouldn't label me a spammer simply because of the TLD I chose.

The author is not "labeling you a spammer". They're simply stating a fact about their experience. In fact, the comment doesn't even mention you.


I'm not taking offence nor am I taking it personally. I was hoping my tone was clear on that (i.e. "I know [you have reason], but ..." and "I wish ...", which is just an expression of hope). Sorry if it came off as aggressive.

I only tried to highlight that they have, in effect, labeled everyone in .space (not just me, but me included) as a spammer.

It's heavy-handed, but I understand there are sometimes pressing needs for quick solutions, like when your mailboxes are being flooded with spam. Hence, the "I know ..." clause.



