Hacker News: freosam's comments

As others have mentioned, Transkribus works pretty well for handwritten text recognition. You can also train your own model if you have enough source material.

If your documents can be made public, you could upload them to Wikimedia Commons and use https://ocr.wmcloud.org/, where Transkribus is available. (Disclosure: I'm an engineer working on the Wikimedia OCR project.)


I'm also interested in this. I've been using SpiderOak for years, but am currently trying to migrate away (to rsync.net, coincidentally). It's not that I've ever had any issues with SpiderOak, but they don't seem to be a very engaged company (e.g. I've never heard of a SpiderOak person posting here on HN, whereas @rsync is never far away and is always friendly). It does sound like their efforts are in other directions.


There are a few OSM apps that make it easy to share tracks. I use OsmAnd, which allows easy recording, viewing, and uploading to OSM. OSMTracker might have better battery life.


I agree, Overland is good.

I recently added support for it to my blog (which uses twyne.rtfd.io) so I could more easily geolocate photos (from my non-GPS camera), and it was pretty easy to integrate. I tried GPSLogger too, but found that battery life was much better with Overland, and it also has a better system of queuing points when offline (with GPSLogger I found that it lost data at times such as when there was an internet connection but the server wasn't responding).
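
For anyone wondering how the photo matching can work: this is only a rough sketch, not the code my blog actually uses. It assumes the track is a time-sorted list of (time, lat, lon) tuples and that the camera clock is correct; the `locate` function and its data layout are made up for illustration. It linearly interpolates between the two track points on either side of the photo's timestamp, which ignores longitude wraparound at ±180°.

```python
from bisect import bisect_left
from datetime import datetime, timezone


def locate(track, when):
    """Estimate (lat, lon) at time `when` from a time-sorted track.

    `track` is a list of (datetime, lat, lon) tuples (a hypothetical layout).
    Returns None if `when` falls outside the recorded track.
    """
    times = [t for t, _, _ in track]
    i = bisect_left(times, when)
    # Exact hit on a recorded point.
    if i < len(times) and times[i] == when:
        return track[i][1], track[i][2]
    # Before the first point or after the last one: no estimate.
    if i == 0 or i == len(times):
        return None
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    # Fraction of the way from the earlier point to the later one.
    f = (when - t0) / (t1 - t0)
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)


track = [
    (datetime(2021, 11, 2, 12, 0, tzinfo=timezone.utc), 10.0, 20.0),
    (datetime(2021, 11, 2, 12, 10, tzinfo=timezone.utc), 11.0, 22.0),
]
photo_time = datetime(2021, 11, 2, 12, 5, tzinfo=timezone.utc)
position = locate(track, photo_time)
print(position)
```

The denser the logged points, the less the straight-line assumption matters, which is one reason reliable offline queuing is useful here.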


Me too. And at the same time it sets a "non-us" cookie, so maybe it's some sort of geolocation silliness.

Anyway, here's an archived version: https://web.archive.org/web/20211102003457/https://www.nbcne...


One of the indieweb approaches to feeds is to just structure the HTML sufficiently, and not have separate feed files. This works pretty well, and some feed readers work with it. Some info is at https://indieweb.org/h-entry
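
To give a feel for it, here is a minimal sketch of reading entries straight out of a page's HTML using only the Python standard library. It handles just the `h-entry`, `p-name` and `u-url` classes and skips most of the real microformats2 parsing rules (implied properties, nested microformats, relative URLs), so treat it as an illustration rather than a proper parser. The `HEntryParser` class and the sample markup are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HEntryParser(HTMLParser):
    """Collect the p-name text and first u-url href of each h-entry."""

    def __init__(self):
        super().__init__()
        self.entries = []        # finished entries
        self._entry = None       # entry currently being read, if any
        self._depth = 0          # nesting depth inside the current h-entry
        self._name_depth = None  # depth at which the p-name element opened

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._entry is None:
            if "h-entry" in classes:
                self._entry = {"name": "", "url": None}
                self._depth = 1
            return
        if tag in VOID_TAGS:
            return
        self._depth += 1
        if "p-name" in classes and self._name_depth is None:
            self._name_depth = self._depth
        if "u-url" in classes and self._entry["url"] is None:
            self._entry["url"] = attrs.get("href")

    def handle_endtag(self, tag):
        if self._entry is None or tag in VOID_TAGS:
            return
        if self._name_depth == self._depth:
            self._name_depth = None  # the p-name element just closed
        self._depth -= 1
        if self._depth == 0:         # the h-entry element itself closed
            self._entry["name"] = self._entry["name"].strip()
            self.entries.append(self._entry)
            self._entry = None

    def handle_data(self, data):
        if self._entry is not None and self._name_depth is not None:
            self._entry["name"] += data


SAMPLE = """
<article class="h-entry">
  <h2 class="p-name"><a class="u-url" href="https://example.org/first">First post</a></h2>
  <p class="e-content">Hello<br>world</p>
</article>
<article class="h-entry">
  <a class="u-url p-name" href="https://example.org/second">Second post</a>
</article>
"""

parser = HEntryParser()
parser.feed(SAMPLE)
for entry in parser.entries:
    print(entry["name"], entry["url"])
```

The point is that the same markup serves both browsers and feed readers, so there is no second file to keep in sync.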


> This boils down the difference between pigment and dye based inks. Dye based inks are more expensive but more resistant to UV light.

I think it might be the other way around: pigment-based inks are opaque particles bonded to the paper, and so even if their colour changes/fades they're likely to still be legible. Whereas dye-based inks, when they fade, can completely disappear. (One advantage of dye-based inks is that they have a larger gamut.)


That is interesting!

I wonder what the "100 year plan" will be. Flickr does seem like one of the few places that could conceivably store copyrighted photos for as long as it takes for them to become public domain (i.e. it's worth it, because they keep the full-res originals and lots of metadata).

Of course, the vast majority of photos on Flickr could probably be made CC-BY-SA now by their owners (i.e. everyone who's not making money out of their photography). I think more should be done to encourage people to do that.


Flickr has set up a non-profit "to properly preserve and care for the Flickr Commons archive, support Commons members […], and plan for the very long-term health and longevity of the entire Flickr collection."


Whitworth's method of scraping (mentioned in this article) is explained in this 1858 paper:

https://en.wikisource.org/wiki/Miscellaneous_Papers_on_Mecha...

