Here's my take on a solution to the problem of linkrot: https://www.purplerails....

vitovito · on Aug 31, 2014

Hi! Same thing here as with the other examples mentioned in this thread: this only helps you.

If you save a page but someone else needs it, they're out of luck.

But if, in addition to making you a private, encrypted archive, the service also checked whether the URL was publicly visible and, if so, made a WARC of it, it could package up all those WARCs for donation to the Internet Archive, and everyone would benefit.

purplerails · on Aug 31, 2014

> If you save a page but someone else needs it, they're out of luck.

There is a sharing feature to solve this problem. :)

But I agree with your point.

I actually looked into WARC earlier but didn't have the bandwidth to do it in my first version. When I implement the ability to download your data, I'll try hard to use WARC. Unless there's some brain damage in the format: I hope not! :)

vitovito · on Aug 31, 2014

You have to save the WARC-required stuff on the initial capture, because it's a dump of the client/server conversation as well as the content. But thanks for thinking about it!
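To illustrate why the raw exchange has to be kept at capture time, here is a minimal sketch of wrapping saved HTTP bytes in WARC/1.0 records using only the Python standard library. The function name `build_warc_record` and the sample request/response bytes are hypothetical, made up for this example. A real archive would normally also include a `warcinfo` record, payload digests, and per-record gzip compression, and would more likely use a dedicated WARC library than hand-rolled code like this.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(record_type: str, target_uri: str, http_bytes: bytes) -> bytes:
    """Wrap raw HTTP request or response bytes in one WARC/1.0 record.

    Hypothetical helper for illustration: it emits only the mandatory
    WARC header fields plus the target URI and content type.
    """
    if record_type not in ("request", "response"):
        raise ValueError("record_type must be 'request' or 'response'")
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        f"WARC-Type: {record_type}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {now}",
        f"WARC-Target-URI: {target_uri}",
        f"Content-Type: application/http;msgtype={record_type}",
        f"Content-Length: {len(http_bytes)}",
    ]
    # Header block, blank line, the untouched HTTP bytes, then two CRLFs.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    return head + http_bytes + b"\r\n\r\n"


# Assumed sample data: the exact bytes sent and received during capture.
raw_request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)

warc = (
    build_warc_record("request", "http://example.com/", raw_request)
    + build_warc_record("response", "http://example.com/", raw_response)
)
```

The record body is the HTTP message itself, status line and headers included, so a tool that only stored the rendered page content would have nothing to put there later.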

Here are some previous comments with links that might be useful:

https://news.ycombinator.com/item?id=6506032

https://news.ycombinator.com/item?id=6671152

carussell · on Sept 1, 2014

In Firefox 3, the default value for the lifetime of entries in the browser history was changed from 9 days to 99 days. In subsequent releases, it was changed to "indefinite, or whatever's reasonable, after applying some heuristics for the machine we're on".

A while back, I imagined bringing page content itself—and not just choice metadata like its URL and title—into the purview of the browser's history backend, too, effectively enabling WYGOIWYGO (What You Got Online Is What You Get Offline).

(I started off, funnily enough, not trying to imagine the next logical step in the "moar history" march, but instead with the Internet Archive in mind. I was trying to think of a way that would give ordinary plebs a zero-effort way to add to the Wayback Machine actual archived content in the same way that Alexa and the Internet Archive were slurping in data from the Alexa Toolbar about what pages are getting hits out there.)

After the stuff that happened last January with Aaron Swartz, I was even motivated to write up some use cases and gave it the codename "Permafrost":

> Ashley just wants all the content he bookmarks (or simply accesses) to be always available to him, without being frustrated months or years from now by 404s, service shutdowns, and web spiders stretched too thin, allowing his favored content to slip through the cracks of their archiving efforts.

- https://wiki.mozilla.org/Permafrost

Even so, it remains one of those projects that I should really get around to kicking off someday, but may never end up starting, much less get close to "completing".

pain · on Aug 31, 2014

If I need to rely on your private domain to access my own research, how is it different from, or less risky than, Diigo etc.? I'm waiting for an extension that lets me keep a full copy of my own data and activity (and optionally use your cloud too, in addition).

purplerails · on Aug 31, 2014

Understood. Thx for the comment. The ability to download your data in a well-documented format (possibly WARC) is coming soon. I hope you will try out PurpleRails in the meanwhile. Thx again!