
The web has nothing built-in for archiving and versioning. It's a gaping hole in this technological platform, one that has been noted and criticized for a very long time. The reality of this problem, however, is vehemently denied by the current generation of "technologists". Of course, those are the same people who get six-figure salaries for managing complexity they themselves create - partly through hyper-centralization. Good and resilient archival, on the other hand, necessarily implies some level of decentralization.

I single-handedly maintain a 14-year-old website that used to be a modestly popular web magazine. It's not very expensive, but it's a pain. The DNS system is horrible, and it's easy to lose domain names to some nonsense. (I lost one that used to be a free second-level domain when it was converted to a paid-for zone. Not a matter of money, just paperwork.) Server management is a time drain. Stuff like adding an SSL certificate to a legacy VPS can lead to a cascade of updates and config changes that can take days to make and test.

BTW, everyone sings the praises of archive.org (and deservedly so), but most people here do not seem to realize that they are also a centralized platform that can collapse and take everything down with them. Who archives the archives, etc. Fortunately, they are not the only one of their kind. Unfortunately, it's all very ad hoc.

If the W3C weren't a bunch of corporate shills, there would have been a standard for creating versioned web archives, like, ten years ago. It's obvious that we need one.




'Archival' is an adjective. The gerund 'Archiving' has done good service as a noun for decades. (I will die on this hill.)


Fair enough. Updated.


> Who archives the archives, etc.

Another archive:

https://www.bibalex.org/en/project/details?documentid=283

There are also partial distributed backups by volunteers:

http://iabak.archiveteam.org/


I'm interested in IPFS for allowing sites to be archivable. If a site I was interested in was hosted on IPFS, then I could mirror their content and help serve it over IPFS. If the original host goes down, I'll still be able to help host the content at its original URL, and the URL will still work for anyone else in the world who tries to follow it. And then maybe people will re-host my own content in the same way, even long after I'm gone, if my content is good enough.
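
Mirroring somebody's IPFS-hosted site is basically one "pin" call against your own node. A rough sketch in Python, assuming a local Kubo (go-ipfs) daemon with its HTTP RPC API on the default port 5001; the CID below is just a placeholder, not a real site:

    # Pin (mirror) a site snapshot on the local IPFS node so it stays reachable
    # at its original content address even if the original host disappears.
    # Assumes a local Kubo daemon; the CID is a placeholder.
    import requests

    CID = "bafybeigdyr..."  # content identifier of the site snapshot (placeholder)
    API = "http://127.0.0.1:5001/api/v0"

    resp = requests.post(f"{API}/pin/add", params={"arg": CID})
    resp.raise_for_status()
    print(resp.json())  # e.g. {"Pins": ["<the CID>"]}

Once pinned, the node fetches the content and keeps announcing it, so anyone resolving the same CID can get it from your node as well as from the original host.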


Interesting post. So are you saying that if there was a good standard for versioned web archives, then you could stop maintaining your website and just point people to the archives?


Yup, that's the idea behind projects like https://github.com/HelloZeroNet/ZeroNet and https://github.com/oduwsdl/ipwb.


I would still maintain the website, but it would be much easier, because I could lean on archival features when that makes sense, instead of trying to keep everything "stable" manually.


Nobody could have predicted the global growth of internet users and the sheer quantity of data being created on a per-second scale. Exabytes when? And then what?


There were plenty of people predicting it, pointing out its deficiencies, and explaining what needed to be done. In terms of very high-level ideas, Alan Kay comes to mind.


Let me guess, you didn't provide references because the sites predicting the growth of the internet were not archived? :-)


Ted Nelson was complaining constantly about the deficiencies. Pretty much nailed the issues too. Unfortunately his solutions were difficult to implement.


If you want a non-centralized solution, check out https://archivebox.io or https://github.com/webrecorder/pywb.

(also there is a standard for web archives: WARC)
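
For what it's worth, WARC is also easy to work with programmatically. A minimal sketch using the warcio library (the URL and payload here are made-up examples):

    import io
    from warcio.warcwriter import WARCWriter
    from warcio.statusandheaders import StatusAndHeaders
    from warcio.archiveiterator import ArchiveIterator

    # Write one captured HTTP response into a gzipped WARC file.
    with open('example.warc.gz', 'wb') as out:
        writer = WARCWriter(out, gzip=True)
        http_headers = StatusAndHeaders('200 OK',
                                        [('Content-Type', 'text/html')],
                                        protocol='HTTP/1.1')
        record = writer.create_warc_record('http://example.com/', 'response',
                                           payload=io.BytesIO(b'<html>hello</html>'),
                                           http_headers=http_headers)
        writer.write_record(record)

    # Read it back: each record carries the original URL and capture date.
    with open('example.warc.gz', 'rb') as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type == 'response':
                print(record.rec_headers.get_header('WARC-Target-URI'),
                      record.rec_headers.get_header('WARC-Date'))

Tools like pywb can replay files like this, which is what makes WARC workable as an interchange format between archives.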



