
Linkrot is a real problem, especially for sites that disappear before the archive can get to them.

On another note, the more dynamic the web becomes, the harder it will be to archive. So if you think the 1994 content is a problem, wait until you live in 2040 and want to read some pages from 2017.




Turns out the solution to every stack overflow post will be

"JavaScript is required"


Content from Stack Overflow has better odds of surviving than this: they've uploaded a data dump of all user-contributed data to archive.org: https://archive.org/details/stackexchange. It's all plaintext. This is really generous of Stack Exchange and shows they care about the long term.


I assume the anonymisation is just on votes? It doesn't seem 100% clear at first glance.


That's actually one of the reasons all my personal stuff gets built as HTML/CSS, with JavaScript used only for quality-of-life stuff (image lightboxes that work without putting #target in browser history, auto-loading a higher-res image, that sort of thing).

I know I won't be maintaining it forever, but I want it to be accessible through the archive.


Server-side rendering please save us.


Well, there's now Chrome headless which is slowly edging out PhantomJS for such use cases.
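For anyone curious what that looks like in practice, here is a rough Python sketch that shells out to headless Chrome and captures the rendered DOM. It assumes a Chrome or Chromium binary is on PATH under one of the guessed names below; `find_chrome`, `dump_dom_command`, and `render` are names made up for this example, and `--headless`, `--disable-gpu`, and `--dump-dom` are the only browser flags it relies on.

```python
import shutil
import subprocess
from typing import List, Optional

# Binary names to try; which one exists depends on the OS and install method (assumption).
CANDIDATE_BINARIES = ("google-chrome", "chromium", "chromium-browser", "chrome")


def find_chrome() -> Optional[str]:
    """Return the path of the first Chrome/Chromium binary found on PATH, if any."""
    for name in CANDIDATE_BINARIES:
        path = shutil.which(name)
        if path:
            return path
    return None


def dump_dom_command(binary: str, url: str) -> List[str]:
    """Build the command line that prints the serialized DOM of `url` to stdout."""
    return [binary, "--headless", "--disable-gpu", "--dump-dom", url]


def render(url: str, timeout: float = 30.0) -> str:
    """Run headless Chrome once and return the rendered HTML as text."""
    binary = find_chrome()
    if binary is None:
        raise RuntimeError("no Chrome/Chromium binary found on PATH")
    result = subprocess.run(
        dump_dom_command(binary, url),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if the browser exits with a non-zero status
    )
    return result.stdout
```

Calling `render("https://example.com/")` starts a fresh browser process per page, which is fine for a one-off crawl but is exactly the cost the replies below are worried about for per-request rendering.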


Are you suggesting running an entire web browser on the server-side just to render each client request to HTML?

That VC money must really be flowing...


This article was posted on HN a while back suggesting exactly that: https://hackernoon.com/leaner-alternatives-to-server-side-re...

It's just as ridiculous as it sounds.


No. It's only needed to play back archives of web pages that only work with old JavaScript libraries enabled.

But then again there's WebRecorder for exactly that.

https://webrecorder.io/


It's actually fairly easy to record web sites despite how dynamic they are; all you have to do is save the response data of each XHR (and similar requests) and the rest of the state (cookies, URLs, date/time, localStorage, etc.).

For even more accuracy, save a Chromium binary of the version at the time so it'll look exactly as intended.
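A minimal sketch of that record/replay idea in Python, just to make it concrete. It is not how any particular archiver works: `Recording`, `request_key`, `record`, and `replay` are hypothetical names, the `fetch` callable stands in for whatever actually performs the request, and matching requests on method + URL + body is a simplifying assumption (real pages also vary on headers, cookies, and time).

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


def request_key(method: str, url: str, body: str = "") -> str:
    """Stable key for a request: hash of method, URL, and body."""
    raw = json.dumps([method.upper(), url, body])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


@dataclass
class Recording:
    # Page state captured alongside the responses (cookies, URL, date/time, localStorage).
    state: Dict[str, str] = field(default_factory=dict)
    # Response bodies keyed by request_key().
    responses: Dict[str, str] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the whole recording so it can be written to disk."""
        return json.dumps({"state": self.state, "responses": self.responses}, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "Recording":
        """Rebuild a recording from the output of to_json()."""
        data = json.loads(text)
        return cls(state=data["state"], responses=data["responses"])


def record(rec: Recording, fetch: Callable[[str, str, str], str],
           method: str, url: str, body: str = "") -> str:
    """Perform the request via `fetch`, store the response, and return it."""
    response = fetch(method, url, body)
    rec.responses[request_key(method, url, body)] = response
    return response


def replay(rec: Recording, method: str, url: str, body: str = "") -> Optional[str]:
    """Return the stored response for this request, or None if it was never recorded."""
    return rec.responses.get(request_key(method, url, body))
```

During capture you would route every XHR through `record`; during playback the page's requests go to `replay` instead of the network, and anything that returns `None` is a request the recording never saw.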



