While the author(s) are still alive, they are often a productive contact. (In on...

mrybczyn · on Dec 9, 2022

Good (Edit) point! It's good that the web accepts dead links by design; we can't expect perfection from our distributed information. But the rate of information bitrot seems too high given the storage technologies available.

2 spinning rust drives can store the Library of Congress. ~2,000 drives would store the web (1). How many millions of these drives get manufactured per year? Our technology systems are failing us - all those words are being lost, like tears in rain.

(1) Back of the envelope estimation: https://www.worldwidewebsize.com/ estimates ~50 billion websites, with some estimates of ~6 pages of information per website. Let's say 1 MB per page on average. So ~2,000 drives would store the entire web.
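
For anyone who wants to poke at the estimate, here is a minimal Python sketch. The 20 TB drive capacity is my assumption (no capacity is stated above), and `drives_needed` is just a hypothetical helper name. The answer swings a lot depending on whether the 50 billion figure counts websites (times ~6 pages each) or individual pages.

```python
MB = 10**6   # bytes, decimal units
TB = 10**12  # bytes, decimal units


def drives_needed(pages, bytes_per_page=1 * MB, drive_bytes=20 * TB):
    """Whole drives needed to hold `pages` pages (hypothetical helper).

    drive_bytes defaults to 20 TB, which is an assumption, not a sourced figure.
    """
    total_bytes = pages * bytes_per_page
    # Integer ceiling division: round up to a whole number of drives.
    return -(-total_bytes // drive_bytes)


# Reading 1: 50 billion websites x 6 pages each.
sites_reading = drives_needed(50 * 10**9 * 6)

# Reading 2: 50 billion pages in total.
pages_reading = drives_needed(50 * 10**9)

print(sites_reading, pages_reading)
```

Under these assumptions the second reading gives 2,500 drives, which is in the neighborhood of ~2,000, while the first gives 15,000. Changing `bytes_per_page` or `drive_bytes` shifts the result proportionally.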

8bitsrule · on Dec 9, 2022

2000 drives x $100 = $200,000. Double that for backup, $400,000. Admin, maintenance, let's say total, $1M/year. So, Wikipedia could end its own deadlink problem (IF the reference sources would agree).
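
The same hardware arithmetic as a rough Python sketch, so the inputs are easy to vary. `hardware_cost` is a hypothetical helper; the $100 per drive and the doubling for backup are the guesses from this comment, not real quotes, and the $1M/year total for admin and maintenance is left out because it is a pure guess.

```python
def hardware_cost(drives=2000, price_per_drive=100, copies=2):
    """One-time drive cost in dollars (hypothetical helper).

    copies=2 means one primary set plus one backup set.
    """
    return drives * price_per_drive * copies


print(hardware_cost())          # 400000
print(hardware_cost(copies=1))  # 200000
```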

But stuff goes missing at Wayback because people don't agree to their pages being backed up. Copyright, whatever. So it's like Global Heating: the tech is there, but people just can't agree. So 'pirate' backer-uppers go to jail. And island nations and expensive ocean-side properties are being submerged. So it goes.

toomuchtodo · on Dec 9, 2022

The Internet Archive has a bot that updates dead Wikipedia references to point to archived content.

https://meta.wikimedia.org/wiki/InternetArchiveBot

brewdad · on Dec 9, 2022

Even if that estimate is off by an order of magnitude, which given the weight of modern web pages it easily could be, 20,000 drives to store the entire web seems way more doable than I ever would have imagined.