
I'm not fundamentally misunderstanding anything. We both understand the problem quite well; you're just refusing to take on the burden of trying to solve it.

I didn't suggest not storing any logging data, actually. I suggested deleting it. Old, stale logs are unnecessary for operation uptime or debugging and low-value for usability or security investigation.

They also cumulatively present risks to customers. A gay blogger in Russia who used LiveJournal in 2004 might regret their decision now, even though in 2007, when LiveJournal was sold to a Russian company, few reasonable people would have foreseen the country's later turn towards homophobia. If LiveJournal had, for example, replaced all IP addresses with cities in historical, pre-2007 HTTP logs, they would have lost nothing of value to them while their customers would be that much safer. If they had gone so far as to keep only aggregated statistics of requests and unique visitors per tuple of (user agent, city, timestamp truncated to 15-minute intervals), and then deleted all detailed HTTP logs older than 90 days as suggested by Maciej Ceglowski [1], can you think of anything of value they would have lost?
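To make that concrete, here is a rough Python sketch of the aggregation and deletion described above. It is only an illustration under stated assumptions: the log entry shape, the `CITY_BY_IP` lookup table, and the function names are all hypothetical, and a real system would resolve cities with a proper GeoIP database rather than a dict.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical IP-to-city table (documentation-range IPs, made-up mapping).
CITY_BY_IP = {"203.0.113.7": "Moscow", "198.51.100.4": "Kazan"}


def truncate_15min(ts):
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def aggregate(entries):
    """Count requests and unique visitors per (user agent, city, interval).

    Each entry is assumed to be a dict with "ip", "user_agent" and "ts" keys.
    IPs are used only to count unique visitors and are not kept in the result.
    """
    requests = defaultdict(int)
    visitors = defaultdict(set)
    for entry in entries:
        city = CITY_BY_IP.get(entry["ip"], "unknown")
        key = (entry["user_agent"], city, truncate_15min(entry["ts"]))
        requests[key] += 1
        visitors[key].add(entry["ip"])
    return {
        key: {"requests": requests[key], "unique_visitors": len(visitors[key])}
        for key in requests
    }


def drop_old(entries, now, max_age_days=90):
    """Keep only detailed entries newer than the retention window."""
    cutoff = now - timedelta(days=max_age_days)
    return [entry for entry in entries if entry["ts"] >= cutoff]
```

The idea would be to run `aggregate` over the detailed logs first, store only its output long-term, and then apply `drop_old` so nothing detailed survives past 90 days.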

But of course I'm sure they didn't, because they weren't required to put that much thought into the data they stored.

I'm saying we should be required to.

[1]: https://idlewords.com/talks/haunted_by_data.htm



