
Well, there is one thing "terribly damning," and that's the "Deleted Wall Posts" bit.

This was also discovered when someone in Europe requested all of the data that Facebook had on him: the stuff he had deleted had never actually been deleted, but was instead stored with a flag "deleted=True" which the database queries checked.

In principle, at least for short time scales, this is not a bad idea -- you should try to make most or all of your database manipulations reversible, just in case someone steals someone else's account (or other similar abuses). But for long time scales, you would really expect that it would eventually get purged -- and as far as anybody knows, it never is.
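For anyone who hasn't seen the pattern, here is a minimal sketch of a reversible "delete" using a flag that every read query checks. The table name, column names, and helper functions are made up for illustration; nothing here is Facebook's actual schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical wall_posts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wall_posts ("
        " id INTEGER PRIMARY KEY,"
        " author TEXT NOT NULL,"
        " body TEXT NOT NULL,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def add_post(conn, author, body):
    cur = conn.execute(
        "INSERT INTO wall_posts (author, body) VALUES (?, ?)", (author, body)
    )
    return cur.lastrowid


def soft_delete(conn, post_id):
    # "Delete" only sets the flag; the row itself stays in the table.
    conn.execute("UPDATE wall_posts SET deleted = 1 WHERE id = ?", (post_id,))


def restore(conn, post_id):
    # Reversal is just clearing the flag, e.g. after an account hijack.
    conn.execute("UPDATE wall_posts SET deleted = 0 WHERE id = ?", (post_id,))


def visible_posts(conn):
    # Every user-facing query has to filter on the flag.
    rows = conn.execute(
        "SELECT id, body FROM wall_posts WHERE deleted = 0 ORDER BY id"
    )
    return rows.fetchall()


def stored_count(conn):
    # What is actually still held, flag or no flag.
    return conn.execute("SELECT COUNT(*) FROM wall_posts").fetchone()[0]
```

The gap between `visible_posts` and `stored_count` is exactly the complaint: the user sees the post disappear, but the row is still there until something physically removes it.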




You can't purge backups: if the servers were allowed to go in and remotely modify the off-site backups to purge deleted data, then an intruder could do the same thing for all data. So you can't permanently delete data in the backups. When subpoenaed, Facebook would have to go back through the backups and find the user-deleted data anyway: they still have it under their control. So there's really no difference between setting a "deleted" flag and purging the data from the production servers: they are not (and should not, and arguably cannot be) able to purge the data from backups.

This is true for all web services. I'm not even getting into the support and user-anger costs of permanently deleting user data. But it's clear that this, while an intuitive idea, is not practically possible for most web services.


I would agree that there are privacy concerns whenever you back up user data. It's something of an interesting question because users could also be held responsible for creating their own backups, especially if you made it extremely easy for them to do. It's something like when my VPS failed. The code that I was running on my VPS I routinely backed up, but the database -- while I serialized it to disk -- wasn't ever transferred to my local computer because I didn't think it was important. When the VPS failed, the whole database was gone. The point is, I can't be angry at them for not backing it up, because that wasn't a term of the service they were providing me. A similar mental model could probably work for database-driven sites, at least for the databases storing user content -- your code should of course always be under mirrored version control. ^_^;;


You should still delete it from the main database, even if you defer it for a week via deleted=1 for performance reasons.

And then it's easy to purge ALL user data backups more than X months old.
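Roughly what I mean, as a sketch. The one-week grace period comes from above; the 90-day backup retention stands in for "X months" and is an arbitrary number, and the dict layout and function names are invented for the example.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(days=7)       # how long a flagged row may linger
BACKUP_RETENTION = timedelta(days=90)  # assumed value for "X months"


def purge_expired(posts, now):
    """Keep live posts and recently flagged ones; drop the rest for good.

    Each post is a dict whose "deleted_at" is None while the post is live,
    or the datetime at which the user deleted it.
    """
    return [
        p for p in posts
        if p["deleted_at"] is None or now - p["deleted_at"] < GRACE_PERIOD
    ]


def prune_backups(backup_times, now):
    """Keep only backups young enough to fall inside the retention window."""
    return [t for t in backup_times if now - t <= BACKUP_RETENTION]
```

Run both from a daily job and a deleted post is gone from production after a week and from every backup once the oldest snapshot containing it ages out, without anything ever editing a backup in place.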


I thought that was always how it's done? As a side project I'm making a public forum where people can delete each other's posts (it's mostly an experiment) and I'm still keeping the data that's deleted, just not showing it. I didn't think companies ever really drop entries from their database when users "delete" something.


Like I said, normal practice is to make "delete" reversible, which means that you must add something as opposed to subtracting something. But in principle there is probably some duty of privacy that when you say "I've deleted X," then you actually are intending to delete X. Anything else would seem problematic.


What are you hoping will come of that experiment? Seems unruly.


Well there's a bit more to it - it's an imitation of the poster walls you see in coffee shops and around schools. Posts take up space on a limited area, so in order to make a post you have to find an empty spot or cover up someone else's post. You can also take anything down, so it's community moderated. I want to see how civilized people can be expected to act anonymously. I'll probably share it on HN if I get any users :P
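The core of it is roughly the sketch below. This is a simplified illustration rather than the real code: posts are axis-aligned rectangles on an integer grid, and the class and method names are just for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    x: int
    y: int
    w: int
    h: int


def overlaps(a, b):
    """True if two rectangles share any area (touching edges do not count)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


class Wall:
    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.posts = []      # oldest first; later posts sit on top
        self._next_id = 1

    def pin(self, x, y, w, h):
        """Pin a post; return its id and the ids of posts it covers."""
        if (w <= 0 or h <= 0 or x < 0 or y < 0
                or x + w > self.width or y + h > self.height):
            raise ValueError("post does not fit on the wall")
        post = Post(self._next_id, x, y, w, h)
        self._next_id += 1
        covered = [p.post_id for p in self.posts if overlaps(p, post)]
        self.posts.append(post)
        return post.post_id, covered

    def take_down(self, post_id):
        """Anyone can remove any post, which is the moderation mechanism."""
        self.posts = [p for p in self.posts if p.post_id != post_id]
```

Pinning never fails because a spot is taken; it only reports what got covered, which is the whole point of the experiment.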


How do you think you're going to get users? Share it now :)


Yeah, that seems like fun. I'll definitely want to check it out.


If they purged them, they wouldn't be able to supply them to law enforcement.


Correct. (And it is the contrapositive of this statement which allows us to know that they did not purge them.) But they are not required to collect arbitrary data for later supplying to law enforcement -- law enforcement can subpoena you when it has a reason, but if you don't have information then they can't demand it of you.

If it helps, this is why the various alternatives to Google are able to announce that they don't collect user data: they have no obligation under US law to be collecting such information, so, they don't.



