There is also a user-valuable reason to not do hard deletes. Doing a soft delete...

pinusc · on April 28, 2020

You can always hard delete all the data _and_ keep track of deleted users so that their usernames can't be reused.

Once you have hard delete, this solution is almost trivial and by far the most user-valuable.
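A minimal sketch of that idea, assuming a SQLite store. The table names `users` and `retired_usernames` and the helper functions are hypothetical, made up for illustration: the user row is really deleted, and only the bare username is kept so it can't be registered again.

```python
import sqlite3


def make_db():
    # Hypothetical schema: all personal data lives in `users`;
    # `retired_usernames` holds nothing but the name itself.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE users (username TEXT PRIMARY KEY, email TEXT, bio TEXT);
        CREATE TABLE retired_usernames (username TEXT PRIMARY KEY);
    """)
    return db


def register(db, username, email):
    # A name is unavailable if it is in use now or was retired earlier.
    taken = db.execute(
        "SELECT 1 FROM users WHERE username = ? "
        "UNION ALL "
        "SELECT 1 FROM retired_usernames WHERE username = ?",
        (username, username),
    ).fetchone()
    if taken:
        return False
    db.execute(
        "INSERT INTO users (username, email, bio) VALUES (?, ?, '')",
        (username, email),
    )
    return True


def hard_delete(db, username):
    # Delete the whole row and retire the name in one transaction.
    with db:
        cur = db.execute("DELETE FROM users WHERE username = ?", (username,))
        if cur.rowcount:
            db.execute(
                "INSERT INTO retired_usernames (username) VALUES (?)",
                (username,),
            )
```

After `hard_delete(db, "alice")`, the `users` table holds nothing about alice, yet `register(db, "alice", ...)` still returns `False`.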

DevX101 · on April 28, 2020

> keep track of deleted users so that their usernames can't be reused

This seems to violate GDPR, no? An attacker attempts to create an account (say: victim@gmail.com) on AshleyMadison and is prevented because the server tracked past users. The attacker could then demonstrate that victim@gmail.com was at one point a user on AshleyMadison.com.

ecnahc515 · on April 28, 2020

As others have mentioned, that's an issue already. The solution is to never acknowledge if a user does or doesn't exist on register/sign-up/forgot-password pages and simply state that instructions have been emailed to you in all cases. The key is that you don't act differently if the user does or doesn't exist.
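Roughly, as a sketch: the function names, the in-memory `accounts` set and the `outbox` list below are all hypothetical stand-ins for a real user store and mailer. The point is only that the caller gets the same reply on every branch, and anything account-specific goes to the mailbox owner.

```python
GENERIC_REPLY = "If the address is valid, instructions have been emailed to you."


def request_signup(accounts, outbox, email):
    """Handle a sign-up request without revealing whether the account exists."""
    email = email.strip().lower()
    if email in accounts:
        # Existing account: tell the owner by email, not the requester.
        outbox.append((email, "Someone tried to sign up with your address."))
    else:
        outbox.append((email, "Please confirm your new account."))
    return GENERIC_REPLY


def request_password_reset(accounts, outbox, email):
    """Handle a reset request; the reply is the same for known and unknown emails."""
    email = email.strip().lower()
    if email in accounts:
        outbox.append((email, "Use this link to reset your password."))
    # Unknown address: send nothing, but answer exactly the same way.
    return GENERIC_REPLY
```

Someone probing with an address they don't own only ever sees `GENERIC_REPLY`; what lands in the inbox is visible only to whoever controls that inbox.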

cutemonster · on April 29, 2020

If they don't get a verification / instructions email, won't they know that an account with the username they typed existed?

ecnahc515 · on May 12, 2020

In this case, where you're probing for user names or emails, you don't own the email, so you wouldn't receive the verification yourself, and thus wouldn't know if the account exists.

This is exactly why most password reset emails say "if you didn't request this, please let us know, as someone may be attempting to access your account".

lytefm · on May 1, 2020

You shouldn't use usernames in that scenario, just emails. After signup, you just show a general message that a confirmation email has been sent. If the account already exists, some policy to notify the account owner can be put in place.

lostapathy · on April 28, 2020

Verifying the email keeps someone from hijacking the account without leaking that an account formerly existed. At least so long as their email isn't also compromised - in which case they have bigger problems.

RandomBacon · on April 28, 2020

That's not much different from not being able to create an account with victim@gmail.com because victim@gmail.com already has an account. Both instances leak information.

crdrost · on April 28, 2020

You don't have to track their emails unless you are reusing emails as usernames. Just tracking the username suffices.

This is also one of those situations where people often put too much shit in the user table. "we have to delete the user row" -- I mean, you have to delete some of the user row, yes.

I like to solve this by proper namespacing. Suppose you instead deliberately have an authUser table which just has what you need for auth -- a UUID to hook into the rest of the system, salts and passwords for direct logins, maybe a nullable date "banned_until" if you want banning; assuming you use crypto bearer tokens rather than an auth tokens table, then you also want a date column for "tokens last reset on"; etc. You can put the username in there just fine, that's needed for auth. Maybe you let people log in with email+password and thus you also put their email address in there, also fine.

As long as the authUser table does not grow to encompass all of your other business logic you are good. Other tables foreign key to authUser and you delete rows from them and that doesn't upset the foreign key. You leave the row in authUser to indicate that the username is taken.

An additional "deleted" field on authUser can be used to block logins and thus the username is taken but they can't log in. As for the email address, even if you insist on a UNIQUE and NOT NULL constraint for it (and I would find this surprising in an age where we log in a lot with social media) you can auto purge by setting it to CONCAT(id, "@purged.example") and then you have a valid email address which is nowhere else used in your auth flow, no personally-identifiable information at all. Heck then you don't even need the boolean flag if you would rather forbid the .example TLD from logging in.

So that has worked well for me in the past and it seems to solve those sorts of problems with only a little tweak. The key is that the PII need is to delete the "user row" but that does not have to be the authUser row -- if you separate the two rows out then you can leave the authUser row while still having a table appUser which lives in your application and contains all the cool stuff about this user using that app. It also naturally lends itself to you thinking about a sort of SSO for all of your different applications up-front.
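To make the split concrete, here is a rough sketch using SQLite from Python. The column names beyond the ones mentioned above (`password_hash`, `display_name`, `bio`) and the helpers `make_db`, `purge_user` and `can_log_in` are assumptions for illustration, not a definitive schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE authUser (
    id TEXT PRIMARY KEY,                 -- UUID used by the rest of the system
    username TEXT NOT NULL UNIQUE,
    email TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL,         -- assumed name for salt + password
    banned_until TEXT,                   -- nullable date
    tokens_last_reset_on TEXT,
    deleted INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE appUser (
    auth_id TEXT PRIMARY KEY REFERENCES authUser(id),
    display_name TEXT,                   -- hypothetical profile fields
    bio TEXT
);
"""


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")
    db.executescript(SCHEMA)
    return db


def purge_user(db, auth_id):
    # Delete the application row; keep the auth row so the username stays taken.
    with db:
        db.execute("DELETE FROM appUser WHERE auth_id = ?", (auth_id,))
        db.execute(
            "UPDATE authUser "
            "SET email = id || '@purged.example', deleted = 1 "
            "WHERE id = ?",
            (auth_id,),
        )


def can_log_in(db, username):
    # Password checking is omitted; this only shows the "deleted" gate.
    row = db.execute(
        "SELECT deleted FROM authUser WHERE username = ?", (username,)
    ).fetchone()
    return row is not None and row[0] == 0
```

Deleting from appUser first never trips the foreign key, since the authUser row it points at is still there; the UNIQUE constraint on `username` is what keeps the name from being reused.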

bryanrasmussen · on April 28, 2020

The real GDPR problem is if the user has asked to delete data and you do this soft delete but keep all their old data as well, and then someone hacks your system and gets that data.

You're obligated by GDPR to disclose to affected parties that their data has been compromised, but you were also obligated to delete the data by GDPR.

abiogenesis · on April 28, 2020

Renaming your username to username-1000 has the exact same side effect though.

mcv · on April 28, 2020

Yeah, that is about the worst possible way to do it. If you can't do a hard delete for whatever reason, the right way to do it is to set a flag that prevents any activity on that account. They can keep the name in order to prevent anyone else from stealing it, but still delete all the profile data attached to the account.