Instead of deleting account, NYT appends ‘1000’ to username and email address

mistyq · on April 28, 2020

How was it possible to discover it, though?

samthecoy · on April 28, 2020

From the twitter comments: https://twitter.com/bicycult/status/1255122953798328320

They were still logged in and refreshed the page; they found out by going to their user settings.

jasonbarone · on April 28, 2020

I've seen this same method used on multiple apps I've requested an account deletion on. It's super frustrating. Most companies either don't respond, say they deleted it when they merely disabled it, or update the account name to something else.

three_seagrass · on April 28, 2020

Disabling is understandable, because as a company you need a record of transactions or interactions such as TOS agreements for legal purposes. This requires keeping the records.

Changing the object's name data, though, is a terrible way to implement this.

malinens · on April 29, 2020

Actually, GDPR forces you to offer an option to delete all data, not just disable the user.

ptman · on April 29, 2020

It's a bit more complicated than that. You have a period of time to delete the data. And you can keep enough info to know what data you've deleted, so that if you restore from backups you are able to re-delete without having to go through your backups and delete everything. Probably more that I'm forgetting.

three_seagrass · on April 29, 2020

You can keep contact info even after a GDPR deletion request, so long as you're not using it for business purposes.

Otherwise imagine how easy it would be to violate the deletion request if you're running a business and can't remember the names of the people you had deletion requests for. Their data could come up again through normal channels and you'd treat them no differently than another sales contact, thus violating GDPR.

Asuchug4 · on April 29, 2020

Depends on the type of data and on other laws. If you are a paying customer you can assume your data will stay in the database until it is no longer required for audits. GDPR allows for anything that is 'absolutely totally required for providing service'.

speleding · on April 29, 2020

In most countries you are required to keep payment records for tax purposes for 5 years or more. As this is a business necessity this trumps the GDPR. And since most business involves some kind of payment it's likely most businesses will not actually fully delete the information they have on file for you.

KeepFlying · on April 28, 2020

I noticed a similar thing being done for Bird scooters a while back. I forget the suffix but they did the same and I noticed because I was still authed on my phone after requesting deletion. My token has expired since then though so for all I know they have fully deleted the account since.

ashtonkem · on April 28, 2020

Having known people who worked at Bird, I doubt it. They had a real culture of “hack it up and then move on”, going back and fixing stuff like that isn’t in their culture.

berkes · on April 29, 2020

I once received an automated email to 'deleted@example.com'. Where example.com is my domain. I employ catchall, so that I can generate a new mail for each service. I contacted them and they apologised. The CTO personally explained that this was legacy they lost track of and thanked me for pointing out.

You catch a lot with catchall: data leaks, hacks, sneaky data sales etc. When you suddenly receive, say, marketing mail for shirts on 'jeansonline@example.com', something fishy is going down.

HenryBemis · on April 29, 2020

I have been doing the same since 2001, the amount of crap the net catches is unfathomable. Apparently there is a company somewhere on this planet with the same name as my last name "bemis.com" while my domain is "bemis.net" and I sometimes get invoices, CVs, PowerPoint presentations, emails from their external auditors.. I am waiting for the day they will ask me to buy my domain (which I use long before their company was created)(oh and it is my last name.. so good luck with that).

Having done enough security audits, "this was legacy" is a BS excuse. I will go ahead and assume that NYT have an audit department. And that audit dept runs through the full audit universe every 4-5 years. Someone would have captured that a long time ago (1st, 2nd, 3rd lines)(external auditors)(any sales pitch: "we have 438264728 subscribers")

I call BS. They got busted and now they most likely change this from 1000 to 2000 and call it a day..

Ps: bemis is not my real last name.. but I am using a super cool name over here!!

gramakri · on April 28, 2020

Catch all addresses? If you self-host email (we do), you are able to catch email with typos etc with a catch all address

DevX101 · on April 28, 2020

Doing real deletes on user accounts is a surprisingly challenging problem and I'd be willing to bet very few companies do real deletes where all of your data is wiped permanently from the company. For legal and financial reasons, companies often need to keep track of historical user activity. If a company states in their investor quarterly report that they had 1M active users, they better be able to prove it in an audit.

And in a naive relational database implementation, deleting a user would cascade and delete activity associated with that user.

The easiest way around this is to do soft deletes where the data stays in the db, but the flag deactivates the user's account. Looks like the NYT just did a poor implementation of a soft-delete.

perl4ever · on April 28, 2020

I used to work where (not a service for the general public) there was an "is deleted" flag for everything, but every now and then a client would insist that data be really deleted, and depending on who it was and how they asked, we might go and do it, which was a huge hassle and would cause no end of problems down the line.

On the other hand, "is deleted" flags end up causing issues when you forget to put "where not is_deleted" in your queries.

Lately I've faced kind of an inverse situation - I have a system that I can't control where things are permanently deleted once in a while for multiple reasons (rogue users, aging out of old versions) and so as I accumulate information in a little data warehouse for reporting, I decided to implement an "is deleted" flag there. Eventually though, deleting from the source was turned off because it's really not necessary.

Terr_ · on April 28, 2020

I also worked at a similar place, and fantasized about rewriting everything so that soft-deletion wasn't a per-row technical detail, but was instead an explicit modeled business-flow. Perhaps stored as a flag on an Aggregate Root (like "Customer" or "Project") at a much coarser level of detail. Sure, your queries still need to account for it, but at least you don't have a potential patchwork of inconsistent flags.

P.S.: Random advice to anybody working on enterprisey stuff:

1. "Deletion" is too vague and broad. I strongly suggest you call it "deactivation" and some other word like "purging."

2. Deactivation is typically what a company actually wants, even if they don't know to ask for it. By phrasing it that way you also encourage stakeholders to think about "reactivation" before it becomes an architectural problem.

3. True purging is rare, and tends to be related to either disk-space issues or legal requirements. In the latter case, you'll want an audit-trail or tombstone of some sort, meaning it's still a real workflow and not just an easy SQL DELETE statement or something.

lostapathy · on April 28, 2020

The discard gem for rails has put a lot of thought into this and works pretty well, all things considered.

It's not perfect, but I find using a library (either directly or for inspiration) is often a shortcut to learning about a lot of the edge cases you'd otherwise discover by doing your own thing.

https://github.com/jhawthorn/discard

dorgo · on April 28, 2020

>On the other hand, "is deleted" flags end up causing issues when you forget to put "where not is_deleted" in your queries.

My solution would be a view for every table. Are there drawbacks? Other solutions?
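
A minimal sketch of the view-per-table idea, using Python's built-in sqlite3 so it runs anywhere. The table, column, and view names (`users`, `is_deleted`, `active_users`) are made up for illustration, not taken from any real system:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        is_deleted INTEGER NOT NULL DEFAULT 0
    );
    -- Readers query the view, so the filter cannot be forgotten.
    CREATE VIEW active_users AS
        SELECT id, username FROM users WHERE NOT is_deleted;
    INSERT INTO users (username, is_deleted) VALUES ('alice', 0), ('bob', 1);
""")

# The view hides soft-deleted rows; the base table still has them.
active = [row[0] for row in conn.execute("SELECT username FROM active_users ORDER BY id")]
everyone = [row[0] for row in conn.execute("SELECT username FROM users ORDER BY id")]
print(active)
print(everyone)
```

One likely drawback is that this only helps if people actually query the view; anyone with access to the base table can still forget the filter.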

perl4ever · on April 28, 2020

Eh, the situation that I've been in, if I recall correctly, is that one has read/write access to all data for reporting, but not the ability to create views (or stored procedures etc) to share.

Where I am now, (as far as Oracle goes) you can't even create your own tables under your own schema. A view requires a meeting with a DBA and their manager and really special, compelling arguments.

lilyball · on April 28, 2020

Can you just have a general policy where every table that has an is_deleted flag also automatically gets a view, and clients must use the view unless they have a particular need to access deleted data?

capableweb · on April 28, 2020

Not sure about drawbacks but another solution would be to change the point where you do the queries instead. So you have a "user" object that is largely saved the same as most other "crud" objects, so you have a layer for those, add the flag there.

dbbk · on April 29, 2020

Most ORMs have built-in support for this. For example Laravel's Eloquent automatically filters out `deleted_at` records

latch · on April 28, 2020

You can use row level security so that forgetting the filter isn't an issue.

dorgo · on April 28, 2020

as far as I understand this would only work for users who don't have access to "deleted" rows. Are access rules a good way to handle this? Serious question. My solution would be to have views as "guards" for every table.

latch · on April 28, 2020

Views also work.

I don't understand what you mean re "access". RLS is what removes access to those rows, that's the point of RLS.

In postgresql, anyways, superusers aren't subject to RLS and the table owner, by default, isn't either. But RLS can be enforced for the table owner by a single alter statement.

mekster · on April 29, 2020

> would cause no end of problems down the line

That's just how badly the system is designed. Write a script to delete all relevant data and the user and it would be a simple op.

> forget to put "where not is_deleted"

Use views.

netsharc · on April 28, 2020

I've been on both sides of this insisting, if a company annoyed me too much (e.g. headhunters mailing too frequently) I'd drop the "data privacy laws" (nowadays GDPR) bomb and ask for my data to be deleted.

On the other side, a customer got really pissed off by an online shop we maintained for a client, and asked for his data to be annihilated, we thought "What a douche.".

hinkley · on April 28, 2020

And something I had to learn the hard way and then teach quite a few people is that hard deletes don’t just turn your tables into Swiss cheese, they also can cause table scans.

When you delete a row, every inbound foreign key constraint has to be checked to look for any rows that refer to the deleted row, and most likely you didn’t set up an index for the foreign key, so now you have a table scan. Possibly several.

It’s not that much more work these days to set up a partial index on the table instead and add another WHERE clause, you have smaller problems with people accidentally deleting the wrong thing, and you’ve started down the path to audit trails.
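
A rough sketch of both points, again with sqlite3 and hypothetical table and index names: an index on the referencing column so the foreign key check on delete has something to use, and a partial index that covers only live rows for the soft-delete case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,
        deleted_at TEXT            -- NULL means the row is live
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    -- Index the referencing column so FK checks on delete need not scan orders.
    CREATE INDEX orders_user_id_idx ON orders(user_id);
    -- Partial index: only live rows are indexed.
    CREATE INDEX users_live_email_idx ON users(email) WHERE deleted_at IS NULL;
""")
conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com'), (2, 'b@example.com')")
conn.execute("INSERT INTO orders (user_id) VALUES (1)")

# A hard delete has to check every inbound foreign key, and here it is refused.
try:
    conn.execute("DELETE FROM users WHERE id = 1")
    hard_delete_blocked = False
except sqlite3.IntegrityError:
    hard_delete_blocked = True

# The soft delete is a plain UPDATE; queries add one more WHERE clause.
conn.execute("UPDATE users SET deleted_at = '2020-04-28' WHERE id = 1")
live_ids = [row[0] for row in conn.execute(
    "SELECT id FROM users WHERE deleted_at IS NULL ORDER BY id")]
print(hard_delete_blocked, live_ids)
```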

nikisweeting · on April 28, 2020

It's been standard practice at all the companies I've worked at to index all foreign key fields, no matter what they're used for, I've yet to run into a situation where it's been more harmful than helpful, but these companies all had <10TB data in SQL so idk if it's good general advice.

munificent · on April 28, 2020

There is also a user-valuable reason to not do hard deletes. Doing a soft delete prevents another malicious user from immediately reclaiming your now-available ID and pretending to be you.

pinusc · on April 28, 2020

You can always hard delete all the data _and_ keep track of deleted users so that their usernames can't be reused.

Once you have hard delete, this solution is almost trivial and by far the most user-valuable.

DevX101 · on April 28, 2020

> keep track of deleted users so that their usernames can't be reused

This seems to violate GDPR, no? Attacker attempts to create an account (say: victim@gmail.com) on AshleyMadison and is prevented because the server tracked past users. Attacker could then demonstrate victim@gmail.com was at one point a user on AshleyMadison.com

ecnahc515 · on April 28, 2020

As others have mentioned, that's an issue already. The solution is to never acknowledge if a user does or doesn't exist on register/sign-up/forgot-password pages and simply state that instructions have been emailed to you in all cases. The key is that you don't act differently if the user does or doesn't exist.
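
Roughly, in code (a sketch only; `request_password_reset`, `known_emails`, and `outbox` are hypothetical stand-ins for a real handler, user store, and mailer):

```python
GENERIC_REPLY = "If that address can be used, instructions have been emailed to it."


def request_password_reset(email: str, known_emails: set[str], outbox: list[str]) -> str:
    """Return the same reply whether or not the account exists."""
    normalized = email.strip().lower()
    if normalized in known_emails:
        # Stand-in for actually sending a reset email to the account owner.
        outbox.append(normalized)
    return GENERIC_REPLY


outbox: list[str] = []
known = {"victim@example.com"}
reply_existing = request_password_reset("Victim@Example.com", known, outbox)
reply_missing = request_password_reset("nobody@example.com", known, outbox)
print(reply_existing == reply_missing)
```

The caller sees identical text in both cases; only the owner of the mailbox learns anything. A real handler would also need to keep response timing similar in both branches, which this sketch does not attempt.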

cutemonster · on April 29, 2020

If they don't get a verification / instructions email, they'll know an account with the username they typed, existed?

ecnahc515 · on May 12, 2020

In this case, where you're probing for user names or emails, you don't own the email, so you wouldn't receive the verification yourself, and thus wouldn't know if the account exists.

This is exactly why most password reset emails say "if you didn't request this, please let us know, as someone may be attempting to access your account".

lytefm · on May 1, 2020

You shouldn't use usernames in that scenario, just emails. After signup, you just show a general message that a confirmation email has been sent. If the account already exists, some policy to notify the account owner can be put in place.

lostapathy · on April 28, 2020

Verifying the email keeps someone from hijacking the account without leaking that an account formerly existed. At least so long as their email isn't also compromised - in which case they have bigger problems.

RandomBacon · on April 28, 2020

That's not much different than not being able to create an account with victim@gmail.com because victim@gmail.com already has an account. Both instances leak information.

crdrost · on April 28, 2020

You don't have to track their emails unless you are reusing emails as usernames. Just tracking the username suffices.

This is also one of those situations where people often put too much shit in the user table. "we have to delete the user row" -- I mean, you have to delete some of the user row, yes.

I like to solve this by proper namespacing. Suppose you instead deliberately have an authUser table which just has what you need for auth -- a UUID to hook into the rest of the system, salts and passwords for direct logins, maybe a nullable date "banned_until" if you want banning; assuming you use crypto bearer tokens rather than an auth tokens table then you also want a column with a date for "tokens last reset on"; etc. You can put the username in there just fine, that's needed for auth. Maybe you let people log in with email+password and thus you also put their email address in there, also fine.

As long as the authUser table does not grow to encompass all of your other business logic you are good. Other tables foreign key to authUser and you delete rows from them and that doesn't upset the foreign key. You leave the row in authUser to indicate that the username is taken.

An additional "deleted" field on authUser can be used to block logins and thus the username is taken but they can't log in. As for the email address, even if you insist on a UNIQUE and NOT NULL constraint for it (and I would find this surprising in an age where we log in a lot with social media) you can auto purge by setting it to CONCAT(id, "@purged.example") and then you have a valid email address which is nowhere else used in your auth flow, no personally-identifiable information at all. Heck then you don't even need the boolean flag if you would rather forbid the .example TLD from logging in.

So that has worked well for me in the past and it seems to solve those sorts of problems with only a little tweak. The key is that the PII need is to delete the "user row" but that does not have to be the authUser row -- if you separate the two rows out then you can leave the authUser row while still having a table appUser which lives in your application and contains all the cool stuff about this user using that app. It also naturally lends itself to you thinking about a sort of SSO for all of your different applications up-front.
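
To make the split concrete, here is a stripped-down sketch in Python with sqlite3. The table names follow the description above; the column set, the `purge_user` helper, and the sample data are assumptions for illustration, and password columns are left out for brevity. SQLite spells CONCAT as `||`.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE authUser (
        id TEXT PRIMARY KEY,               -- UUID used by the rest of the system
        username TEXT NOT NULL UNIQUE,
        email TEXT NOT NULL UNIQUE,
        deleted INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE appUser (
        auth_id TEXT PRIMARY KEY REFERENCES authUser(id),
        full_name TEXT,
        address TEXT
    );
""")


def purge_user(conn: sqlite3.Connection, auth_id: str) -> None:
    """Delete the app-level row and scrub the email, but keep the username reserved."""
    with conn:  # commit both statements together, or roll both back
        conn.execute("DELETE FROM appUser WHERE auth_id = ?", (auth_id,))
        conn.execute(
            "UPDATE authUser SET deleted = 1, email = id || '@purged.example' WHERE id = ?",
            (auth_id,),
        )


uid = str(uuid.uuid4())
conn.execute("INSERT INTO authUser (id, username, email) VALUES (?, ?, ?)",
             (uid, "jsmith", "jsmith@example.com"))
conn.execute("INSERT INTO appUser (auth_id, full_name) VALUES (?, ?)", (uid, "J. Smith"))
purge_user(conn, uid)
row = conn.execute("SELECT username, email, deleted FROM authUser WHERE id = ?",
                   (uid,)).fetchone()
print(row)
```

After the purge the username still trips the UNIQUE constraint for anyone trying to re-register it, while the email column holds only the row's own id.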

bryanrasmussen · on April 28, 2020

the real GDPR problem is if the user has asked to delete data and you do this soft delete but keep all their old data as well, and then someone hacks your system and gets that data.

You're obligated by GDPR to disclose to affected parties that their data has been compromised, but you were also obligated to delete the data by GDPR.

abiogenesis · on April 28, 2020

Renaming your username to username-1000 has the exact same side effect though.

mcv · on April 28, 2020

Yeah, that is about the worst possible way to do it. If you can't do a hard delete for whatever reason, the right way to do it is to set a flag that prevents any activity on that account. They can keep the name in order to prevent anyone else from stealing it, but still delete all the profile data attached to the account.

ars · on April 28, 2020

Another option is to add an "archive" table, where deleted records are moved to that table.

It can get complex (for example foreign keys need special handling).

One option is table partitioning: https://www.2ndquadrant.com/en/blog/postgresql-12-foreign-ke...

You partition the table based on deleted or not, and then query either both tables together, or just the active table.

jackcosgrove · on April 28, 2020

Partitioning sounds like a great solution, thank you for this.

vincvinc · on April 28, 2020

Soft deletion violates the GDPR.

Read article 17:

https://gdpr-info.eu/art-17-gdpr/

And previous HN discussions:

https://news.ycombinator.com/item?id=16366050

tomwojcik · on April 28, 2020

Yes and no. PII needs to be removed. The rest of the data needs to be anonymized. Right?

Dahoon · on April 28, 2020

But an email is PII so clearly this would breach GDPR.

ericd · on April 28, 2020

Keep a hash instead of the original email address?

cryptonector · on April 28, 2020

Whatever for? Just delete it. Keep the account tombstoned so its name can't be reused. Keep what content you can and want. Delete the PII and any metadata you're contractually and/or legally required to.

DevX101 · on April 28, 2020

Did a glance through that thread and it didn't seem like there was a strong consensus on how to respect GDPR while maintaining historical data for reporting purposes. Any best practices?

jackcosgrove · on April 28, 2020

Having worked in a HIPAA regulated space, I can say that hashing the username, such as the email address, for login purposes can allow for account recovery if the credentials are retained. At the same time the cleartext username and other PII can be stored in an object that is both encrypted at rest for its lifetime, and on top of that has its sensitive fields overwritten upon logical deletion. Account recovery cannot recover non-credential derived PII but that is a small annoyance to the user in order to be compliant and trustworthy. The internal user ID should be used throughout downstream reporting rather than actual PII for the sake of continuity and privacy.
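
A small sketch of the hashed-username lookup, assuming a keyed hash (HMAC-SHA256) with a server-side secret rather than a bare hash, since email addresses are easy to guess and a bare hash can be brute-forced. The function name and the secret are hypothetical, and whether this satisfies any particular regulation is a legal question the code does not answer:

```python
import hashlib
import hmac


def email_lookup_key(email: str, secret: bytes) -> str:
    """Derive a stable lookup key from an email without storing the cleartext."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret, normalized, hashlib.sha256).hexdigest()


SECRET = b"hypothetical-server-side-secret"

# Stored at signup in place of the cleartext address.
stored_key = email_lookup_key("Reader@Example.com", SECRET)

# At login or recovery, recompute the key from what the user typed and compare.
typed_key = email_lookup_key(" reader@example.com", SECRET)
print(hmac.compare_digest(stored_key, typed_key))
```

Downstream reporting would then join on the internal user ID, never on this key or the address itself.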

ska · on April 28, 2020

Yes this is all quite do-able. It is often much easier to implement from the beginning, than after the fact.

clairity · on April 28, 2020

the easier way, assuming neither is a primary key, is to convert the field values to UUIDs, which has the added advantage of anonymizing the data. that's disadvantageous if you want to prevent re-signups though, unless you take other measures.

cryptonector · on April 28, 2020

In general you don't want to delete absolutely everything. For example, usernames should not be reused, so you can't "delete" them -- you can tombstone them though, and you should. Besides tombstoning to prevent reuse, you can and should delete as much associated metadata as you're willing to / contractually or legally required, naturally.

Even what you can delete can (and will) survive in logs and backups, web archives, screenshots, etc. Deleting things on the Internet is just difficult.

nexuist · on April 29, 2020

>If a company states in their investor quarterly report that they had 1M active users, they better be able to prove it in an audit.

Is this even legal? I've never heard of a company letting an outside firm go through their database to confirm any sort of statistic like that. Who is doing this auditing?

chrisshroba · on April 28, 2020

Doesn't doing soft deletes on user data breach GDPR laws regarding deletion of user data when requested?

greendestiny_re · on April 29, 2020

>investor quarterly reports

God forbid the needs of the user harm the interest of the investor.

mesozoic · on April 28, 2020

Pretty sure that breaches GDPR and maybe CCPA. Maybe they can get away with anonymizing the data but that doesn't sound like what is being done.

true_religion · on April 28, 2020

Sure, but a local paper like the New York Times is hardly subject to the laws of the EU, so GDPR doesn’t apply.

6gvONxR4sf7o · on April 28, 2020

It does if they take subscribers from the EU (or California) and apply this process to them. It's incredibly straightforward. If you do business in some jurisdiction, then that business is subject to the jurisdiction's laws.

mopsi · on April 28, 2020

Not only do they take subscribers, they actively target the European market. When I open nytimes.com, a pop-up offers 0,50€/week digital access.

true_religion · on April 29, 2020

Hmm, it seems I was wrong then. I had thought that they didn’t localize to any EU countries, but I guess they have more of a global market than the other papers I am more familiar with.

wtfishackernews · on April 28, 2020

If their website is accessible from the EU then they are subject to GDPR.

kube-system · on April 28, 2020

The EU says that GDPR applies globally.

Some violators might successfully keep their assets out of the reach of EU enforcement, but that's going to be really tough to do for any large business with global operations.

cyberowl · on April 29, 2020

Are soft deletes even legal in the context of privacy laws like GDPR? If I’m writing in to delete my data, I don’t really give a crap how hard it is. I want that permanently wiped, so that even if you wanted to you can’t find it again.

How that messes up your technical implementation is your problem

antsar · on April 28, 2020

> poor implementation

Well that's an understatement. NYT has the tech resources to do this a million better ways.

> investor quarterly report [...] active users

Speaking of "active user" counts, it's convenient that "jsmith1000" is plausibly an active user, whereas "jsmith(state:deleted)" is not. Hm.

sp332 · on April 28, 2020

Hah, Uber did this to me once. Someone signed up with my email and somehow the verification failed. So they started sending me details about someone else's trips! When I complained, instead of deactivating the account or trying to contact the user to find out their actual email, they changed the address on the account to the same thing but with "void" prepended. I have a gmail account and was pretty sure that email didn't exist... Sure enough, I tried registering the new email address with Google and got nothing but Uber spam. Oh well at least they're not sending me trip details anymore.

petercooper · on April 28, 2020

Note that "anonymization" has been legally found (DSB-D123.270/0009) to be acceptable to meet GDPR "erasure" requirements. However, this requires irrevocable overwriting of PII rather than just slapping 1000 on the end ;-) If they'd changed the username and email address to some random string, however, they would most likely be compliant.
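
As a rough illustration of the difference (not a statement of what is legally sufficient), overwriting with a random string might look like the following. The `anonymize_account` helper and the field names are invented for the example:

```python
import secrets


def anonymize_account(account: dict) -> dict:
    """Return a copy with username and email replaced by unrelated random values."""
    token = secrets.token_hex(16)  # 32 hex characters, not derived from the old values
    scrubbed = dict(account)
    scrubbed["username"] = f"deleted-{token}"
    scrubbed["email"] = f"{token}@invalid.example"
    return scrubbed


before = {"id": 7, "username": "jsmith", "email": "jsmith@example.com"}
after = anonymize_account(before)
print(after["username"] != before["username"], after["id"] == before["id"])
```

Unlike appending 1000, nothing in the new values can be reversed to recover the old ones, while the internal id keeps historical rows linked.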

isoskeles · on April 28, 2020

Wouldn't they also need to replace saved billing details, like address and full name, with anonymized garbage?

mschuster91 · on April 28, 2020

In Germany not; these are required by law (§147 AO, https://www.gesetze-im-internet.de/ao_1977/__147.html) to be kept for ten years.

The legal base for allowing this national rule in European law is Art. 6, 1c GDPR.

jkaplowitz · on April 28, 2020

But they'd need to get rid of it in the 11th year, right? And even before then, they'd need to delete some of the earliest records for someone who has subscribed for more than 10 years. Lots of compliance traps remain.

MaxBarraclough · on April 28, 2020

I think something similar happens in the UK. It's generally believed here that laws telling you to retain data take precedence over the GDPR telling you to delete it. (I am very much not a lawyer, as you can doubtless tell.)

_jxdz · on April 29, 2020

Yes. For example, HMRC requires that you keep various business records for 6 years (or longer, circumstance-specific) after the end of the company's financial year.

Generally, the rule is "Delete the data unless there's a law that requires you not to" — and the UK's implementation of the GDPR (the Data Protection Act 2018) makes various explicit exemptions for this.

OutsmartDan · on April 28, 2020

NYT is notoriously the worst at customer service and account handling. I once tried to get a previous invoice from them, and after a week of calling customer support and being passed around, I still wasn't able to get it.

heyoni · on April 28, 2020

I had this chat with their customer service department asking to "cancel" my account so that I don't incur any charges and they insisted that it wasn't possible without losing immediate access and getting a pro-rated refund. I thought that was stupid, but ok...

1 month later, still no refund. Account is still scheduled to be auto-renewed, talk to CS and they're basically ignoring me at this point (I'm using text messaging support on a secondary number), so I issue a chargeback with my credit card...

1 month later, I get a refund, no email, no text message explaining the delay and now I have to deal with that or else they'll probably put my account in collections. /facepalm

jrockway · on April 28, 2020

My credit card number changed and they sent my account to collections with no notice. The lesson I learned was never to subscribe to something on the company's own website; go through Apple.

jahlove · on April 28, 2020

Honestly, the stories I've read about how difficult it is to cancel your NYT account are a contributing factor to why I don't subscribe. I love NYT, but if it's not as easy to cancel as it is to sign up then count me out.

asdff · on April 28, 2020

I just subscribe to the local paper (LA times) and get my national coverage from their reporting. By the time you subscribe to the wapo, nyt, economist, wsj, atlantic, you are paying a huge sum a year on redundant coverage. Better to focus on local issues that are more likely to affect my life than the national soap opera anyway.

paulcole · on April 28, 2020

Still love the story of the Pinboard guy’s approach to invoices. If you ask for an invoice, he sends you a blank one and tells you to just fill it out however you want.

the-dude · on April 28, 2020

In The Netherlands there are legal requirements for invoices.

petercooper · on April 28, 2020

Same in the UK, but for instances where you can't get a fully valid invoice, you can either "self invoice" (so the "fill in your own invoice" approach) or just use whatever receipt you get, as long as you feel happy defending it to a tax inspector later on.

If you use US companies for services in a European business, you are almost certainly going to have invoices every month that don't meet EU regulations at all and you just have to make it work.

paulcole · on April 28, 2020

He said in his tweet about this that when he does it to Europeans (specifically Germans, I think) they just get even madder.

I mean great if there are legal requirements for invoices, but who enforces them, how likely is enforcement, and what’s the end result for a US-based company with no physical presence in The Netherlands?

petercooper · on April 28, 2020

The real issue is with the local (European) company when they claim the expense against profits and the tax inspector turns their nose up at the invoice/receipt.

Being in the UK, we tend to work on a system where things are taken in context and you can defend such decisions. Maybe other tax regimes are more restrictive, but the British way is always that you can have a debate with authorities and usually they will see sense in your reasoning if you're not trying to defraud them.

avree · on April 28, 2020

Just to make this kind of confusing story less confusing—the Pinboard guy prints valid invoices (in order to be legally compliant, and because he's "not a totally evil guy"). Someone (from Germany) asked him to add Company Name to the invoice, and he replied by saying "just edit the HTML to add whatever you need".

https://twitter.com/i/status/1192182812121583617 The actual tweets probably explain it better than I (and the parent comment) can.

paulcole · on April 29, 2020

Ha, the tweets I remember are from years and years ago. Maybe 2015? Really funny that the invoice thing has been such a consistent part of the Pinboard Experience for so long.

Since this predates threads, I can’t find all the tweets but this is one of them:

https://twitter.com/Pinboard/status/558313726844358656

paulcole · on April 28, 2020

That all sounds like the local company's problem right?

If they want to claim the expense they can find a competing company who issues an invoice.

anticensor · on April 28, 2020

The US has them too; they are lower but not nonexistent. A blank invoice is not a valid invoice in the States either.

hanniabu · on April 28, 2020

I'm ashamed I'm finding out about this through an HN comment. Would you happen to have a good source to read?

inopinatus · on April 28, 2020

I’d bet (a small amount of) money they have no ability to delete accounts at all, and it goes all the way down to foreign key constraints introduced by a well-meaning but inexperienced developer that unnecessarily couple the accounts table to many other records.

koheripbal · on April 28, 2020

Regardless of the constraint on the key, the design fact remains that deleting a user record that might, for example, have associated transaction data (like subscription payments) is a little complex.

You don't want to cascade that deletion to a record of credit card charges, but you also need to make sure that all queries respect that the user record might now be deleted - ie make it an outer-join.

It's far more robust to add an active/inactive field.

The longer an organization has been around, the more interconnected the database is, and the more consequent changes are needed to accommodate a core database change.

I wouldn't be surprised if many, many organizations have some hack like this under the covers.

nitrogen · on April 28, 2020

Do active/inactive fields comply with data privacy laws like those in California and the EU?

akx · on April 28, 2020

IANAL, but if you anonymize (not just pseudonymize) the PII from the user table (and others) I think so, yeah.

emodendroket · on April 28, 2020

In that case it seems simpler to introduce an IsDeleted flag than to have a convention that 1000 goes on the end of the name.

Bjartr · on April 28, 2020

Someone wanted to get this done without waiting on the DBA team so they rolled some implicit schema instead.

neovive · on April 28, 2020

This reminds me of a good use case for the Laravel Soft Deletes: https://laravel.com/docs/7.x/eloquent#soft-deleting.

marcofatica · on April 28, 2020

love this feature so much

nunez · on April 28, 2020

I'm learning about database design, and I'm learning that this might not be that easy if the relationships between users and other data on the site are ill-defined. There may be multiple tables in their design that will require knowledge of that flag, and it might legitimately be way easier to just add junk to the end of the user account than it is to introduce a new flag.

0xCMP · on April 28, 2020

Sure, but that just covers up the lie better. Maybe a flag "ToBeDeleted", and any still-logged-in sessions see that the account is scheduled for deletion.

asdfman123 · on April 28, 2020

> it goes all the way down to foreign key constraints introduced by a well-meaning but inexperienced developer that unnecessarily couple the accounts table to many other records.

You make it sound like developers doing things the wrong way is the exception instead of the norm.

Good developers don't, but every place I've worked at has a few chunks of the software by people who didn't know or care enough to do things the right way.

throwaway55554 · on April 28, 2020

> I’d bet (a small amount of) money they have no ability to delete accounts at all

Probably not. I bet that goes for so very many organizations. It is far easier to reinstate an accidentally (user) deleted account if it is merely marked as deleted than if it were truly deleted.

Now, how NYT is handling this is ... well, it's bad.

calvinmorrison · on April 28, 2020

probably the fact that a good portion of account deletes are users doing something they want to undo in the next few days. Soft deletes make this much easier, and then going back and deleting anything older than X days is simpler than expensive customer support tickets of "how do i login it says my account is deleted"
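
A sketch of that soft-delete-then-purge flow, assuming a hypothetical `deleted_at` column holding Unix seconds and an arbitrary 30-day value for "X days". The caller passes `now` explicitly so the behaviour is easy to check.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(days=30)  # assumed value for "X days"


def make_accounts_db() -> sqlite3.Connection:
    """Create a hypothetical users table with a nullable deleted_at timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL,
            deleted_at INTEGER  -- Unix seconds; NULL means not deleted
        );
        INSERT INTO users (id, email) VALUES (1, 'a@example.com');
        INSERT INTO users (id, email) VALUES (2, 'b@example.com');
        """
    )
    return conn


def soft_delete(conn: sqlite3.Connection, user_id: int, now: datetime) -> None:
    """Record the deletion request; the row stays until the grace period ends."""
    conn.execute(
        "UPDATE users SET deleted_at = ? WHERE id = ?",
        (int(now.timestamp()), user_id),
    )
    conn.commit()


def restore(conn: sqlite3.Connection, user_id: int) -> None:
    """Undo a deletion request that is still inside the grace period."""
    conn.execute("UPDATE users SET deleted_at = NULL WHERE id = ?", (user_id,))
    conn.commit()


def purge_expired(conn: sqlite3.Connection, now: datetime) -> int:
    """Hard-delete rows whose grace period has elapsed; return how many went."""
    cutoff = int((now - GRACE_PERIOD).timestamp())
    cur = conn.execute(
        "DELETE FROM users WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


def remaining_ids(conn: sqlite3.Connection) -> list:
    """List the ids of rows that still exist."""
    return [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
```

In practice `purge_expired` would run from a scheduled job, and the purge would also have to cover whatever other tables hold the user's data.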

londt8 · on April 28, 2020

Why not just anonymize the data instead? Not using foreign key constraint just for the sake of GDPR sounds weird.

unicornfinder · on April 28, 2020

That and why not allow null for the foreign key constraint and set it to nullify upon deletion? Or indeed, anonymise data.
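
For reference, the nullable foreign key idea can be sketched with SQLite's `ON DELETE SET NULL`. The `users` and `comments` tables are made up for the example; note that SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3


def make_comments_db() -> sqlite3.Connection:
    """Create hypothetical users/comments tables linked by a nullable foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite ignores foreign key clauses unless this is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL
        );
        CREATE TABLE comments (
            id INTEGER PRIMARY KEY,
            user_id INTEGER REFERENCES users(id) ON DELETE SET NULL,
            body TEXT NOT NULL
        );
        INSERT INTO users (id, email) VALUES (1, 'a@example.com');
        INSERT INTO comments (user_id, body) VALUES (1, 'hello');
        """
    )
    return conn


def hard_delete_user(conn: sqlite3.Connection, user_id: int) -> None:
    """Really delete the user; dependent rows keep their content but lose the link."""
    conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
    conn.commit()


def all_comments(conn: sqlite3.Connection) -> list:
    """Return (user_id, body) for every comment."""
    return conn.execute("SELECT user_id, body FROM comments ORDER BY id").fetchall()
```

After `hard_delete_user(conn, 1)` the comment row is still there with `user_id` set to NULL, so every reader of `comments` has to cope with an author that no longer exists.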

sopooneo · on April 28, 2020

I have used that pattern in my apps and found it works well. But then I've watched videos from respected DB experts, from whom I learn a great deal, where they practically beg you to stop using nullable columns.

So I'm torn, because I think there may just be a major problem I've not yet grown my apps big enough to suffer. Anyone have thoughts either way?

dmux · on April 28, 2020

Any particular videos you'd recommend?

cptskippy · on April 28, 2020

> Not using foreign key constraint just for the sake of GDPR sounds weird.

That isn't what he said.

He said their inability to delete an account is due to poor design of their data schema by a well intentioned developer.

ShakataGaNai · on April 28, 2020

A lot of companies do this. My buddy found out that he had two EA accounts, so he asked nicely for them to merge them into one and they "did". Well what they actually did was grant the games to the new account and "delete" the old account.

Of course "delete" meant rename it from username@custom.tld to usernameDELETED@custom.tld. He owns the entire domain (and has catchall) so he got the notification of the changed email to the new email address.

Now he has two EA accounts with all the games on both!

type0 · on April 28, 2020

> Now he has two EA accounts with all the games on both!

oh what a great service

gentleman11 · on April 28, 2020

It’s odious that it’s impossible to delete accounts except in rare circumstances. Ever try? All anybody does is temporarily disable them unless you go through an hour with their tech support. Dark pattern at best, holding on to your data forever to continue selling it at worst

throwaway55554 · on April 28, 2020

Nowadays it is a dark pattern. But in the days before "sell everything you can about your users" became the rule, businesses optimized for the accidental deletion by users. They could easily reinstate you and your data would still be there. It used to be considered good customer service. Times change.

viklove · on April 28, 2020

You know HN doesn't let you delete your account, right? Even if you send them an email, apparently they're "too backed up" to handle account deletion requests.

Traster · on April 28, 2020

It's things exactly like this that were a damn good reason the EU implemented GDPR.

tobr · on April 28, 2020

No information on how they found out? Did the service rep tell them?

lubblig · on April 28, 2020

From the Twitter replies: "the number was appended to local-part, not the domain. I found out by going back to a tab where my session was still valid but the account dropdown had updated with the new name. Profile settings revealed the email." https://twitter.com/bicycult/status/1255122953798328320

avian · on April 28, 2020

One possible way I can think of is they're hosting their own mail server and route messages for all unknown addresses to some mailbox (not that uncommon as far as I know - people do that to avoid bouncing mails with a typo in the address).

With such a setup it would be possible to notice that NYT mail started coming to foo1000@... instead of foo@... after requesting account deletion. Username change could also be evident directly from the mail, or it was a trivial guess.

heyoni · on April 28, 2020

He’s saying they append 1000 to the user and domain.

/edit I was wrong. It’s the user and email. Your theory would work

gruez · on April 28, 2020

>/edit I was wrong. It’s the user and email. Your theory would work

Why? My interpretation of "append 1000 to email" is that 1000 gets added to the very end, not the local part.

heavenlyblue · on April 28, 2020

Just have a friend working at NYT who tells you how NYT actually manages account deletions while having a pint; then try to delete your account and check whether the emails come to a new email. Write an article about it.

ericol · on April 28, 2020

They are not alone in this practice. I still receive emails from sitepoint to my email (As recently as March 10th), and the user name finishes in _DELETED (Not kidding).

ornornor · on April 28, 2020

Netflix does that too. You can’t delete your account so they just append a string like “csr_morgan” in the domain so that your account is “deleted” (you can’t login anymore, because your email address technically doesn’t have an account anymore) and you can re-register with your email later if you wish.

But if you use the altered email and the same password, everything is still there.

Pretty sure this goes against GDPR but I was totally unsuccessful at getting my account deleted.

Nextgrid · on April 28, 2020

> Pretty sure this goes against GDPR but I was totally unsuccessful at getting my account deleted.

Did you report this to your local privacy regulator (the ICO in the UK for example)? Not saying they'll do anything (I guess the "4% of global turnover" fines aren't enough to motivate them) but at least there's a record of it, and if nothing else, a proof of how useless the whole regulation is.

thewebcount · on April 28, 2020

What happens when you re-register with the same email address and then later cancel again?

DonHopkins · on April 28, 2020

WP:BEANS

https://en.wikipedia.org/w/index.php?title=Wikipedia:BEANS

Uh-huh

https://en.wikipedia.org/w/index.php?title=Wikipedia:BEANS/U...

ornornor · on April 28, 2020

I don't know... I'd be curious to find out but I honestly don't have the time or motivation to figure out their broken processes.

tomjakubowski · on April 28, 2020

How did you find this out?

ornornor · on April 28, 2020

Was still logged in when they made the change, so I could see which email they changed it to.

a_t48 · on April 28, 2020

What happens if I create an account at both foo@gmail.com and foo1000@gmail.com (or even foo+@gmail.com and foo+1000@gmail.com), and then delete the first one?

dylz · on April 28, 2020

1001

/s

basicplus2 · on April 28, 2020

"instead of actually deleting it, they simply appended '1000' to both the username and the email address. anyone could thus create an email address with that suffix and request a password request to access my info."

polote · on April 28, 2020

not an issue as long as there is no tld which ends with 1000 though

diggan · on April 28, 2020

Most likely the 1000 is appended to the local-part of the email address, not the domain, as any tool they are using for changing details most likely validates emails somehow.

yaur · on April 28, 2020

The email address doesn’t really need to be valid though. I have an old client that appended ‘|disabled’ after the email address (and torched the password) when “deleting” accounts because they needed them in the DB for audit logging.

Unless someone figures out how to register a domain ending in ‘.com|disabled’ I’m not sure how someone would be able to access those accounts.

thanksforfish · on April 28, 2020

Every week we see multiple articles about security researchers who abuse some part of the tech stack to do something weird that shows the danger in this sort of thinking.

I believe it's easy to spoof emails from the .com|disabled domain. Receiving messages, I agree, seems harder. Maybe spoof an unencrypted DNS response at the right moment? No need to actually register a domain when DNS is spoofable.[1]

If you really need to use a hack like that to disable an email, consider adding some code to your email sending logic that skips such email addresses (and always use that logic). Otherwise clever hackers have a foothold to try their tricks against.
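
A sketch of that guard, assuming the `|disabled` marker mentioned upthread. `send_email` and the `transport` callable are hypothetical stand-ins for whatever actually hands mail to an SMTP client; the point is only that every send goes through one function that refuses munged addresses.

```python
from typing import Callable

DISABLED_MARKER = "|disabled"  # marker taken from the comment upthread


def is_disabled_address(address: str) -> bool:
    """True if the stored address carries the in-band 'disabled' marker."""
    return address.endswith(DISABLED_MARKER)


def send_email(
    address: str,
    subject: str,
    body: str,
    transport: Callable[[str, str, str], None],
) -> bool:
    """Single choke point for outgoing mail; returns False when the send is skipped.

    `transport` is a hypothetical callable that performs the real delivery.
    """
    if is_disabled_address(address):
        return False  # never let a disabled address reach DNS or SMTP
    transport(address, subject, body)
    return True
```

Usage: pass a function that wraps the real mailer as `transport`, and route all outgoing mail (including password resets) through `send_email` so the check cannot be bypassed by a forgotten code path.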

[1] https://en.m.wikipedia.org/wiki/DNS_spoofing

waltpad · on April 29, 2020

My guess would be that they don't want to have that email accidentally used, but they would have a check in the code path anyway, because no one wants to see their logs spammed with a myriad of DNS errors when this can be avoided. And in fact, if a DNS error shows up in the logs, devs would know that somehow their code path is not completely safe, so that change to the email is perhaps a way for them to ensure that the disabled account is indeed seen as disabled by their code in every situation.

A lot of people are complaining about NYT's approach, but perhaps their only fault - if one considers that not deleting an account for good is not an issue, and it seems to be a common practice in the industry - is not using a transaction when disabling user accounts (disable email -> disable account), which is perhaps difficult with NoSQL setups?
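
With a relational store, the "disable email -> disable account" sequence could be wrapped in one transaction along these lines. This is only a sketch with made-up `users` and `sessions` tables; the session cleanup step is an added assumption, included because the stale session is how the change was noticed. In Python's `sqlite3`, using the connection as a context manager commits on success and rolls back if any statement raises.

```python
import sqlite3


def make_login_db() -> sqlite3.Connection:
    """Create hypothetical users and sessions tables with one logged-in user."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL,
            is_active INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE sessions (
            token TEXT PRIMARY KEY,
            user_id INTEGER NOT NULL
        );
        INSERT INTO users (id, email) VALUES (1, 'foo@example.com');
        INSERT INTO sessions (token, user_id) VALUES ('tok-1', 1);
        """
    )
    return conn


def disable_account(conn: sqlite3.Connection, user_id: int) -> None:
    """Apply every step of the disable sequence atomically, or none of them."""
    with conn:  # commit if the block succeeds, roll back if it raises
        # Step 1: disable the email (the munging convention from the thread).
        conn.execute(
            "UPDATE users SET email = email || '|disabled' WHERE id = ?",
            (user_id,),
        )
        # Step 2: disable the account itself.
        conn.execute("UPDATE users SET is_active = 0 WHERE id = ?", (user_id,))
        # Step 3 (assumption): drop live sessions so no tab sees a half-disabled state.
        conn.execute("DELETE FROM sessions WHERE user_id = ?", (user_id,))


def user_state(conn: sqlite3.Connection, user_id: int) -> tuple:
    """Return (email, is_active, live session count) for one user."""
    email, is_active = conn.execute(
        "SELECT email, is_active FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    live = conn.execute(
        "SELECT COUNT(*) FROM sessions WHERE user_id = ?", (user_id,)
    ).fetchone()[0]
    return (email, is_active, live)
```

If any of the three statements fails, the earlier ones are rolled back, so the account is never left with a munged email but an active flag.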

jrockway · on April 28, 2020

In-band signalling has always been error-prone. It's never worked and it never will.

bagacrap · on April 28, 2020

"the number was appended to local-part, not the domain. I found out by going back to a tab where my session was still valid but the account dropdown had updated with the new name. Profile settings revealed the email."

hobs · on April 28, 2020

Completely wrong, don't just munge someone's email and hope it won't work, we literally have domains for this kind of stuff. https://www.iana.org/domains/reserved
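
One way to act on that, sketched under the assumption that the system insists on a non-empty, unique email column: replace the address with a random one under the `.invalid` top-level domain, which RFC 2606 reserves so that it never resolves. The `accounts` label and the `tombstone_address` helper are invented for the example.

```python
import uuid

RESERVED_TLD = "invalid"  # reserved by RFC 2606; guaranteed never to be delegated


def tombstone_address() -> str:
    """Build a unique placeholder address that cannot receive mail.

    The random local part avoids collisions between deleted accounts and
    carries no trace of the original address.
    """
    return f"deleted-{uuid.uuid4().hex}@accounts.{RESERVED_TLD}"
```

Unlike appending `1000`, the result cannot collide with a real user's address and does not preserve the original one inside the munged value.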

gruez · on April 28, 2020

Chances of com1000 being delegated are low.

thanksforfish · on April 28, 2020

Why would it need to be delegated? Email delivery depends on DNS, which is unencrypted and spoofable. Spoofing emails is also doable.

loopdoend · on April 28, 2020

If the dns lookup on a com1000 domain fails, the email won’t go anywhere.

thanksforfish · on April 28, 2020

Correct. But if a DNS response for that domain is spoofed, it will.

DNS is a very old protocol that still has lots of problems and mitigations like DNSSEC are only partially deployed.

gruez · on April 28, 2020

Email is an inherently insecure medium. If an attacker can compromise the DNS responses for your mail server, you're hosed.

Fjolsvith · on April 28, 2020

Not surprising from the Times. They don't like to publish retractions.

welcome_dragon · on April 28, 2020

Bazinga

hinkley · on April 28, 2020

Is it also possible that since they have a subscription model, they have to build their system around people leaving and coming back?

I mean imagine if HBO had deleted accounts during GoT and Westworld instead of suspending them. How many people had a 9 month subscription per year for years at a time?

jbverschoor · on April 28, 2020

Many companies do this

yepthatsreality · on April 28, 2020

Yes, especially in New York. I once asked the Curb cab app to delete my account. They replied with an email that they did and I checked the app. My session was still valid and I could see my account details. All they did was change my email address domain to @aol.com and 555’d my phone number.

5cott0 · on April 28, 2020

Gives a different meaning to The Privacy Project.

suizi · on April 28, 2020

How ironic considering they were pointing their fingers at "Big Tech" for not being "GDPR compliant".

bahna · on April 28, 2020

Xbox Live used to have a similar thing, may have changed since GDPR. When I asked for my account to be deleted they sent me instructions that said basically unfriend everyone I know and change my name to deleted - this was after several days of them looking into it for me :/

throwaway882321 · on April 28, 2020

Why 1000? Probably arbitrary but I can't help but wonder why 1000 instead of the faster 1111 or 1234, perhaps someone using NumPad?

srg0 · on April 28, 2020

Because the change was done manually.

Their system probably doesn't implement a function to delete an account, or it is not easily discoverable in the UI, or it is known to be bugged. So the employee who did that thought that renaming an account was a good idea to make it appear "missing". A big numeric suffix is an obvious idea to avoid collisions with the existing and future accounts.

But when people are asked to choose a large number, numbers around 1000 and its multiples are chosen particularly often. So 999, 1000 and 1001 are very likely numbers to be picked "randomly". I can't find a reference for that, but I suppose we all have enough anecdotal evidence. Just recall the common port numbers of various programs. X*1000 + Y is a very common formula, where |Y| < 100.

willis936 · on April 28, 2020

I really doubt someone who thought the solution to being unable to delete an account is to change the account name put this much thought into what to change the account name to.

isoprophlex · on April 28, 2020

https://en.m.wikipedia.org/wiki/Benford%27s_law

Maybe this is what you're looking for?

recursive · on April 28, 2020

That's about numbers from "real-life" distributions, so that's a different thing.