Hacker News new | past | comments | ask | show | jobs | submit login

That bug is extremely common, and the source is always the use of soft-deletes in the database. When you view the list of items (ex: inbox), the database query includes a "WHERE deleted = false" to exclude rows which have been soft-deleted. When viewing a single item (ex: message) the URL contains a unique identifier, whether an auto-increment integer, UID, etc. The query used to load one item is "WHERE id = :id" instead of the correct "WHERE id = :id AND deleted = false".

Managing soft-deletes on a database table requires an attention to detail, with every single query ever touching that table, that many developers lack the discipline to handle. Discipline aside, it's difficult for every developer on a team to remember which tables use soft-delete, and when checking that flag is or is not necessary. Finally, ORM abstractions often automate soft-delete in such a way that makes it exhausting for developers to validate every query. I've seen this bug over and over again at every company I've worked for. Happens so often it's impossible to keep count.




> Managing soft-deletes on a database table requires an attention to detail.

> Discipline aside, it's difficult for every developer on a team to remember which tables use soft-delete, and when checking that flag is or is not necessary.

That's the case where instead of "try harder not to make mistakes", you design a system so it is not possible to make them. One way would be to rename original table `raw_messages` and `create view messages as select * from raw_messages where not deleted`.


One of the problems with ORMs is that because it lets people forget about the annoying details of their databases, it also makes the forget the useful details of their databases.


Damn you! I wrote a bloody essay in a reply[1] to explain, in superfluous detail, what you summarized in one sentence. Anyone with basic knowledge of the topic would know what you mean. I need to figure out this magic people like you possess. I'm tired of rambling, when nobody will read it. Thank you for the incentive to improve.

[1] https://news.ycombinator.com/item?id=14374031


That's the kindest spontaneous compliment I've received in a while. Thank you. But: while the pithy comment might farm more imaginary internet points, the essay may actually teach a lesson to the person who doesn't yet get it.

As for writing: it's not magic, but for me it's not consciously applied processes either. If I had to guess how my earlier comment came about, I'd suggest something like this as a generative process:

1. Find two effects with a common cause (provided upthread). 2. State each effect, sharing words and rhythm to bring out contrast. 3. Omit needless words. (Thanks, Strunk/White!)

HTH.


I don't disagree that this is common, but it doesn't require special skills or extra attention to detail. Your code to load the 'thing' should always use a Loader (or whatever you choose to call it) - some abstraction that loads the thing and checks for permission. It's really not that hard, and it's the bare minimum of competence I expect from a junior web developer. If you have littered your code with SQL statements, you're almost certainly doing it wrong.


The majority of incidents I encounter with this bug occur specifically because of the complexity introduced with abstracted query building via ORMs and DBALs. Developers assume their configuration, such as appending a "deleted = false" clause, applies to every query without manually verifying each one. Yes, it's technically the developers' fault for not understanding how and when the abstraction kicks in, but that doesn't mean it's "simple", or that these cases are avoided.

I've seen abstraction layers where it's impossible to add a default clause to every SELECT query for a model. I've seen other abstractions where "AND deleted = false" can be automatically added to every SELECT query. I've also seen abstractions where that clause is added to all SELECT, UPDATE, and DELETE queries.

Here's a list of problems:

a) Developers bypassing the model, executing a complex JOIN that includes the table in question, and forgetting they need the "deleted = false". Most complex queries wind up being written as raw SQL or a parsed variant, that never executes the model behavior to append the "AND deleted = false" clause. Is it "wrong" to bypass the model? Most of the time, yes! But it happens every day. We're talking about what happens in reality, not what should happen in an ideal fantasy world.

b) Developers missing the case where they should be including soft-deleted rows. When the abstraction layer enforces a "deleted = false" on every query, it can be difficult or impossible to force backtracking to include soft-deleted rows. Back in the MyISAM days (before foreign keys), I found an "account deletion" mechanism that executed a "DELETE FROM messages WHERE userid = :userid AND deleted = false" - soft-deleted rows were not deleted when required, because an abstraction layer excluded soft-deleted rows in a DELETE query and the original developer never noticed.

c) What happens with UPDATE and DELETE queries? I've seen abstractions that only append the soft-delete mechanism to SELECTs, and others that also affect UPDATEs and DELETEs. Again, should every developer on a codebase understand in which situations the abstraction kicks in? Yes. The fact is they don't, because abstractions inherently make developers not inspect the behavior of their code as deeply as they should.

I don't remember soft-deletes being an issue at all - literally non-existent - 10 years ago, when all SQL queries were typed out by hand. When you're forced to write the query yourself, you have time to think about what you are doing. When you delegate the majority of the task to an abstraction layer that magically modifies your queries on the fly, bad things happen. The most stable and maintainable code base I ever worked on had every single query in XML files. It sounds tedious and bloated, as if it's a joke about the "old days", but every query was located somewhere where it could be analyzed, and you actually had to use your brain when writing a new query. I've seen nothing but misery since the introduction of abstracted ORMs and DBALs, where the only way you ever see the queries being executed is in debug dumps and logs.

>> competence I expect from a junior web developer

Sadly, more than half of the senior developers I've met can't handle soft-deletes properly. So no, in the real world, this cannot be expected of junior developers.


It's issues like this that really highlight the benefits of shuffling deleted data to a separate archive table through triggers, or leveraging temporal tables. It may not necessarily be as efficient as maintaining a flag, but it dryastically reduces the mental overhead placed on users of the database.


OTOH from a pro user pov deleted means removed and not made inaccessible. If someone request stuff to be removed, imo it should be removed from archives as well.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: