The majority of incidents I encounter with this bug occur specifically because of the complexity introduced with abstracted query building via ORMs and DBALs. Developers assume their configuration, such as appending a "deleted = false" clause, applies to every query without manually verifying each one. Yes, it's technically the developers' fault for not understanding how and when the abstraction kicks in, but that doesn't mean it's "simple", or that these cases are avoided.
I've seen abstraction layers where it's impossible to add a default clause to every SELECT query for a model. I've seen other abstractions where "AND deleted = false" can be automatically added to every SELECT query. I've also seen abstractions where that clause is added to all SELECT, UPDATE, and DELETE queries.
Here's a list of problems:
a) Developers bypassing the model, executing a complex JOIN that includes the table in question, and forgetting they need the "deleted = false". Most complex queries wind up being written as raw SQL or a parsed variant, that never executes the model behavior to append the "AND deleted = false" clause. Is it "wrong" to bypass the model? Most of the time, yes! But it happens every day. We're talking about what happens in reality, not what should happen in an ideal fantasy world.
b) Developers missing the case where they should be including soft-deleted rows. When the abstraction layer enforces a "deleted = false" on every query, it can be difficult or impossible to force backtracking to include soft-deleted rows. Back in the MyISAM days (before foreign keys), I found an "account deletion" mechanism that executed a "DELETE FROM messages WHERE userid = :userid AND deleted = false" - soft-deleted rows were not deleted when required, because an abstraction layer excluded soft-deleted rows in a DELETE query and the original developer never noticed.
c) What happens with UPDATE and DELETE queries? I've seen abstractions that only append the soft-delete mechanism to SELECTs, and others that also affect UPDATEs and DELETEs. Again, should every developer on a codebase understand in which situations the abstraction kicks in? Yes. The fact is they don't, because abstractions inherently make developers not inspect the behavior of their code as deeply as they should.
I don't remember soft-deletes being an issue at all - literally non-existent - 10 years ago, when all SQL queries were typed out by hand. When you're forced to write the query yourself, you have time to think about what you are doing. When you delegate the majority of the task to an abstraction layer that magically modifies your queries on the fly, bad things happen. The most stable and maintainable code base I ever worked on had every single query in XML files. It sounds tedious and bloated, as if it's a joke about the "old days", but every query was located somewhere where it could be analyzed, and you actually had to use your brain when writing a new query. I've seen nothing but misery since the introduction of abstracted ORMs and DBALs, where the only way you ever see the queries being executed is in debug dumps and logs.
>> competence I expect from a junior web developer
Sadly, more than half of the senior developers I've met can't handle soft-deletes properly. So no, in the real world, this cannot be expected of junior developers.
I've seen abstraction layers where it's impossible to add a default clause to every SELECT query for a model. I've seen other abstractions where "AND deleted = false" can be automatically added to every SELECT query. I've also seen abstractions where that clause is added to all SELECT, UPDATE, and DELETE queries.
Here's a list of problems:
a) Developers bypassing the model, executing a complex JOIN that includes the table in question, and forgetting they need the "deleted = false". Most complex queries wind up being written as raw SQL or a parsed variant, that never executes the model behavior to append the "AND deleted = false" clause. Is it "wrong" to bypass the model? Most of the time, yes! But it happens every day. We're talking about what happens in reality, not what should happen in an ideal fantasy world.
b) Developers missing the case where they should be including soft-deleted rows. When the abstraction layer enforces a "deleted = false" on every query, it can be difficult or impossible to force backtracking to include soft-deleted rows. Back in the MyISAM days (before foreign keys), I found an "account deletion" mechanism that executed a "DELETE FROM messages WHERE userid = :userid AND deleted = false" - soft-deleted rows were not deleted when required, because an abstraction layer excluded soft-deleted rows in a DELETE query and the original developer never noticed.
c) What happens with UPDATE and DELETE queries? I've seen abstractions that only append the soft-delete mechanism to SELECTs, and others that also affect UPDATEs and DELETEs. Again, should every developer on a codebase understand in which situations the abstraction kicks in? Yes. The fact is they don't, because abstractions inherently make developers not inspect the behavior of their code as deeply as they should.
I don't remember soft-deletes being an issue at all - literally non-existent - 10 years ago, when all SQL queries were typed out by hand. When you're forced to write the query yourself, you have time to think about what you are doing. When you delegate the majority of the task to an abstraction layer that magically modifies your queries on the fly, bad things happen. The most stable and maintainable code base I ever worked on had every single query in XML files. It sounds tedious and bloated, as if it's a joke about the "old days", but every query was located somewhere where it could be analyzed, and you actually had to use your brain when writing a new query. I've seen nothing but misery since the introduction of abstracted ORMs and DBALs, where the only way you ever see the queries being executed is in debug dumps and logs.
>> competence I expect from a junior web developer
Sadly, more than half of the senior developers I've met can't handle soft-deletes properly. So no, in the real world, this cannot be expected of junior developers.