This is a great description of things that seem like they shouldn't be so difficult but can have many complications. The SQL part seems to double-down on a conceptual failure rather than demystifying it though.
A query's logic is declarative which defines the output. It's the query plan that has any sense of execution order or procedural nature to it. That's the first thing to learn. Then one can learn the fuzzy areas like dependent subqueries etc. But being able to see the equivalence between not-exists and an anti-join enables understanding and reasoning.
Using an analogy such as procedurally understanding of written queries only kicks the can further down the road, then when you're really stuck on something more complicated have no way to unravel the white lies.
> The SQL part seems to double-down on a conceptual failure rather than demystifying it though.
She talked about a mental model to help her understand the query (it can be useful), and mentioned that it probably is not how the database actually processes the query.
My point is that there should be two mental models. One for getting the correct results. Then another for doing so performantly. Being able to write many different forms of obtaining the same correct results is where this leads to combined understanding and proficiency.
An example of where muddling these ends up with real questions like "how does the db know what the select terms are when those sources aren't even defined yet?" By 'yet' they mean lexically but also procedurally.
I suspect that Julia is solely using the first kind of mental model (getting the correct result), and completely ignoring query planning. But even this model has an order to it! Three examples of how this order can manifest, that should all agree with each other:
1. The explanatory diagrams that Julia drew for the talk. These wouldn't make sense if they were in a different order.
2. The order of operations you would perform if you developed a proof of concept SQL implementation that completely ignored performance. In this example the order would be: "cats, filter, group, filter, map, sort". This is exactly the order that Julia's explanation showed.
3. The relational logic expression for this query. There should be a correspondence between this expression and this ordered list of operations, though it's somewhat annoying to state. I think it's that, assuming all the operators in the relational logic expression are binary, if you reverse the order that a subset of the operators are written in, then the operators in the tree occur in the same order as the ordered list of operations. (I don't actually know relational logic, so I'm making a prediction here. This prediction is falsifiable: you can't put the operators in the tree in an arbitrary order.)
(Side note: the order isn't completely fixed. The last two steps --- SELECT and ORDER BY --- could happen in either order.)
A query's logic is declarative which defines the output. It's the query plan that has any sense of execution order or procedural nature to it. That's the first thing to learn. Then one can learn the fuzzy areas like dependent subqueries etc. But being able to see the equivalence between not-exists and an anti-join enables understanding and reasoning.
Using an analogy such as procedurally understanding of written queries only kicks the can further down the road, then when you're really stuck on something more complicated have no way to unravel the white lies.