I lost interest once I came to the words "SQL Server", which I haven't used for a couple of decades. Unfortunately, those words didn't crop up until the last sentence of the article...
There is no reason to mention SQL Server for most of the article, the issue being discussed is relevant to all (or at least all common) SQL implementations. The query pattern is flawed in a fundamental way, or the table structure necessitating it is, rather than this being a SQL Server specific trap.
The only reason the product is specifically mentioned in that last part is because it mentions an optimisation that is possibly specific to SQL Server's query planner.
If you have such an allergy to the product that you can't process general advice from someone who uses it, there are clues you could have triggered your anaphylaxis earlier and saved you time: t-sql is mentioned in the site's side-bars, as is “Learn SQL Server” for that matter.
I'm not allergic to SQL Server. I didn't study the example code; I was just curious about why DISTINCT was a problem. TL;DR: for some query planners, DISTINCT requires a table sort, which requires a full table scan, which can be avoided by using a construct (GROUP BY) that gives the planner hints.
But I assumed that was dependent on the query planner; Oracle SQL, as I recall, has/had specific constructs for giving hints to the query planner.
> But I assumed that was dependent on the query planner;
Problems that can be “fixed” with DISTINCT but really shouldn't, an accidental cross-join being one I've seen many times, be are common to many (all?) SQL implementations and I don't think they are treat much better by any of them (they are generally mistakes, so prioritising optimising for them is not a good use of the query planner engineering team's time).
> Oracle SQL, as I recall, has/had specific constructs for giving hints to the query planner.
SQL Server has index hints much like Oracles, along with locking hints (which I think Oracle has similarly), and so forth, which can tweak the query planners behaviour. They are best avoided if other options are available because:
• they often only work well for a given data pattern, so you are just pushing the problem into the future when your data size has grown noticeably (in this context they are valuable as a temporary workaround, just make sure you treat them as such and work to remove the need)
• they potentially block optimisations that future versions of the query planner/engine can perform by forcing it a different way around
• sometimes they make explicit reference to objects (indexes usually), tightly coupling their names to things that others may not expect making accidental breaking changes more likely
Postgres doesn't have the same sort of index hints IIRC, but does have options that constrain the query planner in similarly to try make it prefer a particular path and/or avoid another.
Yeah, thanks. I get all that; avoid if possible any proprietary languange constructs.Try to tune the query and the schema so that no sane optimizer would do the wrong thing.
But the last time I tangled with Oracle, I was working for a bank; schema changes took months to approve, and the DBAs were GHODZ. And there was no chance of the bank changing suppliers anytime soon. And anyway, EVERYTHING was proprietary. They had their own proprietary stored-procedure language, and it wasn't dispensable - I believe people bought Oracle because of their stored procedure setup.
To be clear, I'm not in any sense advocating hinted SQL!