Hacker News new | past | comments | ask | show | jobs | submit login

The query optimizer improvements are pretty cool too.

Even though, I don't really understand that one : "The IN-early-out optimization: When doing a look-up on a multi-column index and an IN operator is used on a column other than the left-most column, then if no rows match against the first IN value, check to make sure there exist rows that match the columns to the right before continuing with the next IN value. ". I would think there's no need to check the right column(s) if the leftmost one has no match...




Suppose your query is:

   SELECT * FROM tab WHERE key1=1 AND key2 IN (2,3,4,5);
SQLite starts by doing a single b-tree lookup on the index on (1,2) - composed from the key1 field and the first possibility of the key2 field. If that works, then it proceeds to look up (1,3), (1,4), and (1,5). But if the (1,2) lookup fails, then it backs off and tries just (1,) to see if that matches anything at all. If (1,) finds any record, the search proceeds with (1,3), (1,4),etc. But if (1,*) fails, the search stops immediately.

The insight here is that a multi-column key value can be resolved using a single binary search. It is not a sequence thing where we first look for the key1=1 and then do a separate lookup in a subtree for key2. Both key1 and key2 are resolved in the same binary search.


This is what I expected the optimization to be, except I'm still not sure that I understand the wording of "that match the columns to the right", I'd expect that to be "that match the columns to the left", after all, you're checking the existence of (1,* ), not of (* ,2) or (* ,3).


That makes sense. I completely misunderstood the release note. Thanks for the clarification.

The insight is very interesting. I always thought there would be 2 separate binary searches...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: