
Stripes aren't really "scanned". A stripe is more of a logical concept: it tracks where the physical data for each column lives, so the reader only fetches what it needs.

If I understand what you are asking, let me restate: "can you apply a predicate first on column A before reading column B, so that you can avoid reading column B if the predicate on A doesn't match?".

The answer is: "sometimes". A predicate matches some rows and not others, and matching rows may be interleaved with non-matching rows, so it's not always possible to avoid the IO to read column B. However, if the matching and non-matching rows happen to be separated (e.g. if your table is naturally in time order and you have a time-based predicate), then the reader can apply this optimization and skip column B wherever nothing matches. Please see the section on "Chunk Group Filtering".
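To make the "sometimes" concrete, here is a toy sketch of the idea, not the actual implementation. It assumes each chunk group keeps min/max metadata for column A; the names (ChunkGroup, make_groups, scan) and the group size are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ChunkGroup:
    # Hypothetical per-group metadata: min/max of column A.
    min_a: int
    max_a: int
    col_a: List[int]
    col_b: List[str]


def make_groups(a_values: List[int], size: int) -> List[ChunkGroup]:
    """Split rows into fixed-size chunk groups, in the order they are stored."""
    groups = []
    for i in range(0, len(a_values), size):
        a = a_values[i:i + size]
        groups.append(ChunkGroup(min(a), max(a), a, [f"row{v}" for v in a]))
    return groups


def scan(groups: List[ChunkGroup], lo: int, hi: int) -> Tuple[List[str], int]:
    """Return column B for rows with lo <= A <= hi, plus how many
    chunk groups had their column B data read."""
    results: List[str] = []
    b_reads = 0
    for g in groups:
        # The metadata proves no row in this group can match: skip it
        # without touching column A or column B data.
        if g.max_a < lo or g.min_a > hi:
            continue
        b_reads += 1
        for a, b in zip(g.col_a, g.col_b):
            if lo <= a <= hi:
                results.append(b)
    return results, b_reads


# Rows stored in A order: matching rows sit together in one group.
ordered = make_groups(list(range(12)), size=4)
# Same rows interleaved: every group's [min, max] overlaps the predicate.
mixed = make_groups([0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11], size=4)

ordered_rows, ordered_reads = scan(ordered, lo=4, hi=7)
mixed_rows, mixed_reads = scan(mixed, lo=4, hi=7)
```

Both scans return the same four rows, but the ordered layout reads column B for one group while the interleaved layout has to read it for all three. That is why a table that is naturally in time order benefits from a time-based predicate.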




Thanks for the in-depth explanation! I look forward to exploring it more.



