
Ironically, he didn't even mention indexes in his description (which he admitted was simplified). A good query optimizer will do wonders: it will not only come up with the appropriate hints for the query plan, but will also dynamically adjust those hints based on the underlying data patterns.

The example he provided,

"So a row was the unit of information and rows were stored sequentially on disk. Row orientation works for small amounts of data. But think about what happens when there are lots of rows and the user wants all rows where the license starts with 123 and the color is blue or black. In a naive system the application has to read every single byte of data from the disk."

is something no modern database would ever do. The real challenge is not reading only the records that start with 123 or that are blue/black; that part is trivially handled by every database engine I'm familiar with. The real query challenge is: do you filter on license # or on color first? (If there are 1k records starting with 123 and 5 million blue/black vehicles, the order is pretty critical for performance.) That's one of the features that distinguishes query optimizers.
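
To make the ordering point concrete, here is a rough sketch of my own (not how any particular engine works): estimate each predicate's selectivity from a sample, then apply the most selective predicate first. The table, the helper names, and the sample size are all made up for illustration; a real optimizer would lean on indexes and stored statistics rather than scanning Python lists.

```python
import random

def make_rows(n, seed=0):
    """Build a hypothetical vehicle table with random licenses and colors."""
    rng = random.Random(seed)
    colors = ["blue", "black", "red", "white", "green", "silver"]
    return [
        {"license": f"{rng.randrange(10**6):06d}", "color": rng.choice(colors)}
        for _ in range(n)
    ]

def starts_with_123(row):
    return row["license"].startswith("123")

def blue_or_black(row):
    return row["color"] in ("blue", "black")

def estimate_selectivity(pred, rows, sample_size=1000, seed=1):
    """Fraction of a random sample that passes the predicate."""
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    return sum(1 for r in sample if pred(r)) / len(sample)

def plan(rows, preds):
    """Order predicates so the one expected to keep the fewest rows runs first."""
    return sorted(preds, key=lambda p: estimate_selectivity(p, rows))

def run_filters(rows, preds):
    """Apply predicates in order; return surviving rows and total predicate evaluations."""
    evaluations = 0
    current = rows
    for pred in preds:
        evaluations += len(current)
        current = [r for r in current if pred(r)]
    return current, evaluations

if __name__ == "__main__":
    rows = make_rows(100_000)
    color_first, cost_color_first = run_filters(rows, [blue_or_black, starts_with_123])
    planned, cost_planned = run_filters(rows, plan(rows, [blue_or_black, starts_with_123]))
    print("color first:", cost_color_first, "evaluations")
    print("planned order:", cost_planned, "evaluations")
```

Both orders return the same rows; the only thing that changes is how much work the second predicate has left to do.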

Columnar databases are awesome when you have columnar data to work with - I've seen 20-30x reductions in disk storage in the wild (and you can obviously create synthetic examples that go way north of that) - but a well-indexed SQL database backed by a solid query optimizer/planner can probably hold its own against a columnar database in terms of lookup performance, particularly if your data is row-oriented to begin with.
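
Part of why those storage reductions are possible is that values within a single column tend to repeat, so they compress well. A minimal sketch of run-length encoding, one commonly used technique (I'm not claiming this is what any specific columnar engine does, and the function names are just mine):

```python
from itertools import groupby

def rle_encode(column):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(column)]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]

if __name__ == "__main__":
    # A hypothetical color column, sorted so that equal values sit together.
    colors = ["black"] * 4 + ["blue"] * 3 + ["red"] * 2
    runs = rle_encode(colors)
    print(len(colors), "values stored as", len(runs), "runs")
    assert rle_decode(runs) == colors
```

The same trick does very little for a row-oriented layout, where a color is followed by a license number rather than by another color.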



