
Random memory access isn't really constant time once you factor in hardware prefetching and cache lines. See Ulrich Drepper's "What Every Programmer Should Know About Memory".
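To make the cache-line point concrete, here is a rough sketch that counts misses in a toy cache for sequential versus shuffled access over the same array. Everything in it is an assumption for illustration: the 64-byte line, the 8-byte element, the 32 KiB fully associative LRU cache, and the `count_misses` helper. It does not model hardware prefetching at all, which on real hardware tends to favor the sequential case even further.

```python
import random
from collections import OrderedDict

LINE_BYTES = 64     # assumed cache-line size
ELEM_BYTES = 8      # assumed element size (e.g. a 64-bit integer)
CACHE_LINES = 512   # assumed capacity: 512 lines = 32 KiB


def count_misses(indices):
    """Count misses in a toy fully associative LRU cache (hypothetical model)."""
    cache = OrderedDict()
    misses = 0
    for i in indices:
        line = (i * ELEM_BYTES) // LINE_BYTES  # which cache line holds element i
        if line in cache:
            cache.move_to_end(line)            # mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > CACHE_LINES:
                cache.popitem(last=False)      # evict the least recently used line
    return misses


N = 1 << 16  # 65,536 elements = 512 KiB, much larger than the toy cache
sequential = list(range(N))
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)  # fixed seed so runs are repeatable

seq_misses = count_misses(sequential)
rand_misses = count_misses(shuffled)
print("sequential misses:", seq_misses)
print("shuffled misses:  ", rand_misses)
```

In this model the sequential pass misses once per cache line (one miss per eight elements), while the shuffled pass misses on most accesses because the line it needs has usually been evicted already.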



That, and regardless of row or column orientation, it would be nice to have any common in-memory format that is actually as well established in Big Data projects as this one looks like it will be.


Exactly this. The most basic example is an operation on a single column. If that column's data is interleaved with other fields in a row-oriented layout, each cache line you load holds only a few of the values you need, so you have to pull a new chunk of memory into the CPU cache sooner for a given amount of data processed. If all of the column's values are packed together tightly, you can read many more of them out of the cache before you need to fetch more from main memory.
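A back-of-the-envelope sketch of the single-column case: count how many distinct cache lines you touch when scanning one column under each layout. The numbers are assumptions picked to keep the arithmetic simple (64-byte cache lines, 8 fields per row, 8 bytes per field, data starting at address 0), and the helper names are made up for this example.

```python
LINE_BYTES = 64   # assumed cache-line size
FIELD_BYTES = 8   # assumed width of every field
FIELDS = 8        # assumed fields per row, so one row is exactly 64 bytes


def lines_touched(addresses):
    """Number of distinct cache lines covering the given byte addresses."""
    return len({addr // LINE_BYTES for addr in addresses})


def row_layout_addresses(n_rows, column):
    # Row-oriented: rows are stored back to back, so consecutive values
    # of one column sit a full row (FIELDS * FIELD_BYTES bytes) apart.
    row_bytes = FIELDS * FIELD_BYTES
    return [r * row_bytes + column * FIELD_BYTES for r in range(n_rows)]


def column_layout_addresses(n_rows, column):
    # Column-oriented: each column is one contiguous array of n_rows values.
    base = column * n_rows * FIELD_BYTES
    return [base + r * FIELD_BYTES for r in range(n_rows)]


n_rows = 10_000
row_lines = lines_touched(row_layout_addresses(n_rows, column=3))
col_lines = lines_touched(column_layout_addresses(n_rows, column=3))
print("row-oriented lines touched:   ", row_lines)
print("column-oriented lines touched:", col_lines)
```

With these assumptions the row-oriented scan touches one cache line per row (10,000 lines), while the columnar scan fits eight values per line (1,250 lines), so it moves an eighth as much memory for the same work.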



