Any idea what makes R's data.table so fast compared to the others?



I read somewhere else (another comment, I think) that it was a ground-up implementation that took a very performance-oriented approach.

Basically it seemed like they really got in the weeds to make it super fast.


R is from ~2000, while pandas started in 2011. Is it possible that the lack of compute power had an effect on the required performance characteristics?


data.table is basically a highly optimized C library:

https://github.com/Rdatatable/data.table


That's somewhat like libvips which was started when a 486 was state of the art - fast forward and it's an image processing monster.


R is much older than 2000, it's from 1993.


And it’s an implementation of S, originally from Bell Labs in 1976


Thank you. My brief research led to a list of versions that had R 1.0 in 2000, but it appears that v0 lasted a good many years. pandas was also in v0 for many years, so it is better to compare the two like-for-like.



