Any idea what makes R's data.table so fast compared to the others?

ZephyrBlu · on March 14, 2021

I read somewhere else (another comment, I think) that it was a ground-up implementation taking a very performance-oriented approach.

Basically it seemed like they really got in the weeds to make it super fast.

ColFrancis · on March 14, 2021

R is from ~2000, while pandas started in 2011. Is it possible that the lack of compute power had an effect on the required performance characteristics?

nojito · on March 14, 2021

data.table is basically a highly optimized C library

https://github.com/Rdatatable/data.table

noir_lord · on March 14, 2021

That's somewhat like libvips, which was started when a 486 was state of the art - fast forward and it's an image-processing monster.

juancb · on March 14, 2021

R is much older than 2000, it's from 1993.

andylynch · on March 14, 2021

And it’s an implementation of S, originally from Bell Labs in 1976

ColFrancis · on March 17, 2021

Thank you. My brief research led to a list of versions that had R 1.0 as 2000, but it appears that v0 lasted a good many years. Pandas was also in v0 for many years, so comparing the two like-for-like is the better approach.