
I'm surprised that data.table is so fast, and that pandas is so slow relative to it. It does explain why I've occasionally had memory issues on ~2GB data files when performing moderately complex operations. (To be fair, it's a relatively old Xeon with 12GB of RAM.) I'll have to learn the nuances of data.table syntax now.



I'm convinced that data.table is wizardry.

For anyone who's turned off by dt[i, j, by=k], Andrew Brooks has a good set of examples at http://brooksandrew.github.io/simpleblog/articles/advanced-d.... DataCamp's cheat sheet is also a good resource: https://s3.amazonaws.com/assets.datacamp.com/blog_assets/dat....
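
If the dt[i, j, by=k] form looks opaque, it may help to read it as "take the rows matching i, then compute j, grouped by k". Here is a rough plain-Python sketch of that reading. It is only a mental model, not how data.table is implemented, and the `dt_query` helper, the sample rows, and the column names are all made up for illustration.

```python
from collections import defaultdict


def dt_query(rows, i, j, by):
    """Conceptual reading of dt[i, j, by=k] (hypothetical helper, not a real API).

    rows: list of dicts, one per row
    i:    predicate that selects rows
    j:    function computed over each group of rows
    by:   name of the column to group on
    """
    # Step 1 (i): keep only the rows that match the filter.
    kept = [row for row in rows if i(row)]

    # Step 2 (by): split the surviving rows into groups.
    groups = defaultdict(list)
    for row in kept:
        groups[row[by]].append(row)

    # Step 3 (j): compute one result per group.
    return {key: j(group) for key, group in groups.items()}


# Made-up sample data.
rows = [
    {"city": "A", "year": 2019, "sales": 10},
    {"city": "A", "year": 2020, "sales": 20},
    {"city": "B", "year": 2020, "sales": 5},
    {"city": "B", "year": 2020, "sales": 7},
]

# Roughly what dt[year == 2020, sum(sales), by = city] asks for in R.
result = dt_query(
    rows,
    i=lambda row: row["year"] == 2020,
    j=lambda group: sum(row["sales"] for row in group),
    by="city",
)
print(result)
```

The order of the arguments mirrors the order inside the brackets: filter first, computation second, grouping last.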


Are you using pandas >= 1.0? I noticed a big speedup after upgrading, without changing my code.
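
A quick way to check which version is installed, as a sketch using only the standard library (Python 3.8+). The `installed_version` and `at_least` helpers are names I made up, and the comparison only looks at the major and minor numbers.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def at_least(version_string, major, minor=0):
    """True if version_string is >= major.minor (ignores patch and suffixes)."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)


found = installed_version("pandas")
if found is None:
    print("pandas is not installed in this environment")
else:
    print(f"pandas {found}; >= 1.0: {at_least(found, 1, 0)}")
```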



