I'm surprised that data.table is so fast, and that pandas is so slow relative to it. It does explain why I've occasionally had memory issues on ~2GB data files when performing moderately complex operations. (To be fair, it's a relatively old Xeon w/ 12GB RAM.) I'll have to learn the nuances of data.table syntax now.