Hacker News




Interesting, but I would still prefer pandas for data cleansing and manipulation, because I'm not limited by SQL syntax: I can always use df.apply() and/or any Python package for custom processing.
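To make the point about custom processing concrete, here is a minimal sketch. The column name, the sample data, and the normalize_phone helper are all made up for illustration, and it uses the per-column Series.apply variant rather than DataFrame.apply:

```python
import re
import pandas as pd

# Hypothetical messy input, invented for this example.
df = pd.DataFrame({"phone": ["(555) 123-4567", "555.987.6543", None]})

def normalize_phone(raw):
    """Keep digits only; return None for missing or malformed values."""
    if pd.isna(raw):
        return None
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == 10 else None

# Arbitrary Python (here the stdlib re module) runs on every value.
df["phone_clean"] = df["phone"].apply(normalize_phone)
```

The same pattern works with any Python function, though apply runs row by row in Python and can be slow on large frames.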

pandas with the Apache Arrow backend is also high-performance and compatible with cloud-native data lakes.

Plus, compatibility with the sklearn package is a killer feature: with just a few lines you can bolt an ML model on top of your data.
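As a rough sketch of what "a few lines" can look like, assuming scikit-learn is installed; the sqft and price columns and their values are invented for illustration, and LinearRegression is just one possible model:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical, perfectly linear toy data (price = 0.2 * sqft).
df = pd.DataFrame({
    "sqft": [500, 750, 1000, 1250],
    "price": [100, 150, 200, 250],
})

# scikit-learn accepts DataFrame columns directly as features and target.
model = LinearRegression().fit(df[["sqft"]], df["price"])

# Write the predictions back into the same DataFrame.
df["predicted"] = model.predict(df[["sqft"]])
```

Real data would need a train/test split and some evaluation, but the hand-off from a DataFrame to a model is about this short.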



