You can do that with other tools too. https://duckdb.org/docs/data/csv/overview....

slt2021 · on Oct 6, 2023

Interesting, but I would still prefer pandas for data cleansing/manipulation, because I'm not limited by SQL syntax: I can always use df.apply() and/or any Python package for custom processing.
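
A minimal sketch of the kind of custom processing I mean. The data, the column names, and the `clean_price` helper are all made up for illustration:

```python
import pandas as pd

# Hypothetical messy input; the column names are illustrative only.
df = pd.DataFrame({
    "item": [" Apple", "pear ", "FIG"],
    "price": ["$1.50", "2", "n/a"],
})

def clean_price(raw):
    """Hypothetical helper: parse a price string, NaN if it can't be parsed."""
    try:
        return float(raw.replace("$", "").strip())
    except ValueError:
        return float("nan")

# Vectorized string cleanup for the simple case...
df["item"] = df["item"].str.strip().str.lower()
# ...and arbitrary Python logic via apply() for the awkward one.
df["price"] = df["price"].apply(clean_price)
```

The point is that `clean_price` is ordinary Python, so it could call any package you like. The trade-off is that apply() runs row by row, so it is usually slower than built-in vectorized operations.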

pandas with the Apache Arrow backend is also high performance and compatible with cloud-native data lakes.

Plus, compatibility with the sklearn package is a killer feature: with just a few lines you can bolt an ML model on top of your data.
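
Roughly what "a few lines" looks like, as a sketch only. The toy data and column names are invented, and LinearRegression is just one arbitrary choice of model:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy data invented for illustration: price is exactly 0.2 * sqft.
df = pd.DataFrame({
    "sqft": [500, 750, 1000, 1250],
    "price": [100, 150, 200, 250],
})

# scikit-learn estimators accept DataFrames/Series directly.
model = LinearRegression().fit(df[["sqft"]], df["price"])

# Predict for a new row, passed as a DataFrame with the same column name.
pred = model.predict(pd.DataFrame({"sqft": [1500]}))
```

Here `pred` is a one-element NumPy array, and for this perfectly linear toy data it comes out at about 300.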