Interesting, but I would still prefer pandas for data cleansing/manipulation, simply because I'm not limited by SQL syntax: I can always use df.apply() and/or any Python package for custom processing.
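To make the custom-processing point concrete, here is a minimal sketch. The column names, sample rows, and the `normalize_phone` helper are all made up for illustration; the only pandas calls used are the `.str` accessor and `Series.apply()`.

```python
import re

import pandas as pd

# Hypothetical messy input, invented for this example.
df = pd.DataFrame({
    "name": ["  Alice ", "BOB", "carol  "],
    "phone": ["(555) 123-4567", "555.987.6543", "n/a"],
})


def normalize_phone(raw):
    """Keep digits only; return None unless exactly 10 digits remain."""
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == 10 else None


# Built-in vectorized string methods cover the simple cases...
df["name"] = df["name"].str.strip().str.title()

# ...and apply() runs arbitrary Python for everything else.
df["phone"] = df["phone"].apply(normalize_phone)

print(df)
```

Worth noting that apply() calls the function once per value in plain Python, so it trades speed for flexibility on large frames.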
Using pandas with the Apache Arrow backend also improves performance and keeps it compatible with cloud-native data lakes.
Plus, compatibility with scikit-learn is a killer feature: with just a few lines you can bolt an ML model on top of your data.
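As a rough sketch of the "few lines" claim, assuming scikit-learn is installed: the data below is invented, and `LinearRegression` is just one arbitrary choice of estimator, since scikit-learn estimators accept DataFrames directly.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data: price is exactly 0.2 * sqft, so the fit is perfect.
df = pd.DataFrame({
    "sqft": [800, 1000, 1200, 1500, 2000],
    "price": [160, 200, 240, 300, 400],
})

# Fit straight from DataFrame columns, no conversion step needed.
model = LinearRegression()
model.fit(df[["sqft"]], df["price"])

# Predict for a new row, using the same column name as in training.
new_rows = pd.DataFrame({"sqft": [1800]})
predicted = model.predict(new_rows)
print(predicted)
```

In a real pipeline you would hold out a test set and validate before trusting any prediction; this only shows how little glue code is needed.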
https://duckdb.org/docs/data/csv/overview.html
https://duckdb.org/docs/data/parquet/overview
https://duckdb.org/docs/data/multiple_files/overview.html