I mostly do big data work, and a lot of that involves cleansing, so working with large Excel/CSV/other tabular data formats is second nature to me. I'm familiar with many technical tools that I could use as force multipliers (e.g. serverless SQL processing, OLAP, RDBMS, Python pandas, Spark). Hence my interest: if we relax the condition that every single row is accounted for, my productivity would likely be pretty high.
Customers always find the rows that are not accounted for, and it's bad form to drop data. We sell high-quality data, which you can't claim if you can't account for every single row. Be careful about delivering an incomplete dataset.
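For what it's worth, accounting for every row doesn't have to mean fixing every row by hand. A minimal sketch of one way to do it, assuming a cleansing step that routes bad rows to a reject list with a reason instead of dropping them. The `cleanse` function, the `amount` column, and the sample data are all made up for illustration; the same idea carries over to pandas or Spark.

```python
import csv
import io


def cleanse(rows):
    """Split rows into (clean, rejected) without discarding any input row."""
    clean, rejected = [], []
    for row_no, row in enumerate(rows, start=1):
        amount = (row.get("amount") or "").strip()
        try:
            # Hypothetical rule: a row is clean if "amount" parses as a number.
            clean.append({**row, "amount": float(amount)})
        except ValueError:
            # Keep the failing row, its position, and why it failed.
            rejected.append(
                {**row, "source_row": row_no, "reason": "amount is not numeric"}
            )
    return clean, rejected


# Made-up sample input: rows 2 and 3 fail the rule.
raw = "id,amount\n1,10.5\n2,abc\n3,\n4,7\n"
rows = list(csv.DictReader(io.StringIO(raw)))

clean, rejected = cleanse(rows)

# Reconciliation: every input row is either clean or rejected.
assert len(clean) + len(rejected) == len(rows)
```

The reject list can then be delivered or reviewed alongside the clean set, so nothing silently disappears between input and output.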