
I thought the usual mantra for performance-critical functions is to implement them in C and call them from Python.



Not always necessary. I do a lot of ETL-type work in Python, and generally the idea is to never pull everything into memory at once if you don't need to. This means leveraging generators quite a bit: read a row, process a row, generate a row, pass it along the pipeline. I've written scripts that can process a 10M-line CSV file with the same memory footprint as a 1k-line one.
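A minimal sketch of that kind of pipeline, assuming a CSV with a header row; the stage names (`read_rows`, `clean`, `write_rows`) are hypothetical, not from any particular library:

```python
import csv

def read_rows(path):
    # Yield one dict per row; the file is never fully loaded.
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def clean(rows):
    # Example transform stage: strip whitespace from every field.
    for row in rows:
        yield {k: v.strip() for k, v in row.items()}

def write_rows(rows, path, fieldnames):
    # Consumes the pipeline one row at a time while writing.
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

# Chaining the stages: only one row is ever in flight, so memory
# use stays roughly constant whether the input is 1k or 10M lines.
# write_rows(clean(read_rows("in.csv")), "out.csv", ["name", "email"])
```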

If you do have to read everything into memory, because you're doing some sort of transform like a list to a dict, explicitly deleting variables can help rather than waiting for them to go out of scope. But this should be pretty rare.
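A small sketch of that, assuming the list is the only remaining reference; the function and field names are made up for illustration:

```python
import csv

def build_index(path):
    # Whole file in memory: needed here because we're restructuring
    # the data from a list into a dict keyed by id.
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    by_id = {row["id"]: row for row in rows}
    # Drop the list's reference now instead of at scope exit, so
    # CPython can reclaim it before any further heavy processing.
    del rows
    return by_id
```

Note that `del` only removes the local name; the memory is actually freed only if no other reference to the list exists.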


Or just port to Julia, Scheme or Common Lisp, while enjoying the power of dynamic languages with compilation to native code.




