
I thought the usual mantra for performance-critical functions is to implement them in C and call them from Python.



Not always necessary. I do a lot of ETL-type work in Python, and generally the idea is to never pull everything into memory at once if you don't need to. This means leveraging generators quite a bit: read a row, process a row, generate a row, pass it along the pipeline. I've written scripts that can process a 10M-line CSV file with the same memory footprint as a 1k-line one.
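A minimal sketch of that kind of pipeline, assuming a CSV with a header row; the stage names (`read_rows`, `clean`, `write_rows`) are hypothetical, not from any particular library:

```python
import csv

def read_rows(path):
    # Yield one dict per row; the file is never fully loaded.
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def clean(rows):
    # Example transform stage: strip whitespace from every field.
    for row in rows:
        yield {k: v.strip() for k, v in row.items()}

def write_rows(rows, path, fieldnames):
    # Consumes the pipeline one row at a time while writing.
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

# Chaining the stages: only one row is ever in flight, so memory
# use stays roughly constant whether the input is 1k or 10M lines.
# write_rows(clean(read_rows("in.csv")), "out.csv", ["name", "email"])
```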

If you do have to read everything into memory, because you're doing some sort of transform like a list to a dict, explicitly deleting variables can help rather than waiting for them to go out of scope. But this should be pretty rare.
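A small sketch of that, assuming the list is the only remaining reference; the function and field names are made up for illustration:

```python
import csv

def build_index(path):
    # Whole file in memory: needed here because we're restructuring
    # the data from a list into a dict keyed by id.
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    by_id = {row["id"]: row for row in rows}
    # Drop the list's reference now instead of at scope exit, so
    # CPython can reclaim it before any further heavy processing.
    del rows
    return by_id
```

Note that `del` only removes the local name; the memory is actually freed only if no other reference to the list exists.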


Or just port to Julia, Scheme or Common Lisp, while enjoying the power of dynamic languages with compilation to native code.




