
In my experience pyspark is much more flaky and annoying than doing parallel computing with more 'python native' tools. It only really makes sense when you've outgrown small clusters and really need huge infrastructure.



What python tools do you use for small clusters?


Dask would be an option.
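Dask's `delayed` interface looks much like plain Python, which is part of its appeal for small clusters; a minimal sketch (assumes `dask` is installed, runs on the default local scheduler):

```python
import dask
from dask import delayed


@delayed
def square(x):
    # Each call builds a lazy task rather than computing immediately
    return x * x


# Build a task graph, then execute it in parallel
tasks = [square(i) for i in range(10)]
results = dask.compute(*tasks)  # returns a tuple of results
```

The same code scales from a laptop to a distributed cluster by pointing it at a `dask.distributed` scheduler, without rewriting the task logic.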


Was going to say that. Or IPython Parallel if you want to go lower level.




