I wonder if other data processing frameworks are also getting 'good enough' for many use cases, often with far less complexity than Spark. With Hadoop, many companies bought clusters because they thought they needed them, but those clusters rotted because the suitable workloads never actually materialized. Does Databricks face a similar risk? Just based on users of our Python reporting framework[0], we are starting to see Dask/Ray displacing workloads that would have gone to Spark a few years ago. How quickly that space is progressing (e.g. Arrow/DuckDB[1]) is pretty incredible to watch.

[0] https://github.com/datapane/datapane

[1] https://duckdb.org/
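
To make the 'good enough' claim concrete, here's a minimal sketch of the sort of single-node job that a few years ago would have meant spinning up a Spark cluster. duckdb.sql() and .arrow() are part of DuckDB's Python API; the 'events.parquet' file and its columns are invented for illustration:

    # Requires: pip install duckdb pyarrow
    import duckdb

    # Aggregate a local Parquet file on one machine -- no cluster needed.
    # 'events.parquet' and 'user_id' are hypothetical placeholders.
    top_users = duckdb.sql("""
        SELECT user_id, count(*) AS n_events
        FROM 'events.parquet'
        GROUP BY user_id
        ORDER BY n_events DESC
        LIMIT 10
    """).arrow()  # result comes back as a pyarrow.Table

    print(top_users)

Because the result is an Arrow table, it can be passed to pandas, Polars, Dask, or Ray with little or no copying, which is a big part of why these tools compose so well.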
