
Ease of use might be the criterion when you are a student. However, as soon as you start to depend on it for a living, you realise that scikit-learn has made serious mistakes, enough that it has lost my trust, and I am forced to pay the ~$15.000 for MATLAB until some alternative is available.



Often I will do some prototyping with scikit-learn and then write my own implementation in numpy / scipy for something that goes into production. But I have used scikit-learn in production as well without issue. I have used MATLAB a bit and it is quite nice for figuring things out / prototyping. But the issue I have with it is that it's not typically intended for production software. So then you often need to reimplement your MATLAB code in whatever "platform" language you're using. That's why I've largely transitioned away from MATLAB.


MATLAB does have code generation, though. Dunno if the library you use supports it, but most code can be exported to C++ or CUDA.


I was not aware of the links you shared pointing out the inconsistencies. I wonder how the authors themselves have responded to these Reddit posts (if at all). Thank you for sharing!

Despite that, it does have some design choices that make it stand out for me compared with libraries in other languages, such as the fit / transform / predict API used consistently across the library, and the usage of joblib as a back-end for speedup - this allows their models to scale easily on clusters with the use of Dask.

I still have confidence that their most used functions (e.g. RandomForest) and models are still correctly implemented and provide great value in that regard.
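
For anyone who hasn't seen the fit / transform / predict convention and the joblib back-end mentioned above, here is a minimal sketch of roughly what I mean. The synthetic data, the parameter values and the choice of the "threading" back-end are my own assumptions for illustration. I have not run this against a Dask cluster; as far as I know, with dask.distributed installed and a Client running, the back-end name "dask" can be used in the same place.

```python
import numpy as np
from joblib import parallel_backend
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, purely for illustration (not from any real project).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Transformers expose fit / transform ...
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# ... and estimators expose fit / predict. A pipeline chains both behind
# the same interface. n_jobs=2 asks joblib to build trees in parallel.
model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=20, n_jobs=2, random_state=0),
)

# The joblib back-end is swappable via a context manager. "threading"
# ships with joblib itself; swapping in "dask" is an assumption that
# needs dask.distributed and a running Client, and is not shown here.
with parallel_backend("threading"):
    model.fit(X, y)

pred = model.predict(X)
print(pred.shape)
```

The point is that the model code stays the same whichever back-end does the work; only the context manager line changes.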



