
A couple of us inherited a machine learning project a while back. The code was horrible, riddled with copy-paste (nearly half of the entire thing was duplicated, with no code reuse). We basically refactored everything and standardized the input and output file names. We put up a small Flask service so outside services could hit it easily, and wrapped it in a Docker container so it was easy to deploy. Yes, it was all plumbing. However, we also looked at the code and the ML strategies, and while there was "some" level of competence, it was nothing more than word2vec add and divide. Totally horrible for actually finding the key phrases that matter to the subject we were matching. So we started tackling that too with an LSTM, but our time got cut short and we were shifted off to another area. So not only was the "scientist" they hired completely crappy at the engineering, they weren't really helpful on the ML side either.

This is obviously of lesser value to the topic at hand, and more about making sure you hire good people I think.
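For anyone unfamiliar with the "word2vec add and divide" approach mentioned above, here is a rough sketch of what it usually means: sum the word vectors of a text and divide by the count. The vectors and the `average_embedding` helper below are made up for illustration (real word2vec vectors are learned and much larger), and this is not the inherited project's actual code.

```python
# Toy 3-dimensional "embeddings". These values are invented for the example;
# real word2vec vectors are learned and typically have 100+ dimensions.
TOY_VECTORS = {
    "the": [0.0, 0.0, 1.0],
    "model": [1.0, 0.0, 0.0],
    "overfits": [1.0, 2.0, 0.0],
}


def average_embedding(tokens, vectors):
    """Add the vectors of all known tokens, then divide by how many were found."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None  # no known tokens, so there is nothing to average
    dim = len(known[0])
    summed = [sum(vec[i] for vec in known) for i in range(dim)]
    return [s / len(known) for s in summed]


a = average_embedding(["the", "model", "overfits"], TOY_VECTORS)
b = average_embedding(["overfits", "the", "model"], TOY_VECTORS)
print(a == b)  # word order has no effect on the result
```

Every token gets the same weight and word order is discarded, which is one plausible reason this kind of averaging struggles to surface the key phrases in a text.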




This is 100% my experience. I got hired as an ML engineer to bring a data scientist's models into production. I did the same as you, tearing the whole thing apart and engineering it properly. I also looked at the models, and oh boy... that data scientist had no idea what he was doing. He couldn't explain why he chose the model, didn't have any performance metrics (or even know which metric to use to measure performance), and just generally did not understand the basic concepts of his field. I had to try really hard to drag answers out of him, and in the end I still came away dissatisfied.


I am curious what your take is on things like this article:

https://managingml.substack.com/p/the-myth-that-machine-lear...

It has been my experience too. Basically, ML / DS engineers get thrown under the bus for being poor general software engineers, but in practice it's totally the opposite.


The problem is that ML engineers are not the people who wrote GP's garbage code. Data scientists wrote it, and I know at least a few of my very intelligent, high-functioning data scientist colleagues who are alarmingly, astoundingly bad programmers.


For me it's just the one experience since I haven't had any other interactions with an ML / DS person since.


The phrase "If you can't dazzle them with brilliance, baffle them with bullshit" comes to mind.



