Great comments. I heartily agree and support the statement about probabilistic graphical models. Just to add a couple more facets to this perspective:

'State of the art' does not always mean 'best for your task'; in fact, depending on your field, SOTA lately sometimes simply means 'unaffordable' for anyone whose budget is under a million dollars.

Try linear methods first.
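
To make that concrete, here's a minimal baseline sketch in Python with scikit-learn (the synthetic dataset is just a stand-in for whatever data you actually have): scale the features, fit a regularized linear model, and get a cross-validated score before reaching for anything fancier.

    # Minimal linear baseline: standardize features, fit logistic
    # regression, and report a cross-validated AUC.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores = cross_val_score(baseline, X, y, cv=5, scoring="roc_auc")
    print(f"linear baseline AUC: {scores.mean():.3f} +/- {scores.std():.3f}")

If something more complex can't clearly beat this number, it probably isn't earning its complexity.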

Ensembles of decent models are usually good models. The calibration problem raised above can be at least somewhat mitigated by using ensemble averages.
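
A hedged sketch of what that can look like in practice (scikit-learn's soft voting averages the predicted class probabilities of several models; the particular members chosen here are arbitrary):

    # Soft-voting ensemble: average predicted probabilities from a few
    # diverse models, then check calibration with the Brier score
    # (lower is better).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import brier_score_loss
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    ensemble = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(random_state=0)),
            ("nb", GaussianNB()),
        ],
        voting="soft",  # average class probabilities, not hard labels
    )
    ensemble.fit(X_tr, y_tr)
    proba = ensemble.predict_proba(X_te)[:, 1]
    print("ensemble Brier score:", brier_score_loss(y_te, proba))

Averaging tends to pull individual members' overconfident predictions toward the middle, which is often (though not always) better calibrated than any single model.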

Don't just assume "the $MODEL will figure it out" because you gave it shitloads of degrees of freedom. Machine learning efficiency comes down to efficiency of representation, and feature engineering can achieve huge payoffs when you incorporate domain knowledge and expertise.
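
A toy illustration of that payoff (the column names and derived features here are hypothetical, purely to show the idea of baking domain knowledge into the representation):

    # Toy feature engineering: encode domain knowledge directly rather
    # than hoping the model rediscovers it from raw columns.
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        "debt": [5_000, 12_000, 300],
        "income": [40_000, 38_000, 52_000],
        "hour_of_day": [23, 9, 14],
    })

    # Domain knowledge: what matters is the debt-to-income ratio,
    # not debt and income separately.
    df["dti"] = df["debt"] / df["income"]

    # Domain knowledge: hour 23 is adjacent to hour 0, so encode time
    # of day cyclically instead of as a raw integer.
    df["hour_sin"] = np.sin(2 * np.pi * df["hour_of_day"] / 24)
    df["hour_cos"] = np.cos(2 * np.pi * df["hour_of_day"] / 24)
    print(df)

A linear model can use "dti" directly; given only the raw columns, it would have to approximate a ratio it can't actually represent.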

Once you gain perspective on the "universality" of statistical methods, optimization, and Bayesian probability theory, your work becomes a lot easier to reason about. As an exercise, see if you can explain why the least-squares fit results from the assumption that model residuals are normally distributed (and what connections this may have to statistical physics!).
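
In case you want to check your answer, here's a sketch of that derivation (assuming i.i.d. Gaussian residuals with fixed variance):

    \begin{aligned}
    y_i &= f(x_i;\theta) + \varepsilon_i,
        \qquad \varepsilon_i \sim \mathcal{N}(0,\sigma^2) \ \text{i.i.d.} \\
    \mathcal{L}(\theta)
        &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
           \exp\!\left(-\frac{\bigl(y_i - f(x_i;\theta)\bigr)^2}{2\sigma^2}\right) \\
    -\log\mathcal{L}(\theta)
        &= \frac{n}{2}\log(2\pi\sigma^2)
           + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f(x_i;\theta)\bigr)^2
    \end{aligned}

Maximizing the likelihood over \theta is therefore exactly minimizing the sum of squared residuals. The statistical-physics connection: the likelihood has the Gibbs/Boltzmann form e^{-E(\theta)/T}, with energy E(\theta) = \sum_i (y_i - f(x_i;\theta))^2 and 2\sigma^2 playing the role of temperature.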




> Try linear methods first

This bears repeating.


I really find it hard to believe that not everyone starts with a linear model.

But apparently that's not hip any more.


Great points, really appreciated. I'll have to put extra effort into learning the feature engineering part of the problem.

Also, if you know a few things about the data, it becomes a little easier to explain what your model is doing and why it produces the results it does.

Found a good resource that explains the trust component (the LIME paper): https://arxiv.org/pdf/1602.04938.pdf



