
It is an unfortunate misconception that statistical probability can be used to predict the future.

Any time you extend a statistical model temporally it immediately becomes mathematically invalid since probabilistic statistics are only valid for a fixed population at a fixed moment in time.

Unfortunately, business and government are rife with people predicting the future based on statistical models that have no more mathematical validity than reading tea leaves.




What??? Prediction is certainly a type of extrapolation, but to claim that it's "mathematically invalid" reveals a severe lack of knowledge on your part. In fact, under parametric assumptions about the data generating mechanism, we can exactly quantify the expected coverage of prediction intervals. That's literally a standard topic in an introductory statistics course.
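To make the coverage claim concrete, here is a minimal simulation sketch. It assumes the simplest parametric setup I could pick for illustration: i.i.d. normal data with unknown mean and known standard deviation. The function names `prediction_interval` and `simulate_coverage` and all default parameter values are made up for this example.

```python
import math
import random
from statistics import NormalDist, fmean


def prediction_interval(sample, sigma, level=0.95):
    """Interval for ONE future draw from N(mu, sigma^2).

    Assumption (illustrative): mu is unknown, sigma is known.
    The future draw minus the sample mean has variance
    sigma^2 * (1 + 1/n), which gives the half-width below.
    """
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma * math.sqrt(1 + 1 / n)
    centre = fmean(sample)
    return centre - half_width, centre + half_width


def simulate_coverage(trials=20000, n=10, mu=3.0, sigma=2.0, level=0.95, seed=0):
    """Fraction of trials in which a new observation lands inside the interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = prediction_interval(sample, sigma, level)
        future = rng.gauss(mu, sigma)  # the "future" value being predicted
        hits += lo <= future <= hi
    return hits / trials


if __name__ == "__main__":
    print(simulate_coverage())
```

Under these assumptions the printed fraction should sit close to the nominal 0.95. If the data generating mechanism changes between the sample and the future draw, the guarantee no longer applies, which is exactly what the parametric assumptions are for.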


Hello, I think Calafrax is probably right. :o) I think you implicitly agree because you say "under parametric assumptions...", which means you know what's going on; but to make the point:

Statistics as we know it "works" (can be derived) under the assumptions of controlled experimental data. As a thought experiment think about the weather - we know that if we build a classifier that predicts the weather in my garden tomorrow based on the history of the weather in my garden it will do very badly. Why - well because weather is very very very complex; the range of behavior is vast. But worse, it's unstable. The weather in my garden is driven by several complex systems; the ocean, the atmosphere, the earth's orbit and sol! Statistics can't predict the future of the weather in my garden.

Statistics also can't predict other things like the future of the financial markets (not least because if you find a statistical law about that then you will act on it and then screw it up).

It's important to me to bang on about this because there are loads of people who sit through their introductory courses and read the example of predicting a biased roulette wheel. Years later they end up running the company/country/community that I live in and they have a view that they can use the same principles to do it... and this thinking leads to nasty surprises for me.


> if we build a classifier that predicts the weather in my garden tomorrow based on the history of the weather in my garden it will do very badly

Give me hourly readings of temperature, wind speed, wind direction, precipitation, cloud cover and barometric pressure for the last 10 years and I can give you a very accurate prediction of tomorrow's weather in your garden.


Hello, interestingly governments and private industries have invested a very large amount of money in the launch of satellites, development of supercomputers and code and the training of forecasters to interpret them.

Many years ago I actually seriously tried to do what you describe above; I tried out all sorts of things around seasonal analysis and other features. What kills it is the chaotic nature of UK weather due to the jet stream and the NAO.


is that a joke? weather predictions are notoriously unreliable even though they are given with extreme granularity.

that aside you are missing a larger point. if you predict the future based on past data all you are saying is "the future will be the same as the past." you aren't predicting anything. you will be wrong every single time something novel occurs, which is pretty frequently in the real world.


The perception that weather forecasting is notoriously unreliable is mostly false: https://mobile.nytimes.com/2012/09/09/magazine/the-weatherma...


From your link: "Why are weather forecasters succeeding when other predictors fail? It’s because long ago they came to accept the imperfections in their knowledge. That helped them understand that even the most sophisticated computers, combing through seemingly limitless data, are painfully ill equipped to predict something as dynamic as weather all by themselves. So as fields like economics began relying more on Big Data, meteorologists recognized that data on its own isn’t enough."


Quantifying uncertainty is one of the main points of statistics. Don't confuse the limitations of point estimates provided by machine learning techniques with all of statistical practice.


i am not sure what that article is supposed to prove. it doesn't contain any study results on the accuracy of meteorological predictions.

I don't have the data handy, but to the best of my recollection weather forecasting for high/low temperature and precipitation does pretty well for the range of 24-48 hours but declines steadily in accuracy, and is no better than a random guess around 2 weeks out.

That said, you are not addressing my other point, which is that "weather prediction" is just saying "things are going to stay the same." You are always starting with a set of conditions and then looking at your records and seeing what happened in similar conditions and predicting that the same thing will happen again.

Predicting that things will stay the same may come out as better than a random guess in many cases, but it will still be 100% wrong in cases where something novel happens.


The point is, statistical prediction is definitely a thing, and is not "mathematically invalid" - it's mathematically well defined, with predictable consequences (increasing variance as the extrapolation becomes greater). Certainly, statistical models are not crystal balls, but they never claimed to be. If you have a reasonable frequentist model and good data about an ongoing process, you should be able to make predictions with reasonable confidence bounds. If you have a reasonable Bayesian model and good data about an ongoing process, you should be able to coherently quantify your uncertainty about the future state of the system.

Obviously, this is more or less feasible in practice, depending on the phenomenon under study. Calling markets unpredictable is not evidence against the existence of rigorous frameworks for statistical prediction.

Don't let bad experiences with inexperienced and overconfident practitioners blind you to established, uncontroversial, mathematical truths.
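As a rough illustration of "increasing variance as the extrapolation becomes greater", here is a small sketch for a straight-line trend fitted by least squares. It assumes normal errors with a known standard deviation `sigma` (a simplification chosen so the example needs only the standard library). The helper name `half_width` and the time points are hypothetical.

```python
import math
from statistics import NormalDist, fmean


def half_width(xs, x0, sigma, level=0.95):
    """Half-width of the prediction interval at x0 for a least-squares line.

    Assumption (illustrative): errors are N(0, sigma^2) with sigma known.
    The prediction error variance is
        sigma^2 * (1 + 1/n + (x0 - x_bar)^2 / Sxx),
    so it grows as x0 moves away from the centre of the observed xs.
    """
    n = len(xs)
    x_bar = fmean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sigma * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)


# Observations at times 1..20; predictions further and further ahead.
observed_times = list(range(1, 21))
widths = {x0: half_width(observed_times, x0, sigma=1.0) for x0 in (21, 30, 50, 100)}

if __name__ == "__main__":
    for x0, w in widths.items():
        print(f"t = {x0:>3}: +/- {w:.2f}")
```

The interval never becomes undefined as you move forward in time; it just gets wider, which is the model reporting its own growing uncertainty. Whether the straight-line assumption still holds that far out is a separate modelling question.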



