The Fallacy of Placing Confidence in Confidence Intervals (learnbayes.org)
47 points by xtacy on Sept 13, 2015 | 9 comments



Statistician here, 90+% frequentist

It is not easy to be a Bayesian or a frequentist. For insights, I recommend Brad Efron's article entitled `Why isn't everyone a Bayesian?', widely available online and not at all technical or pedantic. One of the difficulties is that the Bayes prescription asks a lot of the user. You have to supply a lot of distributional information. In high dimensional problems, the mathematically convenient answers might not be benign. [The article predates an explosion in Bayesian computational methods but is still relevant.]

From the Bayesian side, I recommend reading Andrew Gelman's blog. He points out many of the failings of frequentist approaches, but also acknowledges that many of the same things can happen with Bayesian approaches.

The submarine example looks to me to be very contrived. It is one where all the information you need for a Bayesian approach is simply given. Real world problems are messier. For instance, what prior distribution should a drug company assign to the effectiveness of a new drug they've spent half a billion dollars developing? Should the FDA use the same distribution?

I'm pretty sure that a partisan frequentist could come up with examples where the Bayes approach fails, maybe in a funny way to boot.

There are tricky interpretation issues with confidence intervals, and they depend on assumptions in the `all models are wrong but some are useful' way. But a much bigger problem is p-hacking and unacknowledged multiple testing, which lead to more false discoveries. Much the same would happen in practice with Bayesian approaches. Those false discoveries are valuable to people who think something is better than nothing when reporting their findings. They can get what they want by Bayesian means, not just by frequentist ones.
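To put a number on the multiple-testing point, here is a toy simulation of my own (made-up numbers, nothing to do with the article's examples): run 20 tests of true null hypotheses at the 5% level, and the chance of at least one "significant" result is about 1 - 0.95^20, roughly 64%.

    import numpy as np

    rng = np.random.default_rng(0)
    n_sims, n_tests, n = 10_000, 20, 30
    any_hit = 0
    for _ in range(n_sims):
        # 20 independent samples, all drawn from N(0, 1): every null hypothesis is true
        data = rng.normal(0.0, 1.0, size=(n_tests, n))
        z = data.mean(axis=1) * np.sqrt(n)     # z-statistics, sigma = 1 known
        if np.any(np.abs(z) > 1.96):           # any of the 20 tests "significant" at 5%?
            any_hit += 1
    print(any_hit / n_sims)                    # about 0.64, not 0.05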

I'm not worried that the confidence interval either does or does not contain the true value. The same is true of a credible interval. The Bayes approach gets a probability statement by making the true value random, and the frequentist approach gets one by making the endpoints random. If you want a probability and you want an interval, it seems like you're stuck with this is-or-isn't issue. One way to look at your interval is as a lottery ticket. The winning number has already been drawn but not announced. You can state a probability in terms of how many tickets were sold, but really you either won or you didn't. The probability is still useful in thinking about the interval.
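To make the random-endpoints vs. random-true-value distinction concrete, here is a toy sketch of my own (assuming a normal model with known sigma and a normal prior, purely for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    sigma, n = 1.0, 25
    x = rng.normal(0.3, sigma, size=n)   # one data set; the true mean is unknown to the analyst
    xbar = x.mean()

    # Frequentist 95% CI: the endpoints are the random things
    half = 1.96 * sigma / np.sqrt(n)
    ci = (xbar - half, xbar + half)

    # Bayesian 95% credible interval under a N(0, 1) prior on the mean:
    # the mean itself is treated as random, conditioned on the observed data
    mu0, tau0 = 0.0, 1.0
    post_prec = 1 / tau0**2 + n / sigma**2
    post_mean = (mu0 / tau0**2 + n * xbar / sigma**2) / post_prec
    post_sd = np.sqrt(1 / post_prec)
    cred = (post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd)

    print(ci, cred)   # numerically close here, but the probability statements differ

The two intervals can come out nearly identical, yet only the second is a direct probability statement about the mean given this particular data set.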


Could someone explain what this implies?

As a non-mathematical person who sometimes uses these tests in biological experiments, I find it frustrating that statisticians can't just give us something we could use with confidence -- something that actually tells us what we think it tells us...


The beginning of the article discusses common fallacious interpretations of confidence intervals (i.e. what most of us probably naively think a confidence interval means, e.g. that a 95% interval of 10 ± 2 means there is a 95% probability that the true value is between 8 and 12). The authors then go back to basics and explain what a confidence interval actually is (that, on average, experiments conducted the way this experiment was would have had a 95% chance of capturing the correct value inside the interval -- a subtle but important distinction). Then they create a neat example which shows how four different approaches to calculating confidence-interval-like things work in practice. Then they go into technical details I couldn't be bothered reading. Finally, they suggest that Bayesian "credible intervals" do a much better job of providing what people intuit a confidence interval to be.
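Here is a small simulation of my own (made-up numbers, normal data with known sigma) of that "on average over repeated experiments" reading: any single interval either contains the true mean or it doesn't, but about 95% of the intervals do in the long run.

    import numpy as np

    rng = np.random.default_rng(0)
    true_mean, sigma, n, n_experiments = 10.0, 3.0, 50, 10_000

    covered = 0
    for _ in range(n_experiments):
        x = rng.normal(true_mean, sigma, size=n)   # one experiment's data
        half = 1.96 * sigma / np.sqrt(n)           # 95% CI half-width, sigma known
        lo, hi = x.mean() - half, x.mean() + half
        covered += (lo <= true_mean <= hi)         # this particular interval: captured or not
    print(covered / n_experiments)                 # about 0.95 over the whole collection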

As someone trained in classical statistical methods who has not had occasion to really use Bayesian methods, one of the things I like about Bayesian approaches is that they seem much more real-world and practical to me. E.g. most statistical analysis tools assume a bunch of ridiculous and impossible things (e.g. that you do not look at the data until it has all been collected; that every time you look at the data you need to add a "degree of freedom", meaning that looking at the data 10 times produces 10 increasingly less reliable results -- literally NO-ONE does this, and yet it is crucial for the theory to work).

The Bayesian approach is more like "as you get data, make and refine estimates of what the data means" which is how we actually do things, and so -- everything else aside -- we aren't violating the approach's base assumptions every time we use it.
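As a toy illustration of the "peeking" problem mentioned above (my own sketch, not from the article): check for significance after every 10 observations and stop at the first "significant" look, and the false positive rate climbs well above the nominal 5% even when there is no effect at all.

    import numpy as np

    rng = np.random.default_rng(0)
    n_sims, max_n, look_every = 10_000, 100, 10
    false_positives = 0
    for _ in range(n_sims):
        x = rng.normal(0.0, 1.0, size=max_n)       # no effect at all: true mean is 0
        for n in range(look_every, max_n + 1, look_every):
            z = x[:n].mean() * np.sqrt(n)          # z-test with sigma = 1 known
            if abs(z) > 1.96:                      # "significant" at this look?
                false_positives += 1
                break                              # stop collecting and report
    print(false_positives / n_sims)                # well above 0.05 (roughly 0.2)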


> had a 95% chance of capturing the correct value inside the interval

I'm still not clear on this: what does it mean? I also tried to read the linked text, but it's quite dense and drawn out.


The experiment has already taken place and its findings are either correct or incorrect. It's not Schrodinger's Cat: the cat is already either alive or dead.

Imagine two studies of the same quantity have been completed and they have non-overlapping 95% confidence intervals. Does each have a 95% chance of having captured the true value? Obviously not -- at most one of them can be correct.


Thanks! I'm glad to know I had the correct interpretation of confidence intervals!


Well, assuming I got it right :-)


I find it frustrating that statisticians can't just give us something we could use with confidence -- something that actually tells us what we think it tells us...

This is on a slightly different statistical methodology issue, but quoting from Brad Efron (http://statweb.stanford.edu/~ckirby/brad/papers/2005BayesFre...):

The physicists I talked with were really bothered by our 250 year old Bayesian-frequentist argument. Basically there’s only one way of doing physics but there seems to be at least two ways to do statistics, and they don’t always give the same answers.

This says something about the special nature of our field. Most scientists study some aspect of nature, rocks, stars, particles; we study scientists, or at least scientific data. Statistics is an information science, the first and most fully developed information science. Maybe it’s not surprising then that there is more than one way to think about an abstract subject like “information”.


Thanks, that was an interesting read!



