
First of all, the author of this piece works in a Department of Hygiene and Epidemiology. Research is done differently across disciplines, so it's dangerous to generalize this to other fields. For example, some fields find alpha < 0.05 acceptable and others do not.

But research is very weird indeed. The more conference/journal articles you read, the less you trust them. I mean, say a field accepts results with alpha < 0.05. This means that 5% of everything published is wrong.

Feel free to correct me if you have a better grasp of statistics and find what I say to be wrong.




Actually, having an alpha of value x does NOT mean that 100 * x % of results are false. It only gives you an indication of the coverage of your experiment, which is useful when compared with other, independent studies.
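
A quick simulation makes the distinction concrete (a minimal sketch in Python; the two-sample t-test on pure noise is just an arbitrary example of a true null, not anything from the article):

    # Under a true null hypothesis, p-values are uniform, so alpha is the
    # per-test false positive rate -- it says nothing by itself about what
    # fraction of published results are false.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_experiments, alpha = 10_000, 0.05

    # Each "experiment" compares two samples from the SAME distribution,
    # so every significant result here is a false positive by construction.
    false_positives = 0
    for _ in range(n_experiments):
        a = rng.normal(size=30)
        b = rng.normal(size=30)
        if stats.ttest_ind(a, b).pvalue < alpha:
            false_positives += 1

    print(false_positives / n_experiments)  # ~0.05, i.e. roughly alpha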

I think most scientists don't understand the meaning of the p-value. There was an interesting discussion on that question last year among statistics bloggers, with leading statisticians involved: http://radfordneal.wordpress.com/2009/03/07/does-coverage-ma...


If you only publish results with p < 0.05 then you can't say what percentage are due to chance. It could be all of them. All the p-value tells you is how many experiments would reach that significance level through chance alone. To know how many published results are simply due to chance you'd have to know the prevalence of actual positive results (i.e. ones not due to chance). If actual positives are common, it could be much less than 5% of reported results due to chance; if actual positives are impossible, it could be 100%.
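
A back-of-the-envelope version of this argument (a sketch; the prevalence and power figures are illustrative assumptions, not data):

    # Fraction of "significant" results that are false, as a function of
    # how common real effects are. Prevalence and power are assumptions.
    alpha = 0.05   # chance a null effect passes the significance test
    power = 0.80   # assumed chance a real effect is detected

    for prevalence in (0.5, 0.1, 0.01, 0.0):
        true_pos = prevalence * power
        false_pos = (1 - prevalence) * alpha
        fdr = false_pos / (true_pos + false_pos)
        print(f"prevalence {prevalence:.0%}: {fdr:.0%} of positives are false")

    # prevalence 50%: 6% of positives are false
    # prevalence 10%: 36% of positives are false
    # prevalence 1%: 86% of positives are false
    # prevalence 0%: 100% of positives are false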


The traditional notion of a 5% confidence limit comes from rules of thumb devised for agricultural research stations in the 1930s. The basic framework is that each experiment takes many months, a large plot of land, and plenty of money. You test crop varieties that you are already confident will give a better yield in order to check that they really do so.

Suppose your prior guess is 50:50 and over some years you run 200 tests. 100 times the crop really does yield better, and most of those show up fine. 100 times the crop doesn't actually yield better, and 5% of those result in false positives. You end up with around 100 true positives and 5 false positives. A positive result really means something.

Fast forward 80 years and research has changed. You have high-throughput screening machines and can test 100,000 different molecules in your hunt for a new antibiotic. Suppose you have got lucky and there really is a new antibiotic in your combinatorial explosion of side chains. A p-value of 5% gives you 5,000 false positives. With any luck you don't get a false negative and your new antibiotic also makes it through the initial screen. Now you have 5001 +/- 70 positives. The probability that a given positive result is true is only 0.0002, or 0.02%. A positive result still means something important: you are searching for a needle in a haystack and you have discarded 95% of the hay. But there is still plenty of hay left, and 99.98% of the results are wrong.
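
Putting both scenarios side by side (a sketch reproducing the arithmetic above, with the comment's simplifying assumption that every real effect is detected):

    # Positive predictive value: probability that a positive result
    # reflects a real effect, assuming all real effects are detected.
    def ppv(n_true, n_null, alpha=0.05):
        true_pos = n_true            # every real effect shows up
        false_pos = n_null * alpha   # nulls that pass by chance
        return true_pos / (true_pos + false_pos)

    # 1930s field trials: 200 tests, half of them genuinely better crops.
    print(ppv(n_true=100, n_null=100))    # ~0.95

    # High-throughput screen: 100,000 molecules, one real antibiotic.
    print(ppv(n_true=1, n_null=99_999))   # ~0.0002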


You would have to use the binomial distribution (http://en.wikipedia.org/wiki/Binomial_distribution) to find the probability that 5% of them are wrong (assuming the studies are independent). I guess this probability will turn out to be very small. But the point of this article is not alpha (the confidence level), it is the bias of the researcher.
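
For instance (a sketch; the count of 1,000 papers is an arbitrary assumption for illustration):

    # If each of n papers is independently wrong with probability p = 0.05,
    # the number of wrong papers follows Binomial(n, p).
    from scipy.stats import binom

    n, p = 1000, 0.05

    print(binom.pmf(50, n, p))  # P(exactly 5% wrong) ~= 0.058
    print(binom.cdf(50, n, p))  # P(at most 5% wrong) ~= 0.53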


p-values are calculated in many different ways, not always using the binomial distribution (of the p-values I've calculated, very few have used the binomial distribution).


I was referring to the chance of 5% of the papers giving wrong results. Using the p-value as the chance of failure in each research paper, the binomial distribution can be used to find the probability that 5% of the papers are incorrect.


"I mean, say a field accepts results alpha < 0.05."

And then there are fields like climate science, where "very high confidence" means 10% probability of being wrong:

http://ipccinfo.com/



