
You could use a Bayesian model to combine prior information about the global distribution of medal rates with a particular country's data, giving a conclusion (a posterior) about that country. This explanation will be simple and empirical; ideally you'd fit the prior and all the country-level data jointly in one model.

Count data is nice to model with a Poisson distribution and a conjugate Gamma prior on its rate. We'll assume a lot of things about the distribution of medal rates and medals which probably aren't true in reality.

For the global prior, I eyeballed the medals-per-capita list for all countries: it has a mean of around 3E-7 and a standard deviation of about 5E-7. That corresponds very roughly to a Gamma(0.25, 1E6) prior. A Gamma with shape a and rate b has mean a/b and standard deviation sqrt(a)/b, so I just matched the numbers (Gamma(0.25, 1E6) has mean 2.5E-7 and standard deviation 5E-7).
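As a rough check on that matching step, here is a minimal sketch that inverts the two moment formulas (a = mean^2/sd^2, b = mean/sd^2). The function names are made up for illustration, and the inputs are just the eyeballed values from above.

```python
import math

def gamma_from_moments(mean, sd):
    """Return (shape a, rate b) of the Gamma whose mean and sd match the inputs."""
    a = (mean / sd) ** 2
    b = mean / sd ** 2
    return a, b

def gamma_moments(a, b):
    """Return (mean, sd) of a Gamma with shape a and rate b."""
    return a / b, math.sqrt(a) / b

# Exact match to the eyeballed mean 3E-7 and sd 5E-7.
exact_a, exact_b = gamma_from_moments(3e-7, 5e-7)
print(exact_a, exact_b)

# The rounded prior used in this comment.
rounded_mean, rounded_sd = gamma_moments(0.25, 1e6)
print(rounded_mean, rounded_sd)
```

The exact match comes out near Gamma(0.36, 1.2E6), so Gamma(0.25, 1E6) is a rounded version that keeps the standard deviation and slightly lowers the mean.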

For a country with N medals and population P, we model the medal count as Poisson with mean rate*P and update the Gamma(0.25, 1E6) prior on the medals-per-capita rate. The posterior distribution is, conveniently, Gamma(0.25+N, 1E6+P). (That's the conjugacy property.)
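The update is small enough to sketch in a few lines. This assumes the count is Poisson with mean rate*P, and the function names are hypothetical. The second function shows why the prior matters less for big countries: the posterior mean is a blend of the prior mean and the raw rate, with weight b/(b+P) on the prior.

```python
def update_gamma_poisson(a, b, medals, population):
    """Posterior (shape, rate) for a Gamma(a, b) prior on the per-capita rate
    and a Poisson(rate * population) medal count."""
    return a + medals, b + population

def blended_mean(a, b, medals, population):
    """Posterior mean written as a weighted average of prior mean and raw rate."""
    w = b / (b + population)          # weight on the prior mean
    prior_mean = a / b
    raw_rate = medals / population
    return w * prior_mean + (1 - w) * raw_rate

# Illustrative inputs: one medal, population 100,000, prior Gamma(0.25, 1E6).
post_a, post_b = update_gamma_poisson(0.25, 1e6, 1, 100_000)
direct = post_a / post_b
blended = blended_mean(0.25, 1e6, 1, 100_000)
print(post_a, post_b, direct, blended)
```

Both routes give the same number, since (a+N)/(b+P) is algebraically the same as the weighted average.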

So Grenada's medal rate, with N=1 and P=100,000, has posterior distribution Gamma(1.25, 1.1E6). This has mean 1.25/1.1E6 = 1.1E-6. Compare that to the raw medals-per-capita rate of 1E-5: if this model is to be believed, we should heavily discount Grenada's great performance. On the other hand, Australia's medal rate barely changes from 9.2E-7 to 9.7E-7 by using the prior, close to Grenada's Bayesian rate. So the model takes a larger population into account quite naturally.
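To reproduce the Grenada arithmetic, here is a small sketch using the same Gamma(0.25, 1E6) prior. The helper name is invented for illustration, and the posterior standard deviation is an extra that follows from the sqrt(a)/b formula rather than something computed above.

```python
import math

PRIOR_SHAPE, PRIOR_RATE = 0.25, 1e6   # the eyeballed global prior

def posterior_summary(medals, population, a=PRIOR_SHAPE, b=PRIOR_RATE):
    """Return (raw rate, posterior mean, posterior sd) of the per-capita medal rate."""
    shape, rate = a + medals, b + population
    return medals / population, shape / rate, math.sqrt(shape) / rate

raw, mean, sd = posterior_summary(medals=1, population=100_000)
print(f"raw {raw:.2e}, posterior mean {mean:.2e}, posterior sd {sd:.2e}")
print(f"shrinkage factor: {mean / raw:.3f}")
```

The posterior standard deviation is about as large as the posterior mean, so a single medal still leaves Grenada's underlying rate very uncertain under this model.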



