Looking out for number one (maths.org)
24 points by sanj on Sept 24, 2008 | 10 comments



Interesting article, and a phenomenon I hadn't thought about before. One thing the article didn't really focus on, which struck me as a more intuitive way of understanding the phenomenon, is that 1 is the first digit you hit when the order of magnitude increases (and 2 the next). So assuming, for example, that a particular sample rarely goes above a given number, say 250 for argument's sake, a large share of the results are likely to be between 100 and 199, majorly skewing the results. Even if the cap were 700, a first digit of 1 would be far more common than 8 or 9. At the next order of magnitude the same logic would apply.
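(A rough sketch of that point in Python: draw values uniformly up to a cap and tally leading digits. The caps of 250 and 700 are just the numbers used above for illustration.)

  import random
  from collections import Counter

  def leading_digit_share(cap, n=100000):
      # draw n values uniformly from 1..cap and tally the first digit
      counts = Counter(str(random.randint(1, cap))[0] for _ in range(n))
      return {d: round(counts[str(d)] / n, 3) for d in range(1, 10)}

  print(leading_digit_share(250))  # 1 dominates: 1, 10-19 and all of 100-199
  print(leading_digit_share(700))  # 1 still far ahead of 8 and 9, which need 800+/900+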


Benford's Law applies to distributions which vary over orders of magnitude and are locally flat in log-space. It obviously doesn't apply to adult male heights, which follow a roughly normal distribution with mean 69 and standard deviation 3 (in inches), so leading digits of 6 and 7 dominate. In log-space, this distribution is not "flat" over an order of magnitude; it's a steep "bump". However, incomes, urban populations, file sizes, and natural disasters (measured in dollar cost or fatalities) do follow such distributions, and hence we observe Benford's Law.

To a person with some statistical background, I usually explain Benford's Law in the context of town and city populations. Obviously, there's great variation in the populations of chartered cities, from as low as 10 to over 10 million. I note there is nothing "natural" about normal distributions other than the fact that they emerge from the addition of a large number of (finite-variance) variables, a process that approximately describes the determination of adult height from genes. Then consider the variables that determine a city's population. Being near water might increase the population by 50%. A strong economy over a decade might lead to 40% growth. Oppressive taxes might decrease the population by 20%. It doesn't matter what these numbers actually are; the point is that the population is the product rather than the sum of such variables, so we get an approximately normal distribution in log-space. Variation in log-population runs from as low as 1.0 to over 7.0, which means we can expect approximate flatness over [4.0, 5.0). Then approximately 30% of cities between 10,000 and 100,000 people will be between 10,000 and 20,000, since that range is [4.0, ~4.3) in log-space, i.e. the first 30% of the [4.0, 5.0) band.
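A minimal sketch of that picture, assuming log10(population) is roughly normal with mean 4.0 and standard deviation 1.0 (illustrative numbers, not fitted to any census data), compared against the Benford shares log10(1 + 1/d):

  import math, random
  from collections import Counter

  def leading_digit(x):
      # fractional part of log10 determines the leading digit
      return int(10 ** (math.log10(x) % 1))

  pops = [10 ** random.gauss(4.0, 1.0) for _ in range(200000)]
  counts = Counter(leading_digit(p) for p in pops)
  for d in range(1, 10):
      print(d, round(counts[d] / len(pops), 3), round(math.log10(1 + 1 / d), 3))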

I like the fixed-point/scale invariance explanation of Benford's Law better though, because it's more intuitive than the one I use. Still, it's not completely satisfying. It doesn't explain why Benford's Law applies to all of the distributions to which it applies, such as file sizes, urban populations (inches and dollars are purely arbitrary units, while numbers of people or bits are not) and fatality figures in natural disasters.


To be clear: are you saying denglish's rationale is incorrect? (I ask because it feels legit, but, alas, that doesn't mean it is.)


His rationale is not incorrect but incomplete.

Essentially, he's arguing that since the Benford distribution of leading digits is the sole fixed point under the scaling operation, it's the most natural distribution to expect in a large collection of measurements. Since units of measurement (e.g. dollars, meters, miles) represent arbitrary quantities, and the data set could be examined using literally any unit of measure (a unit of measure being a scaling operation, e.g. meters -> feet multiplies each datum by 3.28), a sufficiently large set of measured data (e.g. an almanac) can be expected to obey Benford's distribution.
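A quick sketch of that fixed-point idea: generate values whose leading digits already follow Benford (10**U with U uniform on [0, 1)), rescale by arbitrary conversion factors, and check that the leading-digit shares don't move. The particular factors below are just examples.

  import math, random
  from collections import Counter

  def leading_digit(x):
      return int(10 ** (math.log10(x) % 1))

  data = [10 ** random.random() for _ in range(200000)]  # Benford by construction

  for scale in (1.0, 3.28, 0.0254):   # e.g. identity, meters->feet, inches->meters
      counts = Counter(leading_digit(scale * x) for x in data)
      shares = [round(counts[d] / len(data), 3) for d in range(1, 10)]
      print(scale, shares)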

Benford's Law is also not true of specific distributions that are very tight. Consider IQ. That the mean is 100 is completely arbitrary, but the standard deviation of ~15% is not. Observed ratio IQs in healthy children are log-normal with a multiplicative standard deviation of 1.15-1.16; in other words, the 85th-percentile 6-year-old will have the cognitive maturity of an average 7-year-old, a fact that is independent of the unit of measure. (Adult "deviation IQs" are a different matter entirely, as they are "forced" to conform to a normal distribution, e.g. a person who scores in the 99.0th percentile will be "assigned" a z-score of 2.33, corresponding to an IQ of 135.) Obviously, with about 50% of IQs having a leading digit of 1 and almost none having a leading digit of 2 or 3, this is not a Benford distribution. You could use a different arbitrary scaling factor, setting the median to 50 instead of 100, but then leading digits of 5 and 6 would be overrepresented, with virtually no 1s or 2s. The issue, of course, is that normal IQs are very tightly distributed in log-space and don't span anywhere near an order of magnitude, so we will never get a Benford distribution no matter what scaling factor we choose.

The other problem with the OP's argument is that it doesn't apply to figures like fatalities in natural disasters or sizes of cities, neither of which involves an arbitrary unit, but both of which exhibit Benford-esque distributions, due to the multiplicative rather than additive combination of the variables involved. An additive combination (a sum) of a large number of variables (e.g. height from genes) exhibits a normal distribution, for which Benford's Law does not apply. However, a multiplicative combination (a product) of a large number of random variables will have a log-normal distribution, and if the variation of X is over many orders of magnitude, its distribution will be locally flat enough in log-space that Y - floor(Y), where Y = log10 X, is approximately a uniform choice out of [0, 1), leading to the Benford distribution.
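A sketch of the product-vs-sum point, with the individual growth factors (uniform between 0.5 and 2.0, thirty of them) chosen arbitrarily for illustration; the resulting product is log-normal, spreads over several orders of magnitude, and its leading digits come out close to Benford:

  import math, random
  from collections import Counter

  def leading_digit(x):
      return int(10 ** (math.log10(x) % 1))

  def product_sample():
      # multiply thirty random growth/shrink factors together
      x = 1.0
      for _ in range(30):
          x *= random.uniform(0.5, 2.0)
      return x

  counts = Counter(leading_digit(product_sample()) for _ in range(100000))
  for d in range(1, 10):
      print(d, round(counts[d] / 100000, 3), round(math.log10(1 + 1 / d), 3))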


Maybe I just don't understand why this is a big mystery. Suppose you have a range of numbers from 1 to some value between 1 and 100. If that value is picked randomly, you have roughly an 8/10 chance of including the teens, a 7/10 chance of including the twenties, and so on down to about a 1/10 chance of including the eighties, and an even lower chance of including the nineties. Then you choose a number from that range: what are the chances it begins with 1? Far greater than the chances it begins with 9, since numbers that begin with 1 are included about 8/10 of the time.

Does this make any sense? Is there something I've missed?
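A quick simulation of the setup described above, assuming both the upper bound and the number itself are picked uniformly at random (my reading of the setup, not necessarily the only one):

  import random
  from collections import Counter

  counts = Counter()
  trials = 200000
  for _ in range(trials):
      upper = random.randint(1, 100)          # random cutoff for the range
      n = random.randint(1, upper)            # random number from that range
      counts[str(n)[0]] += 1

  for d in "123456789":
      print(d, round(counts[d] / trials, 3))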


That doesn't have much to do with the age of the captain...


This is the kind of stuff that makes maths and science fascinating. This is why I want to go back to school to study maths and/or life sciences. But I have a lot of catching up to do, since the last science or math class I took was in high school...

What particularly struck me about the article is that I want to apply Benford's law to pi, see if it works, and if it does not, determine if there is variance at different lengths. Then try it on pi*r^2 ...


http://www.exploratorium.edu/pi/Pi10-6.html

Try it out! It's probably a five line script. (I'd like to, but I can't justify doing a fun programming problem that's not work right now.)
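One possible version of that script, using mpmath to generate the digits (the digit file from the linked page would do just as well):

  from collections import Counter
  from mpmath import mp

  mp.dps = 10000                                   # decimal digits of pi to generate
  digits = str(mp.pi).replace("3.", "", 1)[:10000] # drop the leading "3."
  freqs = Counter(digits)
  for d in "0123456789":
      print(d, round(freqs[d] / len(digits), 4))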


The digits of pi are conjectured to be "random" in the sense of being both uniformly distributed over {0, ..., 9} and possessing no n-ary serial correlations for any n, but as far as I know, this has not been proven.

Benford's Law only applies to leading digits, and only for numbers following a certain class of distributions.

It's important to note that the "Law" is not an innate mathematical property of anything, but an observed phenomenon with mathematical underpinnings. It doesn't apply exactly to any well-defined, real-world distribution ("physical constants" is not a well-defined distribution) but it applies approximately to a good number of them.

Benford's Law is exactly true on 10^X, where X is a random variable, chosen with uniform probability, from [0, 1). It's also true with X chosen from [0, n) for any integer n, because the leading digit of 10^X relies only on the non-integral part of X. When X is chosen from some other distribution that spans many orders of magnitude (say, N(4, 1)) the distribution of the fractional part (X - floor(X)) is approximately a uniform random variable from [0, 1), so Benford's Law is approximately true.
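A numeric check of those statements, comparing the exact Benford shares log10(1 + 1/d) against 10^X for X uniform on [0, 1) and for X drawn from N(4, 1):

  import math, random
  from collections import Counter

  def leading_digit(x):
      return int(10 ** (math.log10(x) % 1))

  def shares(samples):
      counts = Counter(leading_digit(s) for s in samples)
      return [round(counts[d] / len(samples), 3) for d in range(1, 10)]

  n = 200000
  print("benford ", [round(math.log10(1 + 1 / d), 3) for d in range(1, 10)])
  print("uniform ", shares([10 ** random.random() for _ in range(n)]))
  print("normal  ", shares([10 ** random.gauss(4, 1) for _ in range(n)]))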


"The digits of pi are conjectured to be "random" in the sense of being both uniformly distributed over {0, ..., 9} and possessing no n-ary serial correlations for any n, but as far as I know, this has not been proven."

This is why I find the concept interesting. I want to see the conjectures and theories /in person./ I love mathematical theory, in general, but my knowledge is very shallow. I more-or-less understand the properties of pi that you described, but I want to test them. I also want to test the properties of Benford's law to see how they work and how they can or can't be applied. This looks like the perfect opportunity to do exactly that.



