Versions of Benford’s Law

A new arXived note by Berger and Hill discusses how [my favourite probability introduction] Feller’s Introduction to Probability Theory (volume 2) gets Benford’s Law “wrong”. While my interest in Benford’s Law is rather superficial, I find the paper of interest as it shows a confusion between different folk theorems! My interpretation of Benford’s Law is that the first significant digit of a random variable (in a base 10 representation) is distributed as

$f(i) \propto \log_{10}(1+\frac{1}{i})$

and not that $\log_{10}(X) \,(\text{mod}\,1)$ is uniform, which is the presentation given in the arXived note… The former is also the interpretation of William Feller (page 63, Introduction to Probability Theory), contrary to what the arXived note seems to imply on page 2, but Feller indeed mentioned, as an informal/heuristic argument in favour of Benford’s Law, that when the spread of the rv X is large, $\log_{10}(X)$ is approximately uniformly distributed. (I would not call this a “fundamental flaw”.) The arXived note is then right in pointing out the lack of foundation for Feller’s heuristic, while muddling the issue by defining several non-equivalent versions of Benford’s Law. It is also funny that this arXived note picks at the scale-invariant characterisation of Benford’s Law when Terry Tao’s entry represents it as a special case of Haar measure!
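To make the link between the two versions concrete: if $U=\log_{10}(X) \,(\text{mod}\,1)$ is uniform on $[0,1)$, the first digit is $\lfloor 10^U\rfloor$ and $P(D=d)=\log_{10}(d+1)-\log_{10}(d)=\log_{10}(1+\frac{1}{d})$, so the uniform version implies the first-digit version, while the converse need not hold since the first-digit law says nothing about later digits. The following is only a minimal simulation sketch of that implication; the function names, the exponent range and the sample size are my own illustrative choices and are not taken from the note or from Feller.

```python
import math
import random
from collections import Counter


def benford_pmf(d):
    """First-digit Benford probability log10(1 + 1/d), for d in 1..9."""
    return math.log10(1.0 + 1.0 / d)


def first_digit(x):
    """First significant base-10 digit of a positive float."""
    # Scientific notation puts the leading digit first, e.g. '3.14...e-03'.
    return int(f"{x:.17e}"[0])


def first_digit_freqs(samples):
    """Empirical frequencies of the first digit over a list of positive floats."""
    counts = Counter(first_digit(x) for x in samples)
    n = len(samples)
    return {d: counts.get(d, 0) / n for d in range(1, 10)}


def sample_uniform_log(n, seed=0):
    """Draw X = 10**(K + U), with U uniform on [0, 1) and K an arbitrary integer.

    By construction log10(X) mod 1 is (up to rounding) the uniform draw U;
    the integer exponent range -3..3 is an arbitrary illustrative choice.
    """
    rng = random.Random(seed)
    return [10.0 ** (rng.randint(-3, 3) + rng.random()) for _ in range(n)]


if __name__ == "__main__":
    freqs = first_digit_freqs(sample_uniform_log(100_000))
    for d in range(1, 10):
        print(d, round(freqs[d], 4), round(benford_pmf(d), 4))
```

Running the script prints, for each digit, the simulated frequency next to $\log_{10}(1+\frac{1}{d})$; the two columns should agree up to Monte Carlo error.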

5 Responses to “Versions of Benford’s Law”

1. […] the “vague and mistakenly mystical sense of universality” the paper concludes with. (Zipf’s law, which pops up at each controversial election, appears in the list of well-supported fits, but I […]

2. Paulo C. Marques F. Says:

And your copy still has the dustjacket…

That’s the sign of an old love…

• This isn’t my copy! When I bought it (circa 1988), the bookstores no longer supplied a jacket…

3. I too love Feller’s book and find it extremely hard to believe that he would be “wrong” about something as simple as Benford’s Law.

Feller does, however, have one buffoonish thing in his book (I can’t remember whether it’s in volume 1 or 2, and my copies are an ocean away), which is an ill-informed and mocking dismissal of Bayesian inference. I can’t imagine that Feller ever thought much about that particular topic, and I imagine he was influenced by some know-nothing colleague. I found it pretty sad to see Feller not only make a mistake, but do it in such a smug way. A lesson for us all, I suppose: if the great Feller could make a fool of himself in this way, so can we.

• Andrew: I had the same reaction as yours and this is why I immediately went to check the book. At the bottom of page 63, there is indeed this hand-waving argument about the uniform log-transform for justifying the common occurrence of (the correct) Benford’s Law on the first significant digit… So Feller got his intuition somewhat wrong, not his definition. An interesting mathematical question remains though, which is why the law is so common in data collections. There is a kind of weak “central limit” theorem linked to the product of arbitrary rv’s but this is not satisfactory.
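As a rough illustration of this product effect, the sketch below compares the first-digit frequencies of a single Uniform(0,1] draw with those of a product of ten independent such draws. The choice of uniform factors, the number of factors and the sample size are assumptions made purely for illustration; this is a simulation, not a proof, and says nothing about why real data collections behave this way.

```python
import math
import random
from collections import Counter


def benford_pmf(d):
    """First-digit Benford probability log10(1 + 1/d), for d in 1..9."""
    return math.log10(1.0 + 1.0 / d)


def first_digit(x):
    """First significant base-10 digit of a positive float."""
    # Scientific notation puts the leading digit first, e.g. '3.14...e-03'.
    return int(f"{x:.17e}"[0])


def product_sample(k, n, seed=0):
    """n draws of a product of k independent Uniform(0, 1] variables."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the product is never zero.
    return [math.prod(1.0 - rng.random() for _ in range(k)) for _ in range(n)]


def max_benford_gap(samples):
    """Largest absolute gap between empirical and Benford first-digit frequencies."""
    counts = Counter(first_digit(x) for x in samples)
    n = len(samples)
    return max(abs(counts.get(d, 0) / n - benford_pmf(d)) for d in range(1, 10))


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        gap = max_benford_gap(product_sample(k, 50_000))
        print(f"k={k:2d}  max gap to Benford = {gap:.4f}")
```

With a single factor the first digit is uniform over 1 to 9, far from Benford; as the number of factors grows, the gap should shrink towards the Monte Carlo noise level.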