## More/less incriminating digits from the Iranian election

Following my previous post, where I commented on Roukema’s use of Benford’s Law on the first digits of the counts, I saw on Andrew Gelman’s blog a pointer to an article in the Washington Post, where the arguments are based instead on the last digit. Those should be uniform, rather than distributed from Benford’s Law. There is no doubt about the uniformity of the last digit, but the claim for “extreme unlikeliness” of the frequencies of those digits made in the article is not so convincing. Indeed, when I uniformly sampled 116 digits in {0,..,9}, my very first attempt produced a highest frequency of 20.5% and a lowest frequency of 5.9%. If I run a small Monte Carlo experiment with the following R program,

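```
fre=0                      # number of simulations meeting both constraints
for (t in 1:10^4){
# relative frequencies per histogram bin (unit-width bins, so density = frequency)
h=hist(sample(0:9,116,rep=T),plot=F)$density
fre=fre+(max(h)>.16)*(min(h)<.05)
}
fre/10^4                   # proportion of simulations where this happens
```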

the percentage of cases when this happens is 15%, so this is not “extremely unlikely” (unless I made a terrible blunder in the above!!!)… Even moving the constraint to

`(max(h)>.169)*(min(h)<.041)`

does not make the event very unlikely, since the estimated probability is then 0.0525.
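As a cross-check, here is a minimal sketch of the same experiment that counts each digit explicitly with `tabulate` rather than relying on histogram bins (with its default breaks, `hist` puts 0 and 1 in the same bin for such data). The helper name `digit_extremes` and its arguments are mine, introduced only for illustration, and the estimates it returns will vary with the seed and the number of simulations.

```r
# Hypothetical helper: estimate the probability that, among n_digits uniform
# digits, the most frequent digit exceeds `hi` and the least frequent is below `lo`.
digit_extremes <- function(n_digits = 116, n_sim = 1e4, hi = 0.16, lo = 0.05) {
  hits <- 0
  for (t in seq_len(n_sim)) {
    x <- sample(0:9, n_digits, replace = TRUE)
    # one bin per digit: shift 0..9 to 1..10 so that tabulate() can count them
    freq <- tabulate(x + 1, nbins = 10) / n_digits
    hits <- hits + (max(freq) > hi && min(freq) < lo)
  }
  hits / n_sim
}

set.seed(1)  # arbitrary seed, only for reproducibility
p_loose  <- digit_extremes()                        # thresholds .16 and .05
p_strict <- digit_extremes(hi = 0.169, lo = 0.041)  # thresholds .169 and .041
```

Because every digit gets its own bin here, the estimates need not agree with the figures obtained from the `hist`-based loop.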

The second argument looks at the proportion of last and second-to-last digits that are adjacent, i.e. with a difference of ±1 or ±9. Out of the 116 Iranian results, 62% are made of non-adjacent digits. If I sample two vectors of 116 digits in {0,..,9} and if I consider this occurrence, I do see an unlikely event. Running the Monte Carlo experiment

```
repa=NULL
for (t in 1:10^5){
dife=(sample(0:9,116,rep=T)-sample(0:9,116,rep=T))^2
repa[t]=sum((dife==1))+sum((dife==81))
}
repa=repa/116
```

shows that the distribution of repa is centered at .20 (as it should be, since for a given second-to-last digit there are two adjacent last digits), not .30 as indicated in the article, and that the probability of having a frequency of .38 or more of adjacent digits is estimated as zero by this Monte Carlo experiment. (Note that I took 0 and 9 to be adjacent and that removing this occurrence would further lower the probability.)
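The .20 figure can also be checked without simulation by enumerating the 100 equally likely pairs of digits. The sketch below does that and, assuming the 116 pairs are independent, uses the binomial tail to approximate the probability of a frequency of .38 or more. Translating ".38 or more" into a count of at least `ceiling(0.38 * 116)` is my reading of the threshold, and the variable names are illustrative.

```r
# All 100 ordered (second-to-last, last) digit pairs, each equally likely
pairs <- expand.grid(second_last = 0:9, last = 0:9)
d2 <- (pairs$second_last - pairs$last)^2

p_adj_wrap   <- mean(d2 == 1 | d2 == 81)  # 0 and 9 counted as adjacent: 20/100
p_adj_nowrap <- mean(d2 == 1)             # 0 and 9 not adjacent: 18/100

# Binomial tail P(X >= k_obs) for X ~ Bin(116, p_adj_wrap),
# assuming independence between the 116 results
k_obs <- ceiling(0.38 * 116)
tail_prob <- pbinom(k_obs - 1, size = 116, prob = p_adj_wrap, lower.tail = FALSE)
```

Under these assumptions `tail_prob` is tiny, which fits with the simulation never producing such a frequency in 10^5 replications.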

### 4 Responses to “More/less incriminating digits from the Iranian election”

1. Drew Says:

My post got garbled. It should of course read

for (t in 1:run)
fre <- fre + (max(out[,t])>(.169*k))*(min(out[,t])<(.041*k))

Sorry my French is not good enough so I had to post in English!

2. Drew Says:

Your simulation results are too high, I think, due to the hist function binning 0 and 1 together by default. For (max(h)>.16)*(min(h)<.05) […] (max(h)>.169)*(min(h)<.041) I got around 1.4% by sampling from a multinomial:

run <- 1e5
k <- 116
out <- rmultinom(n=run,size=k,prob=rep(1,10))
fre <- 0
for (t in 1:run)
fre <- fre + (max(out[,t])>(.169*k))*(min(out[,t])<(.041*k))
100*fre/run

• xi'an Says:

Thanks, will look at it! Xi’an

3. First Digits and Electoral Fraud in Iran « In the Dark Says:

[…] Apparently what started this off was a post on the ArXiv by the cosmologist Boudewijn Roukema, but I first heard about it myself via a pingback from another wordpress blog.  The same blogger has written a subsequent analysis here. […]
