## More/less incriminating digits from the Iranian election

Posted in Statistics on June 21, 2009 by xi'an

Following my previous post, where I commented on Roukema’s use of Benford’s Law on the first digits of the counts, I saw on Andrew Gelman’s blog a pointer to a paper in the Washington Post, where the arguments are based instead on the last digit. Those should be uniform rather than distributed according to Benford’s Law. There is no doubt about the uniformity of the last digit, but the claim of “extreme unlikeliness” for the frequencies of those digits made in the paper is not so convincing. Indeed, when I uniformly sampled 116 digits in {0,...,9}, my very first attempt produced a highest frequency of 20.5% and a lowest of 5.9%. If I run a small Monte Carlo experiment with the following R program,

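```
fre=0
for (t in 1:10^4){
  # bin densities of 116 uniform digits, using the default breaks of hist()
  # (these equal the bin frequencies when the bins have width 1)
  h=hist(sample(0:9,116,replace=TRUE),plot=FALSE)$density
  # count the runs where the largest value exceeds .16 and the smallest is below .05
  fre=fre+(max(h)>.16)*(min(h)<.05)
}
fre/10^4 # proportion of runs where both conditions hold
```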

the percentage of cases when this happens is 15%, so this is not “extremely unlikely” (unless I made a terrible blunder in the above!!!)… Even moving the constraint to

`(max(h)>.169)*(min(h)<.041)`

does not produce a very small probability, since the estimate is then 0.0525.
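As a check against the "terrible blunder" worry, here is a minimal sketch of the same experiment that counts each digit explicitly with `tabulate()` rather than going through `hist()`. The motivation is an assumption worth verifying: the cells of `hist()` are right-closed by default and the lowest break is included, so if the default breaks fall on the integers 0, 1, ..., 9, the digits 0 and 1 end up in the same cell. The names `n_sim`, `hits` and `freq` and the seed are illustrative choices, not part of the original program.

```
# Sketch: count each of the ten digits explicitly instead of using hist() bins.
set.seed(1)                       # arbitrary seed, only for reproducibility
n_sim <- 10^4
hits <- 0
for (t in 1:n_sim) {
  x <- sample(0:9, 116, replace = TRUE)
  # tabulate() counts the values 1..nbins, so shift the digits 0..9 to 1..10
  freq <- tabulate(x + 1, nbins = 10) / 116
  # same joint event as above: largest frequency > .16 and smallest < .05
  hits <- hits + (max(freq) > .16) * (min(freq) < .05)
}
hits / n_sim                      # estimated probability of the joint event
```

If the two versions give noticeably different estimates, the binning is the first thing to look at.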

The second argument looks at the proportion of last and second-to-last digits that are adjacent, i.e. with a difference of ±1 or ±9. Out of the 116 Iranian results, 62% are made of non-adjacent digits, hence 38% of adjacent ones. If I sample two vectors of 116 digits in {0,...,9} and look at this occurrence, I do see an unlikely event. Running the Monte Carlo experiment

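```
repa=numeric(10^5)
for (t in 1:10^5){
  # squared differences between simulated last and second-to-last digits
  dife=(sample(0:9,116,replace=TRUE)-sample(0:9,116,replace=TRUE))^2
  # adjacent pairs: difference of ±1 (square 1) or ±9 (square 81, i.e. 0 next to 9)
  repa[t]=sum(dife==1)+sum(dife==81)
}
repa=repa/116 # proportion of adjacent pairs in each replication
mean(repa>=.38) # Monte Carlo estimate of the tail probability
```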

shows that the distribution of repa is centered at .20 (as it should be, since for a given second-to-last digit there are two adjacent last digits), not .30 as indicated in the paper, and that the probability of having a frequency of .38 or more of adjacent digits is estimated as zero by this Monte Carlo experiment. (Note that I took 0 and 9 to be adjacent and that removing this occurrence would further lower the probability.)
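A possible cross-check that does not rely on simulation: if the two digits are independent and uniform, each of the 116 pairs is adjacent with probability 2/10 when 0 and 9 count as neighbours, and 18/100 when they do not, so the number of adjacent pairs is binomial and the tail can be computed exactly with `pbinom()`. This is only a sketch under that independence assumption, and the variable names are illustrative.

```
# Exact binomial tail for the number of adjacent pairs among 116,
# assuming independent uniform digits (an assumption of this sketch).
n <- 116
k <- ceiling(0.38 * n)    # smallest count giving a frequency of at least .38
p_circ <- 2 / 10          # 0 and 9 treated as adjacent
p_lin  <- 18 / 100        # 0 and 9 not treated as adjacent
# P(X >= k) = P(X > k - 1)
tail_circ <- pbinom(k - 1, n, p_circ, lower.tail = FALSE)
tail_lin  <- pbinom(k - 1, n, p_lin,  lower.tail = FALSE)
c(circular = tail_circ, linear = tail_lin)
```

Both tails come out below 1/10^5, which would explain why 10^5 replications return an estimate of zero, and the second is the smaller of the two, in line with the remark about 0 and 9.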