More/less incriminating digits from the Iranian election

Following my previous post, where I commented on Roukema’s use of Benford’s Law on the first digits of the counts, I saw on Andrew Gelman’s blog a pointer to a paper in the Washington Post, where the arguments are based instead on the last digit. Those should be uniform, rather than distributed according to Benford’s Law. There is no doubt about the uniformity of the last digit, but the claim of “extreme unlikeliness” made in the paper for the frequencies of those digits is not so convincing. Indeed, when I uniformly sampled 116 digits in {0,..,9}, my very first attempt produced a highest frequency of 20.5% and a lowest of 5.9%. If I run a small Monte Carlo experiment with the following R program,

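fre=0
for (t in 1:10^4){
   # bin densities from hist() with its default breaks
   h=hist(sample(0:9,116,replace=TRUE),plot=FALSE)$density
   # count runs where the largest value exceeds .16 and the smallest is below .05
   fre=fre+(max(h)>.16)*(min(h)<.05)
   }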

the percentage of cases when this happens is 15%, so this is not “extremely unlikely” (unless I made a terrible blunder in the above!!!)… Even moving the constraint to

(max(h)>.169)*(min(h)<.041)

does not produce a very unlikely probability, since it is then 0.0525.
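As a cross-check on these figures, here is a minimal sketch that counts each digit in its own cell with tabulate() instead of going through hist(), whose default breaks can merge neighbouring digits (the point raised in the second comment below, where about 6.2% and 1.4% are reported with a multinomial sampler). The function name extreme_freq_prob and its arguments are illustrative assumptions, not part of the original program.

```r
# Hypothetical helper (not from the original post): estimates the probability
# that, among n_digits uniform digits in 0..9, the most frequent digit has a
# relative frequency above `hi` while the least frequent one is below `lo`.
extreme_freq_prob <- function(n_sim = 1e4, n_digits = 116, hi = 0.16, lo = 0.05) {
  hits <- 0
  for (t in seq_len(n_sim)) {
    digits <- sample(0:9, n_digits, replace = TRUE)
    # tabulate() on digits + 1 returns one count per digit 0..9, zeros included
    freq <- tabulate(digits + 1, nbins = 10) / n_digits
    hits <- hits + (max(freq) > hi && min(freq) < lo)
  }
  hits / n_sim
}

set.seed(1)
print(extreme_freq_prob())                        # first pair of thresholds
print(extreme_freq_prob(hi = 0.169, lo = 0.041))  # stricter thresholds
```

Counting with tabulate() keeps digits that never occur as explicit zeros, which matters for the min() part of the condition.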

The second argument looks at the proportion of last and second-to-last digits that are adjacent, i.e. with a difference of ±1 or ±9. Out of the 116 Iranian results, 62% are made of non-adjacent digits, which leaves 38% adjacent. If I sample two vectors of 116 digits in {0,..,9} and look at how often this happens, I do see an unlikely event. Running the Monte Carlo experiment

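repa=numeric(10^5)
for (t in 1:10^5){
    # squared difference between two independent vectors of 116 digits
    dife=(sample(0:9,116,replace=TRUE)-sample(0:9,116,replace=TRUE))^2
    # adjacent pairs: difference of ±1 (square 1) or ±9 (square 81)
    repa[t]=sum(dife==1)+sum(dife==81)
    }
# turn counts into proportions of adjacent pairs
repa=repa/116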

shows that the distribution of repa is centered at .20 (as it should be, since for a given second-to-last digit there are two adjacent last digits out of ten), not .30 as indicated in the paper, and that the probability of a frequency of .38 or more of adjacent digits is estimated as zero by this Monte Carlo experiment. (Note that I took 0 and 9 to be adjacent and that removing this occurrence would further lower the probability.)
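The .20 centre can also be checked without simulation. The sketch below enumerates the 100 equally likely digit pairs and then, assuming the 116 pairs are independent, uses the binomial tail for the chance of a frequency of .38 or more. The variable names are illustrative, and the threshold takes the rounded .38 at face value.

```r
# Enumerate all 100 equally likely (second-to-last, last) digit pairs.
pairs <- expand.grid(second_last = 0:9, last = 0:9)
d <- abs(pairs$second_last - pairs$last)

# Proportion of adjacent pairs, with and without treating 0 and 9 as adjacent.
p_adjacent_wrap   <- mean(d == 1 | d == 9)
p_adjacent_nowrap <- mean(d == 1)

# Under independence, the number of adjacent pairs among n is Binomial(n, p).
n <- 116
k_min <- ceiling(0.38 * n)  # smallest count giving a frequency of at least .38
tail_prob <- pbinom(k_min - 1, size = n, prob = p_adjacent_wrap,
                    lower.tail = FALSE)

print(c(wrap = p_adjacent_wrap, nowrap = p_adjacent_nowrap))
print(tail_prob)
```

The enumeration gives 20 adjacent pairs out of 100 when 0 and 9 are adjacent and 18 out of 100 when they are not, which is why dropping the 0/9 case can only lower the probability.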

4 Responses to “More/less incriminating digits from the Iranian election”

  1. My post got garbled. It should of course read

    for (t in 1:run)
      fre <- fre + (max(out[,t])>(.169*k))*(min(out[,t])<(.041*k))

    I typically get about 6.2% to your 15% and 1.4% to your 5.25%.

    Sorry my French is not good enough so I had to post in English!

  2. I am tardy in reading about the analyses of the Iranian election. Belated thanks for your posts and the links you provided.

    Your simulation results are too high, I think, due to the hist function binning 0 and 1 together by default. For (max(h)>.16)*(min(h)<.05) I got around 6.2%, and for (max(h)>.169)*(min(h)<.041) I got around 1.4%, by sampling from a multinomial:

    run <- 1e5
    k <- 116
    out <- rmultinom(n=run,size=k,prob=rep(1,10))
    fre <- 0
    for (t in 1:run)
      fre <- fre + (max(out[,t])>(.169*k))*(min(out[,t])<(.041*k))
    100*fre/run

  3. […] Apparently what started this off was a post on the ArXiv by the cosmologist Boudewijn Roukema, but I first heard about it myself via a pingback from another wordpress blog.  The same blogger has written a subsequent analysis here. […]
