## simulating a sum of Uniforms

Posted in Statistics with tags , , , , , , , , , , on May 1, 2020 by xi'an

When considering the distribution of the sum (or average) of N Uniform variates, called either Irwin-Hall for the sum or Bates for the average, simulating the N uniforms then adding them shows a linear cost in N. The density of the resulting variate is well-known,

$f_X(x;N)=\dfrac{1}{2(N-1)!}\sum_{k=0}^N (-1)^k{N \choose k} (x-k)^{N-1}\text{sign}(x-k)$

but similarly is of order N. Furthermore, controlling the terms in the alternating sum may prove delicate, as shown by the R function unifed::dirwin.hall() whose code

for (k in 0:floor(x)) ret1 <- ret1 + (-1)^k * choose(n, k) *
(x - k)^(n - 1)


quickly becomes unreliable (although I managed an easy fix by using logs and a reference value of the magnitude of the terms in the summation). There is however a quick solution provided by [of course!] Devroye (NURVG, Section XIV.3, p.708), using the fact that the characteristic function of the Irwin-Hall distribution [for Uniforms over (-1,1)] is quite straightforward

$\Phi_N(t) = [\sin(t)/t]^N$

which means the density can be bounded from above and results in an algorithm (NURVG, Section XIV.3, p.714) with complexity at most N to the power 5/8, if not clearly spelled out in the book. Obviously, it can be objected that for N large enough, like N=20, the difference between the true distribution and the CLT approximation is quite negligible (reminding me of my early simulating days where generating a Normal was done by averaging a dozen uniforms and properly rescaling!). But this is not an exact approach and the correction proves too costly. As shown by Section XIV.4 on the simulation of sums in NURVG. So… the game is afoot!

## another riddle

Posted in Books, Kids, R with tags , , , , , , on March 29, 2016 by xi'an

A very nice puzzle on The Riddler last week that kept me busy on train and plane rides, runs and even in between over the weekend. The core of the puzzle is about finding the optimal procedure to select k guesses about the value of a uniformly random integer x in {a,a+1,…,b}, given that each guess y produces the position of x respective to y (less, equal, or more). If y=x at one stage, the player wins x. Optimal being defined as maximising the expected gain. After some (and more) experimentation, I found that, when b-a is large enough [depending on k], the optimal guess at stage i is b-f(i) with f(k)=0 and f(i-1)=2f(i)+1. For the values given on The Riddler, a=1,b=1000,k=9, my solution is to first guess at y=1000-f(9)=255 and this produces a gain of 380.31 with a probability of winning of 0.510, which seems amazingly large, but not so much when considering that 2⁹ is close to 500. Continue reading