## another riddle

Posted in Books, Kids, R with tags , , , , , , on March 29, 2016 by xi'an

A very nice puzzle on The Riddler last week that kept me busy on train and plane rides, runs and even in between over the weekend. The core of the puzzle is about finding the optimal procedure to select k guesses about the value of a uniformly random integer x in {a,a+1,…,b}, given that each guess y produces the position of x respective to y (less, equal, or more). If y=x at one stage, the player wins x. Optimal being defined as maximising the expected gain. After some (and more) experimentation, I found that, when b-a is large enough [depending on k], the optimal guess at stage i is b-f(i) with f(k)=0 and f(i-1)=2f(i)+1. For the values given on The Riddler, a=1,b=1000,k=9, my solution is to first guess at y=1000-f(9)=255 and this produces a gain of 380.31 with a probability of winning of 0.510, which seems amazingly large, but not so much when considering that 2⁹ is close to 500. Continue reading

## the last digit of e

Posted in Kids, Mountains, pictures, Statistics, Travel, University life with tags , , , , , , , on March 3, 2016 by xi'an

Éric Marchand from Sherbrooke, Québec [historical birthplace of MCMC, since Adrian Smith gave his first talk on his Gibbs sampler there, in June 1989], noticed my recent posts about the approximation of e by Monte Carlo methods and sent me a paper he wrote in The Mathematical Gazette of November 1995 [full MCMC era!] about original proofs on the expectation of some stopping rules being e, like the length of increasing runs. And Gnedenko’s uniform summation until exceeding one. Amazing that this simple problem generated so much investigation!!!

## Гнеде́нко and Forsythe [and e]

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , , , on February 16, 2016 by xi'an

In the wake of my earlier post on the Monte Carlo estimation of e and e⁻¹, after a discussion with my colleague Murray Pollock (Warwick) Gnedenko’s solution, he pointed out another (early) Monte Carlo approximation called Forsythe’s method. That is detailed quite neatly in Luc Devroye’s bible, Non-uniform random variate generation (a free bible!). The idea is to run a sequence of uniform generations until the sequence goes up, i.e., until the latest uniform is larger than the previous one. The expectation of the corresponding stopping rule, N, which is the number of generations the uniform sequence went down continuously is then e, while the probability that N is odd is e⁻¹, most unexpectedly! Forsythe’s method actually aims at a much broader goal, namely simulating from any density of the form g(x) exp{-F(x)}, F taking values in (0,1). This method generalises von Neuman’s exponential generator (see Devroye, p.126) which only requires uniform generations.

This is absolutely identical to Gnedenko’s approach in that both events have the same 1/n! probability to occur [as pointed out by Gérard Letac in a comment on the previous entry]. (I certainly cannot say whether or not one of the authors was aware of the other’s result: Forsythe generalised von Neumann‘s method around 1972, while Gnedenko published Theory of Probability at least in 1969, but this may be the date of the English translation, I have not been able to find the reference on the Russian wikipedia page…) Running a small R experiment to compare both distributions of N, the above barplot shows that they look quite similar:

n=1e6
use=runif(n)
# Gnedenko's in action:
gest=NULL
i=1
while (i<(n-100)){
sumuse=cumsum(use[i:(i+10)])
if (sumuse[11]<1])
sumuse=cumsum(use[i:(i+100)])
j=min((1:length(sumuse))[sumuse>1])
gest=c(gest,j)
i=i+j}
#Forsythe's method:
fest=NULL
i=1
while (i<(n-100)){
sumuse=c(-1,diff(use[i:(i+10)]))
if (max(sumuse)<0])
sumuse=c(-1,diff(use[i:(i+100)]))
j=min((1:length(sumuse))[sumuse>0])
fest=c(fest,j)
i=i+j}

And the execution times of both approaches [with this rudimentary R code!] are quite close.

## The answer is e, what was the question?!

Posted in Books, R, Statistics with tags , , , , , on February 12, 2016 by xi'an

A rather exotic question on X validated: since π can be approximated by random sampling over a unit square, is there an equivalent for approximating e? This is an interesting question, as, indeed, why not focus on e rather than π after all?! But very quickly the very artificiality of the problem comes back to hit one in one’s face… With no restriction, it is straightforward to think of a Monte Carlo average that converges to e as the number of simulations grows to infinity. However, such methods like Poisson and normal simulations require some complex functions like sine, cosine, or exponential… But then someone came up with a connection to the great Russian probabilist Gnedenko, who gave as an exercise that the average number of uniforms one needs to add to exceed 1 is exactly e, because it writes as

$\sum_{n=0}^\infty\frac{1}{n!}=e$

(The result was later detailed in the American Statistician as an introductory simulation exercise akin to Buffon’s needle.) This is a brilliant solution as it does not involve anything but a standard uniform generator. I do not think it relates in any close way to the generation from a Poisson process with parameter λ=1 where the probability to exceed one in one step is e⁻¹, hence deriving  a Geometric variable from this process leads to an unbiased estimator of e as well. As an aside, W. Huber proposed the following elegantly concise line of R code to implement an approximation of e:

1/mean(n*diff(sort(runif(n+1))) > 1)

Hard to beat, isn’t it?! (Although it is more exactly a Monte Carlo approximation of

$\left(1-\frac{1}{n}\right)^n$

which adds a further level of approximation to the solution….)