## A discrete Bernoulli factory

Posted in Books, Kids, Statistics with tags , , , , , , on October 18, 2021 by xi'an

A rather confusing (and now closed) question on X validated contained an interesting challenge of simulating an arbitrary discrete distribution using a single (standard) dice. It indeed made me think of the (more challenging) Bernoulli factory problem of simulating B(f(p)) using a B(p) simulator (with p unknown). I still do not see what the optimal solution is but the core challenge is to avoid simulating U(0,1) variate by exploiting the discrete nature of the target. Which may be an issue if the probabilities of the target are irrational and one is considering the cdf inversion approach. An alternative is to use an accept-reject approach, which also works for discrete distributions, by first deriving an instrumental distribution on the discrete support of the target from dice rolls, second finding the maximum of the ratio instrument to target, and third devising a discrete approach to selecting a generation with a probability taking a finite number of values. Which may prove quite costly. Finally, the least debatable approach is to turn the dice into a Uniform generator by using each draw as a digit in the base 5 representation of this Uniform variate, up to the precision desired for the resolution, and then apply the most efficient algorithm for the target distribution.

## R rexp()

Posted in Books, R, Statistics with tags , , , , , , , on May 18, 2021 by xi'an

Following a question on X validated about the reasons for coding rexp() following Ahrens & Dieter (1972) version, I re-read Luc Devroye’s explanations. Which boils down to an optimised implementation of von Neumann’s Exponential generator. The central result is that, for any μ>0, M a Geometric variate with failure probability exp(-μ) and Z a positive Poisson variate with parameter μ

$\mu(M+\min(U_1,\ldots,U_Z))$

is distributed as an Exp(1) random variate. Meaning that for every scale μ, the integer part and the fractional part of an Exponential variate are independent, the former a Geometric. A refinement of the above consists in choosing

exp(-μ) =½

as the generation of M then consists in counting the number of $$0’s$$ before the first $$1$$ in the binary expansion of $$U∼U(0,1)$$. Actually the loop used in Ahrens & Dieter (1972) seems to be much less efficient than counting these 0’s

> benchmark("a"={u=runif(1)
while(u<.5){
u=2*u
F=F+log(2)}},
"b"={v=as.integer(rev(intToBits(2^31*runif(1))))
sum(cumprod(!v))},
"c"={sum(cumprod(sample(c(0,1),32,rep=T)))},
"g"={rgeom(1,prob=.5)},replications=1e4)
test elapsed relative user.self
1    a  32.92  557.966    32.885
2    b  0.123    2.085     0.122
3    c  0.113    1.915     0.106
4    g  0.059    1.000     0.058


Obviously, trying to code the change directly in R resulted in much worse performances than the resident rexp(), coded in C.

## warped Cauchys

Posted in Books, Kids, R, Statistics with tags , , , on May 4, 2021 by xi'an

A somewhat surprising request on X validated about the inverse cdf representation of a wrapped Cauchy distribution. I had not come across this distribution, but its density being

$f_{WC}(\theta;\gamma)=\sum_{n=-\infty}^\infty \frac{\gamma}{\pi(\gamma^2+(\theta+2\pi n)^2)}\mathbb I_{ -\pi<\theta<\pi}$

means that it is the superposition of shifted Cauchys on the unit circle (with nice complex representations). As such, it is easily simulated by re-shifting a Cauchy back to (-π,π), i.e. using the inverse transform

$\theta = [\gamma\tan(\pi U-\pi/2)+\pi]\ \text{mod}\,(2\pi) - \pi$

## simulating Maxwell distribution

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , , on April 22, 2021 by xi'an

A question that came out on X validated a few days ago is how to efficiently simulate from a distribution with density

x²φ(x).

(Obviously this density is already properly normalised since the second moment of the standard Normal  distribution is one.) The first solution that came out (by Jarle Tufto) exploits the fact that this density corresponds to a signed root of a χ²(3) variate. This is a very efficient proposal that requires a Gamma sampler and a random sign sampler. Since the cdf is available in closed form,

Φ(x)-xφ(x),

I ran a comparison with a numerical inversion, but this is much slower. I also tried an accept-reject version based on a Normal proposal with a larger variance, but even when optimising this variance, the running time was about twice as large. While checking Devroye (1986) for any possible if unlikely trick, I came upon this distribution twice (p.119 in an unsolved exercise, p.176 presented as the Maxwell distribution). With the remark that, if

X~x²φ(x),  then  Y=UX~φ(x).

Inverting this result leads to X being distributed as

sign(Y)√(Y²-2log(U)),

which recovers the original χ²(3) solution, if slightly (and mysteriously) increasing the simulation speed.

## laser sharp random number generator

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , on April 1, 2021 by xi'an

Caught the headline of Science News on a super-fast random number generator based on a dysfunctional laser! Producing “254 trillion random digits per second”.

“…when the laser is shined on a surface, its light contains a constantly changing pattern of tiny pinpricks that brighten and dim randomly. The brightness at each spot in the pattern over time can be translated by a computer into a random series of ones and zeros.”

I presume this is covered in the original Science paper [which I cannot access] but the parallel series of 0’s and 1’s should be checked to produce independent Bernoulli B(½) variates before being turned into a genuine random number generator.