## piling up ziggurats

Posted in Books, pictures, Statistics, Travel with tags , , , , , , on June 7, 2021 by xi'an

This semester. a group of Dauphine graduate students worked under my direction on simulation problems and resorted to using the Ziggurat method developed by George Marsaglia and Wai Wan Tsang, at about the time Devroye was completing his simulation bible. The algorithm covers the half-Normal density by 2², 2⁴, 2⁸, &tc., stripes, all but one rectangles and all with the same surface v. Generating uniformly from the tail strip means generating either uniformly from the rectangle part, x<r, or exactly from the Normal tail x>r, using a drifted exponential accept-reject. The choice between both does not require the surface of the rectangle but a single simulation y=vU/f(r). Furthermore, for the other rectangles, checking first that the first coordinate of the simulated point is less than the left boundary of the rectangle above avoids computing the density. This method is incredibly powerful, once the boundaries have been determined. With 2³² stripes, its efficiency is 99.3% acceptance rate. Compared with a fast algorithm by Ahrens & Dieter (1989), it is three times faster…

## amazing appendix

Posted in Books, Statistics, Travel, University life with tags , , , , , , , , , , , on February 13, 2018 by xi'an

In the first appendix of the 1995 Statistical Science paper of Besag, Green, Higdon and Mengersen, on MCMC, “Bayesian Computation and Stochastic Systems”, stands a fairly neat result I was not aware of (and which Arnaud Doucet, with his unrivalled knowledge of the literature!, pointed out to me in Oxford, avoiding me the tedium to try to prove it afresco!). I remember well reading a version of the paper in Fort Collins, Colorado, in 1993 (I think!) but nothing about this result.

It goes as follows: when running a Metropolis-within-Gibbs sampler for component x¹ of a collection of variates x¹,x²,…, thus aiming at simulating from the full conditional of x¹ given x⁻¹ by making a proposal q(x|x¹,x⁻¹), it is perfectly acceptable to use a proposal that depends on a parameter α (no surprise so far!) and to generate this parameter α anew at each iteration (still unsurprising as α can be taken as an auxiliary variable) and to have the distribution of this parameter α depending on the other variates x²,…, i.e., x⁻¹. This is the surprising part, as adding α as an auxiliary variable was messing up the update of x⁻¹. But the proof as found in the 1995 paper [page 35] does not require to consider α as such as it establishes global balance directly. (Or maybe still detailed balance when writing the whole Gibbs sampler as a cycle of Metropolis steps.) Terrific! And a whiff mysterious..!

## exam question

Posted in Kids, Statistics, University life with tags , , , , , , , , , on June 24, 2016 by xi'an

A question for my third year statistics exam that I borrowed from Cross Validated: no student even attempted to solve this question…!

And another one borrowed from the highly popular post on the random variable [almost] always smaller than its mean!

## point process-based Monte Carlo

Posted in Books, Kids, Statistics, University life with tags , , , , , on December 3, 2015 by xi'an

Clément Walter from Paris just pointed me to an arXived paper he had very recently gotten accepted for publication in Statistics and Computing. (Congrats!) Because his paper relates to nested sampling. And connects it with rare event simulation via interacting particle systems. And multilevel Monte Carlo. I had missed it when it came out on arXiv last December [as the title was unrelated with nested sampling if not Monte Carlo], but the paper brings fairly interesting new results about an ideal version of nested sampling that is

1. unbiased when using an infinite number of terms;
2. always better than the standard Monte Carlo estimator, variance-wise;
3. connected with an implicit marked Poisson process; and
4. enjoying a finite variance provided the quantity of interest has an 1+ε moment.

Of course, such results only hold for an ideal version and do not address the issue of the conditional simulations required by nested sampling. (Which has an impact on the computing time as the conditional simulation becomes more and more expensive as the likelihood value increases.) The explanation therein of the approximation of tail probabilities by a Poisson estimate makes the link with deterministic nested sampling much clearer to me. Point 2 above means that the nested sampling estimate always does better than the average of the likelihood values produced by an iid or MCMC simulation from the prior distribution. The paper also borrows from the debiasing approach of Rhee and Glynn (already used by the Russian roulette) to turn truncated versions of the nested sampling estimator into an unbiased estimator, with a limited impact on the variance of the estimator. Truncation is associated with the generation of a geometric stopping time which parameter needs to be optimised. Without a more detailed reading, I am somewhat lost as to this optimisation remains feasible in complex settings… The paper contains an illustration for a Pareto distribution where optimisation and calibration can be conducted quite far. It also re-analyses the Mexican hat example of Skilling (2006), showing that our stopping rule may induce bias.

## quantile functions: mileage may vary

Posted in Books, R, Statistics with tags , , , , , , on May 12, 2015 by xi'an

When experimenting with various quantiles functions in R, I was shocked [ok this is a bit excessive, let us say surprised] by how widely the execution times would vary. To the point of blaming a completely different feature of R. Borrowing from Charlie Geyer’s webpage on the topic of probability distributions in R, here is a table for some standard distributions: I ran

```u=runif(1e7)
system.time(x<-qcauchy(u))
```

choosing an arbitrary parameter whenever needed.

Distribution Function Time
Cauchy `qcauchy` 2.2
Chi-Square `qchisq` 43.8
Exponential `qexp` 0.95
F `qf` 34.2
Gamma `qgamma` 37.2
Logistic `qlogis` 1.7
Log Normal `qlnorm` 2.2
Normal `qnorm` 1.4
Student t `qt` 31.7
Uniform `qunif` 0.86
Weibull `qweibull` 2.9

Of course, it does not mean much in that all the slow distributions (except for Weibull) are parameterised. Nonetheless, that a chi-square inversion take 50 times longer than a uniform inversion remains puzzling as to why it is not coded more efficiently. In particular, I was wondering why the chi-square inversion was slower than the Gamma inversion. Rerunning both inversions showed that they are equivalent:

```> u=runif(1e7)
> system.time(x<-qgamma(u,sha=1.5))
utilisateur système écoulé
21.534 0.016 21.532
> system.time(x<-qchisq(u,df=3))
utilisateur système écoulé
21.372 0.008 21.361
```

Which also shows how variable system.time can be.