## ziggurat algorithm

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , , , , on October 30, 2018 by xi'an

A ziggurat (Akkadian: ziqqurat, D-stem of zaqāru “to build on a raised area”) is a type of massive stone structure built in ancient Mesopotamia. It has the form of a terraced compound of successively receding stories or levels. Wikipedia

In a recent arXival, Jalalvand and Charsooghi revisit the ziggurat algorithm that simulates from a univariate distribution by finding horizontal strips that pile up on top of the target as in a ziggurat or a pyramid, hence the name. Which George Marsaglia introduced in 1963. When finely tuned the method is quite efficient. Maybe because it designs an accept-reject move for each strip of the ziggurat rather than globally. For instance, versions constructed for a Normal target are more efficient [3½ times faster] than the Box-Muller algorithm. The generalisation found in the paper divides the target into strips of equal area, rather than dominating rectangular strips of equal area, which requires some work when the target density is non-standard. For targets with unbounded support or unbounded values, a function g transforming the tail into (0,1) has to be selected. A further constraint is that the inverse cdf of the transformed g(X) has to be known. And a large part of the paper examines several scenarii towards simulating from the tail region. For unbounded densities, a similarly minute analysis is undertaken, again with requests about the target like its algebraic order.

“…the result of division of a random integer by its range is a fixed-point number which unlike a floating-point number does not enjoy increased precision near 0. When such random numbers are used in the tail algorithm they cause premature termination of the tail and large gaps between produced random numbers near the termination point.”

The paper further discusses the correction of an error common to earlier ziggurat algorithms, due to the  conversion from fixed-point to floating-point numbers, as indicated in the above quote. Although this had already been addressed by George Marsaglia in the early 1990’s.

“Ziggurat algorithm has a high setup time, so it’s not suitable for applications that require variates with frequently changing shape parameters.”

When testing the algorithm against different methods (in STL and Boost), and different distributions, the gains are between two and seven times faster, except for the Exponential target where the original ziggurat algorithm performs better. Interestingly, the gains (and the computing time) increase with the degrees of freedom for the Gamma target, in relation with Devroye’s (1986) remark on the absence of uniformly bounded execution times for this distribution. Same thing for the Weibull variates, obviously. Reflecting upon the usually costly computation of cdfs and inverse cdfs on machines and software, the inverse cdf method is systematically left behind! In conclusion, a good Sunday morning read if not of direct consequences for MCMC implementation, as warned by the authors.

## RNG impact on MCMC [or lack thereof]

Posted in Books, R, Statistics, Travel, University life with tags , , , , , , , on July 13, 2017 by xi'an

Following the talk at MCM 2017 about the strange impact of the random generator on the outcome of an MCMC generator, I tried in Montréal airport the following code on the banana target of Haario et al. (1999), copied from Soetaert and Laine and using the MCMC function of the FME package:

library(FME)
Banana <- function (x1, x2) {
return(x2 - (x1^2+1)) }
pmultinorm <- function(vec, mean, Cov) {
diff <- vec - mean
ex <- -0.5*t(diff) %*% solve(Cov) %*% diff
rdet <- sqrt(det(Cov))
power <- -length(diff)*0.5
return((2.*pi)^power / rdet * exp(ex)) }
BananaSS <- function (p) {
P <- c(p[1], Banana(p[1], p[2]))
Cov <- matrix(nr = 2, data = c(1, 0.9, 0.9, 1))
N=1e3
ejd=matrix(0,4,N)
RNGkind("Mars")
for (t in 1:N){
MCMC <- modMCMC(f = BananaSS, p = c(0, 0.7),
jump = diag(nrow = 2, x = 5), niter = 1e3)
ejd[1,t]=mean((MCMC$pars[-1,2]-MCMC$pars[1,2])^2)}


since this divergence from the initial condition seemed to reflect the experiment of the speaker at MCM 2017. Unsurprisingly, no difference came from using the different RNGs in R (which may fail to contain those incriminated by the study)…

## Monty Python generator

Posted in Books, Kids, pictures, R, Statistics, University life with tags , , , , , , , on November 23, 2016 by xi'an

By some piece of luck I came across a paper by the late George Marsaglia, genial contributor to the field of simulation, and Wai Wan Tang, entitled The Monty Python method for generating random variables. As shown by the below illustration, the concept is to flip the piece H outside the rectangle back inside the rectangle, exploiting the remaining area above the density. The fantastic part is actually that “since areas of the rectangle add to 1, the slim in-between area is exactly the tail area”! So the tiny bit between G and the flipped H is the remaining tail.In the case of a Gamma Ga(a,1) variate, the authors express this variate as the transform of another variate with a nearly symmetry density, on which the Monty Python method applies. The transform is

$q(x)=(a-1/3)(1 + x/\sqrt{16a})^3$

with -√16a<x. The second nice trick is that the density of x is provided for free by the Gamma Ga(a,1) density and the transform, thanks to the change of variable formula. One lingering question is obviously how to handle the tail part. This is handled separately in the paper, with a rather involved algorithm, but since the area of the tail is tiny, a mere 1.2% in the case of the Gaussian density, this instance occurs rarely. Very clever if highly specialised! (The case of a<1 has to be processed by the indirect of multiplying a Ga(a+1,1) by a uniform variate to the power 1/a.)

I also found out that there exists a Monte Python software, which is an unrelated Monte Carlo code in python [hence the name] for cosmological inference. Including nested sampling, unsurprisingly.

## Le Monde sans puzzle

Posted in Books, Kids, Statistics, University life with tags , , , , , , , on June 17, 2014 by xi'an

This week, Le Monde mathematical puzzle: is purely geometric, hence inappropriate for an R resolution. In the Science & Médecine leaflet, there is however an interesting central page about random generators, from the multiple usages of those in daily life to the consequences of poor generators on cryptography and data safety. The article is compiling an interview of Jean-Paul Delahaye on the topic with recent illustrations from cybersecurity. One final section gets rather incomprehensible: when discussing the dangers of seed generation, it states that “a poor management of the entropy means that an hacker can saturate the seed and take over the original randomness, weakening the whole system”. I am sure there is something real behind the imagery, but this does not make sense… Another insert mentions a possible random generator built out of the light detectors on a smartphone. And quantum physics. The society IDQ can indeed produce ultra-rapid random generators that way. And it also ran randomness tests summarised here. Using in particular George Marsaglia’s diehard battery.

Another column report that a robot passed the Turing test last week, on Turing‘s death anniversary. Meaning that 33% of the jury was convinced the robot’s answers were given by a human. This reminded me of the Most Human Human book sent to me by my friends from BYU. (A marginalia found in Le Monde is that the test was organised by Kevin Warwick…from the University of Coventry, a funny reversal of the University of Warwick sitting in Coventry! However, checking on his website showed that he has and had no affiliation with this university, being at the University of Reading instead.)