## ziggurat algorithm

Posted in Books, pictures, Statistics, University life on October 30, 2018 by xi'an

“A ziggurat (Akkadian: ziqqurat, D-stem of zaqāru “to build on a raised area”) is a type of massive stone structure built in ancient Mesopotamia. It has the form of a terraced compound of successively receding stories or levels.” Wikipedia

In a recent arXival, Jalalvand and Charsooghi revisit the ziggurat algorithm, which George Marsaglia introduced in 1963 and which simulates from a univariate distribution by finding horizontal strips that pile up on top of the target as in a ziggurat or a pyramid, hence the name. When finely tuned, the method is quite efficient, maybe because it designs an accept-reject move for each strip of the ziggurat rather than globally. For instance, versions constructed for a Normal target are more efficient [3½ times faster] than the Box-Muller algorithm. The generalisation found in the paper divides the target into strips of equal area, rather than dominating rectangular strips of equal area, which requires some work when the target density is non-standard. For targets with unbounded support or unbounded values, a function g transforming the tail into (0,1) has to be selected. A further constraint is that the inverse cdf of the transformed g(X) has to be known. And a large part of the paper examines several scenarios for simulating from the tail region. For unbounded densities, a similarly minute analysis is undertaken, again with requirements on the target like its algebraic order.

“…the result of division of a random integer by its range is a fixed-point number which unlike a floating-point number does not enjoy increased precision near 0. When such random numbers are used in the tail algorithm they cause premature termination of the tail and large gaps between produced random numbers near the termination point.”

The paper further discusses the correction of an error common to earlier ziggurat algorithms, due to the conversion from fixed-point to floating-point numbers, as indicated in the above quote. Although this had already been addressed by George Marsaglia in the early 1990s.
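To make the fixed-point issue concrete, here is a minimal sketch, assuming a 32-bit integer generator and an exponential tail simulated by inversion, X = -log(U). Both choices are illustrative assumptions of mine, not details taken from the paper.

```python
import math
import sys

BITS = 32  # assumed width of the random integer, for illustration only

# Dividing a random integer by its range gives multiples of 2**-BITS,
# so the smallest nonzero uniform is 2**-BITS.
smallest_fixed = 2.0 ** -BITS

# Largest exponential variate reachable by inversion, X = -log(U).
max_tail_fixed = -math.log(smallest_fixed)

# Gap between the two largest reachable variates (U equal to 1 and 2 units).
gap_fixed = math.log(2 * smallest_fixed) - math.log(smallest_fixed)

# A floating-point uniform can get much closer to zero
# (here the smallest positive normalised double).
max_tail_float = -math.log(sys.float_info.min)

print(f"fixed-point tail ends at {max_tail_fixed:.2f}, last gap {gap_fixed:.3f}")
print(f"floating-point tail could reach {max_tail_float:.1f}")
```

Under these assumptions the simulated tail stops at 32 log 2 and the last two reachable values are log 2 apart, which is the kind of premature termination and gap the quote describes.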

“Ziggurat algorithm has a high setup time, so it’s not suitable for applications that require variates with frequently changing shape parameters.”

When testing the algorithm against different methods (in STL and Boost), and different distributions, the new version is between two and seven times faster, except for the Exponential target where the original ziggurat algorithm performs better. Interestingly, the gains (and the computing time) increase with the degrees of freedom for the Gamma target, in relation with Devroye’s (1986) remark on the absence of uniformly bounded execution times for this distribution. Same thing for the Weibull variates, obviously. Reflecting the usually costly computation of cdfs and inverse cdfs on machines and software, the inverse cdf method is systematically left behind! In conclusion, a good Sunday morning read, if not of direct consequence for MCMC implementation, as warned by the authors.

## independent random sampling methods [book review]

Posted in Books, Statistics, University life on May 16, 2018 by xi'an

Last week, I had the pleasant surprise to receive a copy of this book in the mail. A book that I was not aware had been written or published (meaning that I was not involved in its review!). The three authors, Luca Martino, David Luengo, and Joaquín Míguez, of Independent Random Sampling Methods are from Madrid universities and I have read (and posted on) several of their papers on (population) Monte Carlo simulation in recent years. Including Luca’s survey of multiple try MCMC which was helpful in writing our own WIREs survey.

The book is a pedagogical coverage of most algorithms used to simulate independent samples from a given distribution, which of course overlaps with some of the techniques covered in more detail by [another] Luc, namely Luc Devroye’s Non-uniform random variate generation bible, often mentioned here (and studied in utmost detail by a dedicated reading group in Warwick). It includes a whole chapter on accept-reject methods, with in particular a section on Payne-Dagpunar’s band rejection I had not seen previously. And another entire chapter on ratio-of-uniforms techniques. On which the three authors had proposed generalisations [covered by the book], years before I attempted to go the same way, having completely forgotten reading their paper at the time… Or the much earlier 1991 paper by Jon Wakefield, Alan Gelfand and Adrian Smith!

The book also covers the “vertical density representation”, due to Troutt (1991), which consists in considering the density p(.) evaluated at the random variable X as a random variable itself, p(X). I remember pondering about this alternative to the cdf transform and giving up on it as the outcome has a distribution depending on p, even when the density is monotone. Even though I am not certain from reading the section that this is particularly appealing…

Given its title, the book contains very little about MCMC. Except for a final chapter that covers adaptive independent Metropolis-Hastings algorithms, in connection with some of the authors’ recent work. Like multiple try Metropolis. Relating to the (unidimensional) ARMS “ancestor” of adaptive MCMC methods. (As noted in a recent blog on Holden et al., 2009, I have trouble understanding how recycling only rejected proposed values to build a better proposal distribution is enough to guarantee convergence of an adaptive algorithm, but the book does not delve much into this convergence.)

All in all and with the bias induced by me working in the very area, I find the book quite a nice entry on the topic, which can be used in a Monte Carlo course at both undergraduate and graduate levels if one wants to avoid going into Markov chains. It is certainly less likely to scare students away than the comprehensive Non-uniform random variate generation and on the contrary may induce some of them to pursue a research career in this domain.

## certified randomness, 187m away…

Posted in Statistics on May 3, 2018 by xi'an

As it rarely happens with Nature, I just read an article that directly relates to my research interests, about a secure physical random number generator (RNG). By Peter Bierhorst and co-authors, mostly physicists apparently. Security here means that the outcome of the RNG is unpredictable. This very peculiar RNG is based on two correlated photons sent to two measuring stations, separated by at least 187m, which have to display unpredictable outcomes in order to respect the impossibility of faster-than-light communications, otherwise known as Bell inequalities. This is hardly practical though, especially when mentioning that the authors managed to produce 2¹⁰ random bits over 10 minutes, post processing “the measurement of 55 million photon pairs”. (I however fail to see why the two-arm apparatus would be needed for regular random generation as it seems relevant solely for the demonstration of randomness.) I also checked the associated supplementary material, which is mostly about proving some total variation bound, and constructing a Bell function. What is most puzzling in this paper (and the associated supplementary material) is the (apparent) lack of guarantee of uniformity of the RNG. For instance, a sentence (Supplementary Material, p.11) about a distribution being “within TV distance of uniform” hints at the method being not provably uniform, which makes the whole exercise incomprehensible…

## complexity of the von Neumann algorithm

Posted in Statistics on April 3, 2017 by xi'an

“Without the possibility of computing infimum and supremum of the density f over compact subintervals of the domain of f, sampling absolutely continuous distribution using the rejection method seems to be impossible in total generality.”

The von Neumann algorithm is another name for the rejection method introduced by von Neumann circa 1951. It was thus most exciting to spot a paper by Luc Devroye and Claude Gravel appearing in the latest Statistics and Computing. Assessing the method in terms of random bits and precision. Specifically, assuming that the only available random generator is one of random bits, which necessarily leads to an approximation when the target is a continuous density. The authors first propose a bisection algorithm for distributions defined on a compact interval, which compares random bits with recursive bisections of the unit interval and stops when the interval is small enough. In higher dimension, for densities f over the unit hypercube, they recall that the original algorithm consisted in simulating uniforms x and u over the hypercube and [0,1], using the uniform as the proposal distribution and comparing the density at x, f(x), with the rescaled uniform. When using only random bits, the proposed method is based on a quadtree that subdivides the unit hypercube into smaller and smaller hypercubes until the selected hypercube is entirely above or below the density. And is small enough for the desired precision. This obviously requires that computing upper and lower bounds of the density over the hypercubes be feasible, with Devroye and Gravel considering that this is a necessary property, as shown by the above quote. Densities with non-compact support can be re-expressed as densities on the unit hypercube thanks to the cdf transform. (Actually, this is equivalent to the general accept-reject algorithm, based on the associated proposal.)
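As a reminder of what the original (real-valued) version looks like, here is a minimal sketch of the rejection method on the unit interval with a uniform proposal. It assumes floating-point uniforms rather than the random-bit model of the paper, and the Beta(2,2) target, its bound 1.5 and the function names are my own illustrative choices.

```python
import random

def rejection_sample(density, bound, rng):
    """Draw one value from `density` on [0, 1], assuming density(x) <= bound there."""
    while True:
        x = rng.random()  # proposal: uniform on [0, 1]
        u = rng.random()  # auxiliary uniform on [0, 1]
        # Accept when the rescaled uniform falls under the density at x.
        if u * bound <= density(x):
            return x

def beta22(x):
    """Beta(2,2) density, bounded by 1.5 on [0, 1] (illustrative target)."""
    return 6.0 * x * (1.0 - x)

rng = random.Random(2017)
draws = [rejection_sample(beta22, 1.5, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
print(f"sample mean: {sample_mean:.3f}")
```

The sample mean should sit close to ½, the mean of the Beta(2,2) target. The random-bit version replaces the two uniforms with bit-by-bit refinements, which is where the bounds on the density over subintervals become necessary.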

“With the oracles introduced in our modification of von Neumann’s method, we believe that it is impossible to design a rejection algorithm for densities that are not Riemann-integrable, so the question of the design of a universally valid rejection algorithm under the random bit model remains open.”

In conclusion, I enjoyed very much reading this paper, especially the reflection it proposes on the connection between Riemann integrability and rejection algorithms. (Actually, I cannot think straight away of a simulation algorithm that would handle non-Riemann-integrable densities, apart from nested sampling. Or of significant non-Riemann-integrable densities.)

## ratio-of-uniforms [#4]

Posted in Books, pictures, R, Statistics, University life on December 2, 2016 by xi'an

Possibly the last post on random number generation by Kinderman and Monahan’s (1977) ratio-of-uniform method. After fiddling with the Gamma(a,1) distribution when a<1 for a while, I indeed figured out a way to produce a bounded set with this method: considering an arbitrary cdf Φ with corresponding pdf φ, the uniform distribution on the set Λ of (u,v)’s in R⁺xX such that

0≤u≤Φοƒ[φοΦ⁻¹(u)v]

induces the distribution with density proportional to ƒ on φοΦ⁻¹(U)V. This set Λ has a boundary that is parameterised as

u=Φοƒ(x),  v=x/φοƒ(x), x∈Χ

which remains bounded in u since Φ is a cdf and in v if φ has fat enough tails. At both 0 and ∞. When ƒ is the Gamma(a,1) density this can be achieved if φ behaves like log(x)² near zero and like an inverse power at infinity. Without getting into all the gory details, closed form density φ and cdf Φ can be constructed for all a’s, as shown for a=½ by the boundaries in u and v (yellow) below, which leads to a bounded associated set Λ.

At this stage, I remain uncertain of the relevance of such derivations, if only because the set Λ thus derived is ill-suited for uniform draws proposed on the enclosing square box. And also because a Gamma(a,1) simulation can rather simply be derived from a Gamma(a+1,1) simulation. But, who knows?!, there may be alternative usages of this representation, such as innovative slice samplers. Which means the ratio-of-uniform method may reappear on the ‘Og one of those days…
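For the record, the Gamma(a+1,1) shortcut is the standard identity that if Y is Gamma(a+1,1) and U is an independent uniform, then Y·U^(1/a) is Gamma(a,1). Here is a minimal sketch, relying on the standard library generator `random.gammavariate` for the Gamma(a+1,1) draw. The function name and the choice a=½ are mine, for illustration.

```python
import random

def gamma_small_shape(a, rng):
    """Draw from Gamma(a, 1) via a Gamma(a + 1, 1) draw and a uniform."""
    y = rng.gammavariate(a + 1.0, 1.0)  # shape a + 1, scale 1
    u = rng.random()                    # independent uniform on [0, 1)
    return y * u ** (1.0 / a)

rng = random.Random(2016)
a = 0.5
draws = [gamma_small_shape(a, rng) for _ in range(50000)]
sample_mean = sum(draws) / len(draws)
print(f"sample mean: {sample_mean:.3f} (target mean is a = {a})")
```

The transformation handles the unbounded density at zero for free, since the power U^(1/a) pushes mass towards the origin.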

## ratio-of-uniforms [#3]

Posted in Books, pictures, R, Statistics on November 4, 2016 by xi'an

Being still puzzled (!) by the ratio-of-uniform approach, mostly failing to catch its relevance for either standard distributions in an era when computing a cosine or an exponential is negligible, or non-standard distributions for which computing bounds and boundaries is out-of-reach, I kept searching for solutions that would include unbounded densities and still produce compact boxes, as this seems essential for accept-reject simulation if not for slice sampling. And after exploring some dead-ends (in tune with running in Venezia!), I came upon the case of the generalised logistic transform $h(\omega)=\omega^a/(1+\omega^a)$

which ensures that the [ratio-of-almost-uniform] set I defined in my slides last week $\mathfrak{H}=\left\{(u,v);\ 0\le u\le h(f(v/g(u)))\right\}$

is bounded in u. Since the transform g is the derivative of the inverse of h (!), $g(y)=a^{-1}y^{(1-a)/a}/(1-y)^{(1+a)/a}$

the parametrisation of the boundary of H is $u(x)=f(x)^a/(1+f(x)^a),\ v(x)=a^{-1}xf(x)^{1-a}(1+f(x)^a)^2$

which means it remains bounded if (a) a<1 [to ensure boundedness at infinity, where v(x) behaves like $xf(x)^{1-a}$] and (b) the limit of v(x) at zero [where I assume the asymptote stands] is bounded. Meaning $\lim_{x\to 0} xf(x)^{1+a}<\infty$

Working a wee bit more on the problem led me to realise that resorting to an arbitrary cdf Φ instead of the logistic cdf could solve the problem for most distributions, including all Gammas! Indeed, the boundary of H is now $u(x)=\Phi(f(x)^a),\ v(x)=a^{-1}xf(x)^{1-a}/\varphi(f(x)^a)$

which means it remains bounded if φ has very heavy tails, like 1/x². To handle the explosion when x=0. And an asymptote itself at zero, to handle the limit at infinity when f(x) goes to zero.

## ratio-of-uniforms

Posted in Books, pictures, R, Statistics on October 24, 2016 by xi'an

One approach to random number generation that had always intrigued me is Kinderman and Monahan’s (1977) ratio-of-uniform method. The method is based on the result that the uniform distribution on the set A of (u,v)’s in R⁺xX such that

0≤u²≤ƒ(v/u)

induces the distribution with density proportional to ƒ on V/U. Hence the name. The proof is straightforward and the result can be seen as a consequence of the fundamental lemma of simulation, namely that simulating from the uniform distribution on the set B of (w,x)’s in R⁺xX such that

0≤w≤ƒ(x)

induces the marginal distribution with density proportional to ƒ on X. There is no mathematical issue with this result, but I have difficulties with picturing the construction of efficient random number generators based on this principle. I thus took the opportunity of the second season of [the Warwick reading group on] Non-uniform random variate generation to look anew at this approach. (Note that the book is freely available on Luc Devroye’s website.) The first thing I considered is the shape of the set A. Which has nothing intuitive about it! Luc then mentions (p.195) that the boundary of A is given by

u(x)=√ƒ(x),v(x)=x√ƒ(x)

which then leads to bounding both ƒ and x→x²ƒ(x) to create a box around A and an accept-reject strategy, but I have trouble with this result without making further assumptions about ƒ… Using a two component normal mixture as a benchmark, I found bounds on u(.) and v(.) and simulated a large number of points within the box to end up with the above graph, which shows that the accepted (u,v)’s were indeed within this boundary. And the same holds with a more ambitious mixture:
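For readers who want to try the box-and-reject strategy themselves, here is a minimal sketch for the standard normal target rather than the mixtures used for the graphs. It takes ƒ(x)=exp(-x²/2), for which √ƒ is bounded by 1 and x√ƒ(x) by √(2/e) in absolute value. The function names are mine and this is not the code behind the pictures.

```python
import math
import random

def f(x):
    """Unnormalised standard normal density (illustrative target)."""
    return math.exp(-0.5 * x * x)

U_MAX = 1.0                         # sup of sqrt(f(x)), reached at x = 0
V_MAX = math.sqrt(2.0 / math.e)     # sup of |x| sqrt(f(x)), reached at x = ±sqrt(2)

def ratio_of_uniforms(rng):
    """Draw one value with density proportional to f as the ratio V/U."""
    while True:
        u = U_MAX * (1.0 - rng.random())      # uniform on (0, U_MAX], avoids u = 0
        v = V_MAX * (2.0 * rng.random() - 1)  # uniform on [-V_MAX, V_MAX)
        if u * u <= f(v / u):                 # (u, v) falls inside the set A
            return v / u

rng = random.Random(1977)
draws = [ratio_of_uniforms(rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
print(f"mean {sample_mean:.3f}, variance {sample_var:.3f}")
```

For this target the set A fills roughly 73% of the enclosing box, so the accept-reject step is reasonably cheap. The difficulty raised in the post is finding such a box for a less friendly ƒ.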