## LMS Invited Lecture Series and CRISM Summer School in Computational Statistics, just started!

Posted in Books, Kids, pictures, Statistics, Travel, University life on July 9, 2018 by xi'an

## optimal Bernoulli factory

Posted in Statistics on January 17, 2017 by xi'an

One of the last arXivals of the year was this paper by Luis Mendo on an optimal algorithm for Bernoulli factory (or Lovász’s or yet Basu’s) problems, i.e., for producing an unbiased estimate of f(p), 0<p<1, from an unrestricted number of Bernoulli trials with probability p of heads. (See, e.g., Mark Huber’s recent book for background.) This paper drove me to read an older 1999 document by Wästlund, unpublished because of the overlap with Keane and O’Brien (1994). One interesting gem in this document is that Wästlund produces a Bernoulli factory for the function f(p)=√p, which is not of considerable interest per se, but which was proposed to me as a puzzle by Professor Sinha, based on his 1979 paper with P.K. Banerjee, during my visit to the Department of Statistics at the University of Calcutta. The algorithm is based on a stopping rule N: throw a fair coin until the number of heads, n+1, first exceeds the number of tails, n. The event N=2n+1 occurs with probability

$\frac{1}{n+1}{2n \choose n} \big/ 2^{2n+1}$

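[Using a biased coin with probability p to simulate a fair coin is straightforward.] Then flip the original coin n+1 times and produce a result of 1 if at least one toss gives heads. This happens with probability √p, since the probability of n+1 tails, averaged over the distribution of N, is $\sum_{n\ge 0} P(N=2n+1)(1-p)^{n+1} = 1-\sqrt{p}$.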
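As a minimal sketch of the √p factory just described (an illustration, not Wästlund's or Mendo's own code), the Python below simulates it. The names `p_coin`, `fair_coin` and `sqrt_factory` are made up for the example, and the fair coin is obtained through von Neumann's pairing trick. One design choice is an assumption of this sketch rather than part of the description: the tosses of the original coin are interleaved with the fair tosses, one p-toss per fair head, returning 1 at the first p-head. Since the number of fair heads only grows until it reaches its final value n+1, this should leave the output distribution unchanged while avoiding the wait for the stopping time N, whose expectation is infinite.

```python
import random


def p_coin(p, rng):
    """One toss of the original coin: heads (True) with probability p."""
    return rng.random() < p


def fair_coin(p, rng):
    """Von Neumann's trick: toss the p-coin in pairs until the two tosses
    differ, then report the first toss of that pair (a fair bit)."""
    while True:
        first, second = p_coin(p, rng), p_coin(p, rng)
        if first != second:
            return first


def sqrt_factory(p, rng):
    """Return True with probability sqrt(p), using only tosses of the p-coin.

    Illustrative sketch: fair tosses continue until heads outnumber tails;
    each fair head triggers one toss of the original coin, so the p-tosses
    are revealed lazily instead of all at once after the walk has stopped.
    """
    heads = tails = 0
    while heads <= tails:
        if fair_coin(p, rng):
            heads += 1
            # One of the (eventual) n+1 tosses of the original coin.
            if p_coin(p, rng):
                return True  # at least one head: output 1
        else:
            tails += 1
    # The walk stopped with heads = n+1 and every p-toss was tails.
    return False
```

With a shared `rng = random.Random(0)`, the average of `sqrt_factory(0.25, rng)` over a few thousand calls should be close to √0.25 = 0.5.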

Mendo generalises Wästlund’s algorithm to functions expressed as a power series in (1-p)

$f(p)=1-\sum_{i=1}^\infty c_i(1-p)^i$

with the sum of the weights being equal to one. This means proceeding through Bernoulli B(p) generations until one realisation is one or a probability

$c_i\Big/\Big(1-\sum_{j=1}^{i-1}c_j\Big)$

event occurs [which can be derived from a Bernoulli B(p) sequence]. Furthermore, this version achieves asymptotic optimality in the number of tosses, thanks to a form of Cramér-Rao lower bound. (Which makes yet another connection with Kolkata!)
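The series construction above can be sketched in a few lines of Python, under stated assumptions: the names `series_factory` and `sqrt_coefficients` are hypothetical, the weights are taken to be nonnegative, and the auxiliary event of probability $c_i\big/\big(1-\sum_{j<i}c_j\big)$ is drawn from an ordinary pseudo-random uniform instead of being derived from the B(p) sequence as in the paper. At step i the coin is tossed: heads returns 1, otherwise the auxiliary event returns 0, so that 0 is returned at step i with probability $c_i(1-p)^i$ and 1 is returned overall with probability f(p).

```python
import itertools
import random


def sqrt_coefficients():
    """Yield c_1, c_2, ... such that 1 - sqrt(1 - x) = sum_i c_i x^i,
    i.e. the weights for f(p) = sqrt(p); they are positive and sum to one."""
    c = 0.5
    for i in itertools.count(1):
        yield c
        c *= (2 * i - 1) / (2 * i + 2)


def series_factory(p, coefficients, rng):
    """Return True with probability f(p) = 1 - sum_i c_i (1 - p)^i.

    Illustrative sketch: `coefficients` is any iterable of nonnegative
    weights summing to one; the auxiliary event uses rng.random() rather
    than extra coin tosses (a simplification, not the paper's construction).
    """
    remaining = 1.0  # 1 - sum_{j<i} c_j, the weight not yet used up
    for c in coefficients:
        if rng.random() < p:  # toss of the original coin gives heads
            return True
        # Auxiliary event with probability c_i / (1 - sum_{j<i} c_j);
        # the first test also guards against rounding in `remaining`.
        if remaining <= c or rng.random() < c / remaining:
            return False
        remaining -= c
    return False  # only reached if a finite weight list is exhausted
```

Calling `series_factory(p, sqrt_coefficients(), rng)` gives another √p factory, while a finite list such as `[0.5, 0.5]` targets f(p) = 1 - (1-p)/2 - (1-p)²/2.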

## perfect sampling, just perfect!

Posted in Books, Statistics, University life on January 19, 2016 by xi'an

Great news! Mark Huber (whom I’ve known for many years, so this review may not be completely objective!) has just written a book on perfect simulation! I remember (and still share) the excitement of the MCMC community when the first perfect simulation papers of Propp and Wilson (1995) came up on the (now deceased) MCMC preprint server, as it seemed then the ideal (perfect!) answer to critics of the MCMC methodology, plugging MCMC algorithms into a generic algorithm that eliminated burn-in, warmup, and convergence issues… It seemed both magical, with the simplest argument: “start at T=-∞ to reach stationarity at T=0”, and esoteric (“why forward fails while backward works?!”), requiring simple random walk examples (and a java app by Jeff Rosenthal) to understand the difference (between backward and forward), as well as Wilfrid Kendall’s kids’ coloured wood cubes and his layer of leaves falling on the ground and seen from below… These were exciting years, with MCMC still in its infancy, and no goal seemed too far away! Now that the years have gone by, and that the excitement has clearly died away, perfect sampling can be considered in a more sedate manner, with pros and cons well-understood. This is why Mark Huber’s book is coming at a perfect time if any! It covers the evolution of the perfect sampling techniques, from the early coupling from the past to the monotone versions, to the coalescence principles, with applications to spatial processes, to the variations on nested sampling and their use in doubly intractable distributions, with forays into the (fabulous) Bernoulli factory problem (a surprise for me, as Bernoulli factories are connected with unbiasedness, not stationarity! Even though my only fieldwork [with Randal Douc] in such factories was addressing a way to turn MCMC into importance sampling. The key is in the notion of approximate densities, introduced in Section 2.6.)
The book is quite thorough with the probabilistic foundations of the different principles, with even “a [tiny weeny] little bit of measure theory.”

Any imperfection?! Rather, only a (short, too short!) reflection on the limitations of perfect sampling, namely that it cannot cover the simulation of posterior distributions in the Bayesian processing of most statistical models. Which makes the quote

“Distributions where the label of a node only depends on immediate neighbors, and where there is a chance of being able to ignore the neighbors are the most easily handled by perfect simulation protocols (…) Statistical models in particular tend to fall into this category, as they often do not wish to restrict the outcome too severely, instead giving the data a chance to show where the model is incomplete or incorrect.” (p.223)

just surprising, given the very small percentage of statistical models which can be handled by perfect sampling. And the downsizing of perfect-sampling-related papers in the early 2000s. Which also makes the final and short section on the future of perfect sampling somewhat restricted in its scope.

So, great indeed! A close-to-perfect entry to a decade of work on perfect sampling. If you have not heard of the concept before, consider yourself lucky to be offered such gentle guidance into it. If you have dabbled with perfect sampling before, reading the book will be like meeting old friends and hearing about their latest deeds. More formally, Mark Huber’s book should bring you a new perspective on the topic. (As for me, I had never thought of connecting perfect sampling with accept-reject algorithms.)