## perfect sampling, just perfect!

Posted in Books, Statistics, University life with tags , , , , , , , , on January 19, 2016 by xi'an

Great news! Mark Huber (whom I’ve know for many years, so this review may be not completely objective!) has just written a book on perfect simulation! I remember (and still share) the excitement of the MCMC community when the first perfect simulation papers of Propp and Wilson (1995) came up on the (now deceased) MCMC preprint server, as it seemed then the ideal (perfect!) answer to critics of the MCMC methodology, plugging MCMC algorithms into a generic algorithm that eliminating burnin, warmup, and convergence issues… It seemed both magical, with the simplest argument: “start at T=-∞ to reach stationarity at T=0”, and esoteric (“why forward fails while backward works?!”), requiring simple random walk examples (and a java app by Jeff Rosenthal) to understand the difference (between backward and forward), as well as Wilfrid Kendall’s kids’ coloured wood cubes and his layer of leaves falling on the ground and seen from below… These were exciting years, with MCMC still in its infancy, and no goal seemed too far away! Now that years have gone, and that the excitement has clearly died away, perfect sampling can be considered in a more sedate manner, with pros and cons well-understood. This is why Mark Huber’s book is coming at a perfect time if any! It covers the evolution of the perfect sampling techniques, from the early coupling from the past to the monotonous versions, to the coalescence principles, with applications to spatial processes, to the variations on nested sampling and their use in doubly intractable distributions, with forays into the (fabulous) Bernoulli factory problem (a surprise for me, as Bernoulli factories are connected with unbiasedness, not stationarity! Even though my only fieldwork [with Randal Douc] in such factories was addressing a way to turn MCMC into importance sampling. The key is in the notion of approximate densities, introduced in Section 2.6.). The book is quite thorough with the probabilistic foundations of the different principles, with even “a [tiny weeny] little bit of measure theory.

Any imperfection?! Rather, only a (short too short!) reflection on the limitations of perfect sampling, namely that it cannot cover the simulation of posterior distributions in the Bayesian processing of most statistical models. Which makes the quote

“Distributions where the label of a node only depends on immediate neighbors, and where there is a chance of being able to ignore the neighbors are the most easily handled by perfect simulation protocols (…) Statistical models in particular tend to fall into this category, as they often do not wish to restrict the outcome too severely, instead giving the data a chance to show where the model is incomplete or incorrect.” (p.223)

just surprising, given the very small percentage of statistical models which can be handled by perfect sampling. And the downsizing of perfect sampling related papers in the early 2000’s. Which also makes the final and short section on the future of perfect sampling somewhat restricted in its scope.

So, great indeed!, a close to perfect entry to a decade of work on perfect sampling. If you have not heard of the concept before, consider yourself lucky to be offered such a gentle guidance into it. If you have dabbled with perfect sampling before, reading the book will be like meeting old friends and hearing about their latest deeds. More formally, Mark Huber’s book should bring you a new perspective on the topic. (As for me, I had never thought of connecting perfect sampling with accept reject algorithms.)

## Handbook of Markov chain Monte Carlo

Posted in Books, R, Statistics, University life with tags , , , , , , , , , , , , , , on September 22, 2011 by xi'an

At JSM, John Kimmel gave me a copy of the Handbook of Markov chain Monte Carlo, as I had not (yet?!) received it. This handbook is edited by Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng, all first-class jedis of the MCMC galaxy. I had not had a chance to get a look at the book until now as Jean-Michel Marin took it home for me from Miami, but, as he remarked in giving it back to me last week, the outcome truly is excellent! Of course, authors and editors being friends of mine, the reader may worry about the objectivity of this assessment; however the quality of the contents is clearly there and the book appears as a worthy successor to the tremendous Markov chain Monte Carlo in Practice by Wally Gilks, Sylvia Richardson and David Spiegelhalter. (I can attest to the involvement of the editors from the many rounds of reviews we exchanged about our MCMC history chapter!) The style of the chapters is rather homogeneous and there are a few R codes here and there. So, while I will still stick to our Monte Carlo Statistical Methods book for teaching MCMC to my graduate students next month, I think the book can well be used at a teaching level as well as a reference on the state-of-the-art MCMC technology. Continue reading

## Another Bernoulli factory

Posted in R, Statistics with tags , , on February 14, 2011 by xi'an

The paper “Exact sampling for intractable probability distributions via a Bernoulli factory” by James Flegal and Radu Herbei got posted on arXiv without me noticing, presumably because it came out just between Larry Brown’s conference in Philadelphia and my skiing vacations! I became aware of it only yesterday and find it quite interesting in that it links the Bernoulli factory method I discussed a while ago and my ultimate perfect sampling paper with Jim Hobert. In this 2004 paper in Annals of Applied Probability, we got a representation of the stationary distribution of a Markov chain as

$\sum_{n=1}^{\infty} p_n Q_n(dx)$

where

$p_n = \mathbb{P}(\tau\ge n)\qquad\text{and}\qquad Q_n(A)=\mathbb{P}(X_n\in A|\tau\ge n),$

the stopping time τ being the first occurrence of a renewal event in the split chain. While $Q_n$ is reasonably easy to simulate by rejection (even tohugh it may prove lengthy when n is large, simulating from the tail distribution of the stopping time is much harder. Continue reading

## Monte Carlo Statistical Methods third edition

Posted in Books, R, Statistics, University life with tags , , , , , , , , , , , , , on September 23, 2010 by xi'an

Last week, George Casella and I worked around the clock on starting the third edition of Monte Carlo Statistical Methods by detailing the changes to make and designing the new table of contents. The new edition will not see a revolution in the presentation of the material but rather a more mature perspective on what matters most in statistical simulation: