Archive for MCMC

independent random sampling methods [book review]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on May 16, 2018 by xi'an

Last week, I had the pleasant surprise to receive a copy of this book in the mail. Book that I was not aware had been written or published (meaning that I was not involved in its review!). The three authors, Luca Martino, David Luengo, and Joaquín Míguez, of Independent Random Sampling Methods are from Madrid universities and I have read (and posted on) several of their papers on (population) Monte Carlo simulation in the recent years. Including Luca’s survey of multiple try MCMC which was helpful in writing our WIREs own survey.

The book is a pedagogical coverage of most algorithms used to simulate independent samples from a given distribution, which of course recoups some of the techniques exposed with more details by [another] Luc, namely Luc Devroye’s Non-uniform random variate generation bible, often mentioned here (and studied in uttermost details by a dedicated reading group in Warwick). It includes a whole chapter on accept-reject methods, with in particular a section on Payne-Dagpunar’s band rejection I had not seen previously. And another entire chapter on ratio-of-uniforms techniques. On which the three authors had proposed generalisations [covered by the book], years before I attempted to go the same way, having completely forgotten reading their paper at the time… Or the much earlier 1991 paper by Jon Wakefield, Alan Gelfand and Adrian Smith!

The book also covers the “vertical density representation”, due to Troutt (1991), which consists in considering the distribution of the density p(.) of the random variable X as a random variable, p(X). I remember pondering about this alternative to the cdf transform and giving up on it as the outcome has a distribution depending on p, even when the density is monotonous. Even though I am not certain from reading the section that this is particularly appealing…

Given its title, the book contains very little about MCMC. Except for a last and final chapter that covers adaptive independent Metropolis-Hastings algorithms, in connection with some of the authors’ recent work. Like multiple try Metropolis. Relating to the (unidimensional) ARMS “ancestor” of adaptive MCMC methods. (As noted in a recent blog on Holden et al., 2009 , I have trouble understanding how recycling only rejected proposed values to build a better proposal distribution is enough to guarantee convergence of an adaptive algorithm, but the book does not delve much into this convergence.)

All in all and with the bias induced by me working in the very area, I find the book quite a nice entry on the topic, which can be used in a Monte Carlo course at both undergraduate and graduate levels if one want to avoid going into Markov chains. It is certainly less likely to scare students away than the comprehensive Non-uniform random variate generation and on the opposite may induce some of them to pursue a research career in this domain.

accelerating MCMC

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , on April 11, 2018 by xi'an

As forecasted a rather long while ago (!), I wrote a short and incomplete survey on some approaches to accelerating MCMC. With the massive help of Victor Elvira (Lille), Nick Tawn (Warwick) and Changye Wu (Dauphine). Survey which current version just got arXived and which has now been accepted by WIREs Computational Statistics. The typology (and even the range of methods) adopted here is certainly mostly arbitrary, with suggestions for different divisions made by a very involved and helpful reviewer. While we achieved a quick conclusion to the review process, suggestions and comments are most welcome! Even if we cannot include every possible suggestion, just like those already made on X validated. (WIREs stands for Wiley Interdisciplinary Reviews and its dozen topics cover several fields, from computational stats to biology, to medicine, to engineering.)

MCMC with multiple tries

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , , on April 5, 2018 by xi'an

Earlier this year, Luca Martino wrote and arXived a review on multiple try MCMC. As its name suggests, the starting point of this algorithm is to propose N potential moves simultaneously instead of one, possibly according to N different proposal (conditional) densities, and to select one by a normalised importance sampling weight. The move is accepted by a Metropolis-Hastings step based on the ratio of the normalisation constants [at the current and at the one-before-current stages]. Besides the cost of computing the summation and generating the different variates, this method also faces the drawback of requiring N-1 supplementary simulations that are only used for achieving detailed balance and computing a backward summation of importance weights. (A first section of the review is dedicated to independent Metropolis-Hastings proposals, q(θ), which make life simpler, but are less realistic in my opinion since some prior knowledge or experimentation is necessary to build a relevant distribution q(θ).) An alternative covered in the survey is ensemble Monte Carlo (Neal, 2011), which produces a whole sample at each iteration, with target the product of the initial targets. This reminded me of our pinball sampler, which aimed at producing a spread-out sample while keeping the marginal correct. Although the motivation sounds closer to a particle sampler. Especially with this associated notion of an empirical approximation of the target. The next part of the review is about delayed rejection, which is a natural alternative approach to speeding up MCMC by considering several possibilities, if sequentially. Started in Antonietta Mira‘s 1999 PhD thesis. The difficulty with this approach is that the acceptance probability gets increasingly complex as the number of delays grows, which may annihilate its appeal relative to simultaneous multiple tries.

amazing appendix

Posted in Books, Statistics, Travel, University life with tags , , , , , , , , , , , on February 13, 2018 by xi'an

In the first appendix of the 1995 Statistical Science paper of Besag, Green, Higdon and Mengersen, on MCMC, “Bayesian Computation and Stochastic Systems”, stands a fairly neat result I was not aware of (and which Arnaud Doucet, with his unrivalled knowledge of the literature!, pointed out to me in Oxford, avoiding me the tedium to try to prove it afresco!). I remember well reading a version of the paper in Fort Collins, Colorado, in 1993 (I think!) but nothing about this result.

It goes as follows: when running a Metropolis-within-Gibbs sampler for component x¹ of a collection of variates x¹,x²,…, thus aiming at simulating from the full conditional of x¹ given x⁻¹ by making a proposal q(x|x¹,x⁻¹), it is perfectly acceptable to use a proposal that depends on a parameter α (no surprise so far!) and to generate this parameter α anew at each iteration (still unsurprising as α can be taken as an auxiliary variable) and to have the distribution of this parameter α depending on the other variates x²,…, i.e., x⁻¹. This is the surprising part, as adding α as an auxiliary variable was messing up the update of x⁻¹. But the proof as found in the 1995 paper [page 35] does not require to consider α as such as it establishes global balance directly. (Or maybe still detailed balance when writing the whole Gibbs sampler as a cycle of Metropolis steps.) Terrific! And a whiff mysterious..!

Better together in Kolkata [slides]

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , on January 4, 2018 by xi'an

Here are the slides of the talk on modularisation I am giving today at the PC Mahalanobis 125 Conference in Kolkata, mostly borrowed from Pierre’s talk at O’Bayes 2018 last month:

[which made me realise Slideshare has discontinued the option to update one’s presentation, forcing users to create a new presentation for each update!] Incidentally, the amphitheatre at ISI is located right on top of a geological exhibit room with a reconstituted Barapasaurus tagorei so I will figuratively ride a dinosaur during my talk!

two Parisian talks by Pierre Jacob in January

Posted in pictures, Statistics, University life with tags , , , , , , , , , on December 21, 2017 by xi'an

While back in Paris from Harvard in early January, Pierre Jacob will give two talks on works of his:

January 09, 10:30, séminaire d’Analyse-Probabilités, Université Paris-Dauphine: Unbiased MCMC

Markov chain Monte Carlo (MCMC) methods provide consistent approximations of integrals as the number of iterations goes to infinity. However, MCMC estimators are generally biased after any fixed number of iterations, which complicates both parallel computation and the construction of confidence intervals. We propose to remove this bias by using couplings of Markov chains and a telescopic sum argument, inspired by Glynn & Rhee (2014). The resulting unbiased estimators can be computed independently in parallel, and confidence intervals can be directly constructed from the Central Limit Theorem for i.i.d. variables. We provide practical couplings for important algorithms such as the Metropolis-Hastings and Gibbs samplers. We establish the theoretical validity of the proposed estimators, and study their variances and computational costs. In numerical experiments, including inference in hierarchical models, bimodal or high-dimensional target distributions, logistic regressions with the Pólya-Gamma Gibbs sampler and the Bayesian Lasso, we demonstrate the wide applicability of the proposed methodology as well as its limitations. Finally, we illustrate how the proposed estimators can approximate the “cut” distribution that arises in Bayesian inference for misspecified models.

January 11, 10:30, CREST-ENSAE, Paris-Saclay: Better together? Statistical learning in models made of modules [Warning: Paris-Saclay is not in Paris!]

In modern applications, statisticians are faced with integrating heterogeneous data modalities relevant for an inference or decision problem. It is convenient to use a graphical model to represent the statistical dependencies, via a set of connected “modules”, each relating to a specific data modality, and drawing on specific domain expertise in their development. In principle, given data, the conventional statistical update then allows for coherent uncertainty quantification and information propagation through and across the modules. However, misspecification of any module can contaminate the update of others. In various settings, particularly when certain modules are trusted more than others, practitioners have preferred to avoid learning with the full model in favor of “cut distributions”. In this talk, I will discuss why these modular approaches might be preferable to the full model in misspecified settings, and propose principled criteria to choose between modular and full-model approaches. The question is intertwined with computational difficulties associated with the cut distribution, and new approaches based on recently proposed unbiased MCMC methods will be described.

Long enough after the New Year festivities (if any) to be fully operational for them!

journée algorithmes stochastiques à Dauphine vendredi

Posted in Statistics, University life with tags , , , , , , , , , , on November 28, 2017 by xi'an

A final reminder (?) that we hold a special day of five talks around stochastic algorithms at Dauphine this Friday, from 10:00 till 17:30. Attendance is free, coffee and tea are free (while they last!), come and join us!