advanced Markov chain Monte Carlo methods
This book, Advanced Markov Chain Monte Carlo Methods: Learning from Past Samples, by Faming Liang, Chuanhai Liu, and Raymond Carroll, appeared last year and has been sitting on my desk all this time, patiently (?) waiting for a review. When I received it, I took a brief look at it (further than the cool cover!) and then decided I needed more than that to write a useful review! Here are my impressions on Advanced Markov Chain Monte Carlo Methods after a deeper read. (I have not read any other review in the main statistical journals so far.)
The title, Advanced Markov Chain Monte Carlo Methods, is a clear warning on the level of the book: “advanced”, it certainly is!!! By page 85, the general description of MCMC simulation methods is completed, including perfect sampling and reversible jump MCMC, and the authors engage into a detailed description of highly specialised topics of their choice: Auxiliary variables (Chap. 4), Population-based MCMC (Chap. 5), Dynamic weighting (Chap. 6), Stochastic approximation Monte Carlo (Chap. 7), and MCMC with adaptive proposals (Chap. 8). The book is clearly inspired by the numerous papers the authors have written in those area, especially Faming Liang. (The uneven distribution of the number of citations per year with peaks in 2000 and 2009 reflects this strong connection.) While the book attempts at broadening the spectrum by including introductory sections, and discussing other papers, it remains nonetheless that this centred focus of the book reduces its potential readership to graduate students and researchers who could directly work on the original papers. I would thus hesitate in teaching my graduate students from this book, given that they only attend a single course on Monte Carlo methods.
“Bayesian analysis for scientific inference does not end with posterior derivation and computation. It is thus critical for posterior distributions to have clear interpretation. For the sake of clarity, probability used in this book has a long-run frequency interpretation in repeated experiments.” Liang, Liu and Carroll, Advanced Markov Chain Monte Carlo Methods: Learning from Past Samples, p.4
The first chapter is an introduction to Bayesian inference and MCMC methods. It is necessarily very condensed (in 24 pages) but I do not see the point of the warning quoted above. Even though Monte Carlo methods customarily have a frequentist flavour, there is little motive for separating between the different interpretations of probability in a simulation book. (I am even uncertain it should be done for Bayesian statistics books.) This first chapter also includes a section on the ratio-of-uniforms method (which always intrigues me!), and a very short one on perfect sampling, which strikes me as being introduced too early within the book, since the authors have yet defined MCMC methods. The second chapter is about Gibbs sampling, “the most popular computational method for Bayesian inference”(p.27), moving rather quickly to advanced topics like parameter-expanded data augmentation and alternating subspace-spanning resampling, which reminded of the recent talk by Xiao-Li Meng in Paris. Chapter 3 motivates the Metropolis-Hastings algorithm as able to handle varying dimension problems, introducing regular versions of the algorithm as well the Langevin extension (but not the Hamiltonian) and Green’s (1995) reversible jump algorithm, before spending the third half of the chapter on a ChIP-chip data analysis of Wu, Liang and Tan (2009).
Chapter 4 covers auxiliary variable methods such as simulated annealing and tempering, slice sampling, and then turns to specific methods like Swendsen-Wang’s, Wolff’s, Møller’s, the exchange, and the double MH algorithms . The chapter concludes with a presentation of the recent “Monte Carlo MH” algorithm of Liang and Jin (2010, unpublished) that relies on a bridge sampling estimate for a ratio of intractable normalising constants. This is connected with the recent paper of Andrieu and Roberts (Annals of Statistics, 2009) [even though I am not sure the later applies to all versions of Monte Carlo MH, given that some versions use the inverse of an unbiased estimator of the ratio of the normalising constants]. Chapter 5 is about population-based MCMC methods, “where a population of Markov chains are run in parallel, each equipped with possibly different but related invariant distributions” (p.123). I think the recent (JRSS B) Read Paper by Andrieu, Doucet and Hollenstein would have fitted the goal quite naturally, although it may have come to the knowledge of the authors too late wrt the completion of the book. [However, it is surprising that neither Doucet nor Del Moral are ever mentioned throughout the book, given their corpus of work in the area, especially in sequential importance sampling, covered here and in the following chapter.] Chapter 6 is centred on dynamic weighting, a technique introduced by Wong and Liang (PNAS, 1997) and Liang (JASA, 2002) to overcome long exit time out of local maxima. I have always found the idea of replacing regular weights by random unbiased weights interesting, even though I never went all the way to the implementation of the principle. I also was puzzled by conditions like (6.8) about the existence of a function cx such that, for all measurable sets A,
difficult to fathom apart from constant functions. The end of the chapter covers a (recent) special dynamic weighting method of Liang and Cheon (2009) for handling unknown normalising constants. It appears to be more of a sequential importance sampling technique, hence relates to our population Monte Carlo construct, as well as Del Moral, Doucet and Jasra’s Sequential Monte Carlo samplers.
Chapter 7 is the largest chapter of the book, with more than one hundred pages, dedicated to stochastic approximation Monte Carlo (SAMC), following an algorithm introduced by the three authors, Liang, Liu and Carroll (JASA, 2007). This chapter also covers physics-inspired algorithms like multi-canonical Monte Carlo, ensemble sampling and the Wang-Landau algorithm (whose validation was recently addressed by Pierre Jacob and Robin Ryder), as SAMC is proposed as a specific way to implement the Wang-Landau algorithm. The coverage includes recent applications by the authors in Bayesian phylogeny analysis, Bayesian networks (cool graph on p.249!), and equally recent convergence results (whose 27 pages long level of detail may be excessive for the general readership, who could as well check the corresponding research paper). The final chapter (Chapter 8) is about adaptive proposal MCMC, recalling the results of Roberts and Rosenthal (Journal of Applied Probability, 2007) and Andrieu and Thoms (Statistics and Computing, 2008) on diminishing adaptation. The chapter also covers the recent paper of Holden, Hauge and Holden (Annals of Applied Probability, 2009), a paper I had not seen earlier but which seems “too good to be true” in that it brings hardly any constraint on the adaptive scheme… (The section on regeneration is missing some of the work in the area by Jim Hobert, Galin Jones and co-authors, like their 2002 Biometrika paper.)