As I was checking the recent stat postings on arXiv, I noticed the paper by Chen and Xie entitled "Inference in Kingman's coalescent with pMCMC". (And, surprisingly, deposited in the machine learning subdomain.) The authors compare a pMCMC implementation for Kingman's coalescent with importance sampling (à la Stephens & Donnelly), regular MCMC, and SMC. The specifics of their pMCMC algorithm are that they simulate the coalescent times conditional on the tree structure and the tree structure conditional on the coalescent times (via SMC). The results reported in the paper consider up to five loci and agree with earlier experiments showing poor performances of MCMC algorithms (based on the LAMARC software and apparently using independent proposals). They report similar performances for importance sampling and pMCMC. While I find this application of pMCMC interesting, I wonder at the generality of the approach: when I was introduced to ABC techniques, the motivation was that importance sampling deteriorated very quickly with the number of parameters. Here it seems the authors only considered one parameter θ. I wonder what happens when the number of parameters increases, and how pMCMC would then compare with ABC.
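For readers unfamiliar with the pMCMC principle, here is a bare-bones pseudo-marginal sketch, definitely not Chen and Xie's coalescent sampler: the SMC likelihood estimate is replaced by a toy unbiased estimate for a Normal(θ,1) sample, so the script runs on its own.

```python
import numpy as np

# Bare-bones pseudo-marginal / particle marginal Metropolis-Hastings sketch.
# NOT Chen and Xie's coalescent sampler: the SMC likelihood estimate is
# replaced by a toy unbiased estimate for a Normal(theta, 1) sample.

rng = np.random.default_rng(0)
data = rng.normal(1.5, 1.0, size=50)            # toy data set

def loglik_hat(theta, noise_sd=0.3):
    """Stand-in for an unbiased SMC likelihood estimate: exact log-likelihood
    (up to a constant) plus noise; the -noise_sd**2/2 shift keeps the
    likelihood estimate exp(loglik_hat) unbiased."""
    exact = -0.5 * np.sum((data - theta) ** 2)
    return exact + noise_sd * rng.standard_normal() - 0.5 * noise_sd ** 2

def pmmh(theta0=0.0, n_iter=5000, prop_sd=0.3):
    theta, ll = theta0, loglik_hat(theta0)
    chain = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + prop_sd * rng.standard_normal()   # random-walk move
        ll_prop = loglik_hat(prop)
        # flat prior on theta: accept with the *estimated* likelihood ratio
        if np.log(rng.uniform()) < ll_prop - ll:
            theta, ll = prop, ll_prop
        chain[t] = theta
    return chain

print(pmmh()[1000:].mean())   # should be close to the sample mean of data
```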
Today my student Xiaolin Cheng presented the mythical 1990 JASA paper of Alan Gelfand and Adrian Smith, Sampling-based approaches to calculating marginal densities. The very one that started the MCMC revolution of the 1990s! Re-reading it through his eyes was quite enlightening, even though he stuck quite closely to the paper. (To the point of not running his own simulation, nor even reporting Gelfand and Smith's, as shown by the slides below. This would have helped, I think…)
Indeed, those slides focus very much on the idea that such substitution samplers can provide parametric approximations to the marginal densities of the components of the simulated parameters. To the point of resorting to importance sampling as an alternative to the standard Rao-Blackwell estimate, a solution that did not survive long. (We briefly discussed this point during the seminar, as the importance function was itself based on a Rao-Blackwell estimate, with possible tail issues. Gelfand and Smith actually conclude on the higher efficiency of the Gibbs sampler.) Maybe not so surprisingly, the approximation of the "other" marginal, namely the marginal likelihood, is not covered, as it is much more involved (and would lead to the introduction of the infamous harmonic mean estimator a few years later! And to Chib's (1995) approximation, which is very close in spirit to the Gibbs sampler). While Xiaolin never mentioned Markov chains in his talk, Gelfand and Smith only report that Gibbs sampling is a Markovian scheme and refer to both Geman and Geman (1984) and Tanner and Wong (1987) for convergence issues, rather than directly invoking Markov arguments as in Tierney (1994) and others. A fact that I find quite interesting, a posteriori, as it highlights the strong impact Meyn and Tweedie would have, three years later.
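To make the Rao-Blackwell marginal density estimate concrete, here is a tiny sketch on a toy bivariate normal Gibbs sampler (my own example, not one taken from the paper or the slides): the marginal density of one component is estimated by averaging the full conditional densities over the Gibbs output.

```python
import numpy as np
from scipy.stats import norm

# Toy illustration of the Rao-Blackwell marginal density estimate from Gibbs
# output: bivariate normal target with standard normal marginals, correlation rho.

rho, n_iter = 0.8, 5000
sd = np.sqrt(1 - rho ** 2)
rng = np.random.default_rng(1)

# Gibbs sampler: x | y ~ N(rho*y, 1-rho^2) and y | x ~ N(rho*x, 1-rho^2)
x = y = 0.0
ys = np.empty(n_iter)
for t in range(n_iter):
    x = rng.normal(rho * y, sd)
    y = rng.normal(rho * x, sd)
    ys[t] = y

# Rao-Blackwell estimate of the marginal density of x: average the full
# conditional densities p(x | y^(t)) over the Gibbs output
grid = np.linspace(-3, 3, 121)
rb_estimate = np.mean([norm.pdf(grid, rho * yt, sd) for yt in ys], axis=0)

# the exact marginal is N(0, 1), so the estimate should be close to norm.pdf
print(np.max(np.abs(rb_estimate - norm.pdf(grid))))
```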
As posted here a long, long while ago, following a suggestion from the editor (and North America Cycling Champion!) Pierre L'Ecuyer (Université de Montréal), Arnaud Doucet (University of Oxford) and myself acted as guest editors for a special issue of ACM TOMACS on Monte Carlo Methods in Statistics. (Coincidentally, I am attending a board meeting for TOMACS tonight in Berlin!) The issue is now ready for publication (next February, unless I am confused!) and is made of the following papers:
- Massive parallelization of serial inference algorithms for a complex generalized linear model, by Marc A. Suchard, Ivan Zorych, Patrick Ryan, and David Madigan
- Convergence of a Particle-based Approximation of the Block Online Expectation Maximization Algorithm, by Sylvain Le Corff and Gersende Fort
- Efficient MCMC for Binomial Logit Models, by Agnes Fussl, Sylvia Frühwirth-Schnatter, and Rudolf Frühwirth
- Adaptive Equi-Energy Sampler: Convergence and Illustration, by Amandine Schreck, Gersende Fort, and Eric Moulines
- Particle algorithms for optimization on binary spaces
- Posterior expectation of regularly paved random histograms, by Raazesh Sainudiin, Gloria Teng, Jennifer Harlow, and Dominic Lee
- Small variance estimators for rare event probabilities, by Michel Broniatowski and Virgile Caron
- Self-Avoiding Random Dynamics on Integer Complex Systems, by Firas Hamze, Ziyu Wang, and Nando de Freitas
- Bayesian learning of noisy Markov decision processes, by Sumeetpal S. Singh, Nicolas Chopin, and Nick Whiteley
Here is the draft of the editorial that will appear at the beginning of this special issue. (All faults are mine, of course!)
This afternoon, Jean-Michel Marin gave his talk at the big'MC seminar. As already posted, it was about a convergence proof for AMIS, which gave me the opportunity to simultaneously read the paper and listen to the author. The core idea for adapting AMIS towards a manageable version is to update the proposal parameter based on the current sample rather than on the whole past. This facilitates the task of establishing convergence to the optimal (pseudo-true) value of the parameter, under the assumption that this optimal value is a known moment of the target. From there, convergence of the weighted mean is somehow natural when the number of simulations grows to infinity. (Note the special asymptotics of AMIS, though, which are that the number of steps goes to infinity while the number of simulations per step grows a wee faster than linearly. In this respect, it is the opposite of PMC, where convergence is of a more traditional nature, pushing the number of simulations per step to infinity.) The second part of the convergence proof is more intricate, as it establishes that the multiple mixture estimator based on the "forward-backward" reweighting of all simulations since step zero does converge to the proper posterior moment. This relies on rather complex assumptions, but remains a magnificent tour de force. During the talk, I wondered whether, given the Markovian nature of the algorithm (since reweighting only occurs once the simulation is over), an alternative estimator based on the optimal value of the simulation parameter would not be better than the original multiple mixture estimator: the proof is based on the equivalence between both versions…
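For the record, here is a bare-bones sketch of the AMIS multiple-mixture reweighting on a toy one-dimensional Gaussian target (my own toy and constants). Note that this sketch adapts on the whole reweighted past, as in the original AMIS, rather than on the current sample only as in the modified version covered by the proof.

```python
import numpy as np
from scipy.stats import norm

# Bare-bones AMIS sketch: Gaussian proposal adapted to a toy Gaussian target,
# with "forward-backward" multiple-mixture reweighting of all past simulations.

rng = np.random.default_rng(2)
log_target = lambda x: norm.logpdf(x, loc=3.0, scale=2.0)   # toy target

T, N = 20, 500                       # number of stages, simulations per stage
mus, sds = [0.0], [5.0]              # initial Gaussian proposal parameters
samples, logp = [], []

for t in range(T):
    x = rng.normal(mus[-1], sds[-1], size=N)
    samples.append(x)
    logp.append(log_target(x))
    # multiple-mixture weights: every simulation since step zero is weighted
    # by the target over the equal-weight mixture of ALL proposals used so far
    all_x = np.concatenate(samples)
    all_logp = np.concatenate(logp)
    mix = np.mean([norm.pdf(all_x, m, s) for m, s in zip(mus, sds)], axis=0)
    w = np.exp(all_logp) / mix
    w /= w.sum()
    # adapt the proposal parameter to the weighted moments of the full sample
    mu_new = np.sum(w * all_x)
    sd_new = np.sqrt(np.sum(w * (all_x - mu_new) ** 2))
    mus.append(mu_new)
    sds.append(sd_new)

print(mus[-1], sds[-1])   # should approach the target mean 3 and sd 2
```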
Another definitely interesting and intense day at the Confronting Intractability in Statistical Inference workshop in Bristol: all talks there had a high informational content for me, even those I had heard previously. (No time difference either, and hence much less chance of my dozing during talks, which, alas, has now become an almost certainty at US conferences!) For instance, I am still coming to terms with Gareth's importance sampling for continuous diffusions. (This was the first time I heard Arnaud's talk on the estimation of the score vector and I definitely need to hear it again, given its technicality!) Sumeet Singh gave a talk mixing ABC with maximum likelihood estimation for HMMs, in connection with his earlier paper, and I got more convinced by the idea of using a sequence of balls for keeping the pseudo-data close to the true data when I realised it could be implemented sequentially. Nial Friel's talk on doubly intractable likelihoods covered graphical models and social network models, maybe calling for a comparison with ABC, as done in the recent paper by Richard Everitt. I had too many slides and thus presumably failed to deliver an intelligible message about the selection of ABC summary statistics for testing, even though the new population genetics illustration presumably helped. In connection with our ABC paper, Dennis Prangle and Paul Fearnhead presented a poster on using the Bayes factor as a summary statistic in this setup, in the spirit of their Read Paper of last December. And Richard Wilkinson concluded the day with a more philosophical talk on the dual nature of ABC inference, in a quite pleasant perspective (related to the way ABC was received by econometricians during my talk in Princeton last week). The day ended quite pleasantly in a south-Indian thali restaurant, a good preparation for Glasgow's Ashoka tomorrow night!
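Coming back to the sequence-of-balls idea in Sumeet's talk, here is how I pictured a sequential implementation, as a rough ABC-type particle filter on a toy Gaussian random-walk HMM; the toy model and all constants are mine, not taken from the talk.

```python
import numpy as np

# Rough sketch of a sequential "sequence of balls" ABC filter: at each time
# step only particles whose pseudo-observation falls within an eps-ball of the
# actual observation keep any weight.  Toy Gaussian random-walk HMM.

rng = np.random.default_rng(3)
T, N, eps = 50, 1000, 0.5
x_true = np.cumsum(rng.normal(0, 0.3, T))        # hidden random walk
y_obs = x_true + rng.normal(0, 0.5, T)           # noisy observations

particles = np.zeros(N)
filt_means = np.empty(T)
for t in range(T):
    particles = particles + rng.normal(0, 0.3, N)       # propagate hidden state
    y_sim = particles + rng.normal(0, 0.5, N)            # pseudo-observations
    w = (np.abs(y_sim - y_obs[t]) < eps).astype(float)   # inside the eps-ball?
    if w.sum() == 0:                                     # degenerate step: keep all
        w[:] = 1.0
    w /= w.sum()
    filt_means[t] = np.sum(w * particles)
    particles = particles[rng.choice(N, size=N, p=w)]    # multinomial resampling

print(np.mean(np.abs(filt_means - x_true)))      # crude filtering error
```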