As I was checking the recent stat postings on arXiv, I noticed the paper by Chen and Xie entitled inference in Kingman’s coalescent with pMCMC. (And surprisingly deposited in the machine learning subdomain.) The authors compare a pMCMC implementation for Kingman’s coalescent with importance sampling (à la Stephens & Donnelly), regular MCMC and SMC. The specificity of their pMCMC algorithm is that they simulate the coalescent times conditional on the tree structure and the tree structure conditional on the coalescent times (via SMC). The results reported in the paper consider up to five loci and agree with earlier experiments showing poor performances of MCMC algorithms (based on the LAMARC software and apparently using independent proposals). They show similar performances between importance sampling and pMCMC. While I find this application of pMCMC interesting, I wonder about the generality of the approach: when I was introduced to ABC techniques, the motivation was that importance sampling deteriorated very quickly with the number of parameters. Here it seems the authors only considered one parameter θ. I wonder what happens when the number of parameters increases. And how pMCMC would then compare with ABC.
Indeed, I liked the i-like workshop very much. Among the many interesting talks of the past two days (incl. Cristiano Varin’s ranking of Series B as the top influential stat. journal!), Matti Vihola’s and Nicolas Chopin’s had the strongest impact on me (to the point of scribbling in my notebook). In a joint work with Christophe Andrieu, Matti focussed on evaluating the impact of replacing the target with an unbiased estimate in a Metropolis-Hastings algorithm. In particular, they found necessary and sufficient conditions for keeping geometric and uniform ergodicity. My question (asked by Iain Murray) was whether they had derived ways of selecting the number of terms in the unbiased estimator towards maximal efficiency. I also wonder if optimal reparameterisations can be found in this sense (since unbiased estimators remain unbiased after reparameterisation).
Nicolas’ talk was about particle Gibbs sampling, a joint paper with Sumeet Singh recently arXived. I did not catch the whole detail of their method but/as I got intrigued by a property of Marc Beaumont’s algorithm (the very same algorithm used by Matti & Christophe). Indeed, the notion is that an unbiased estimator of the target distribution can be found in missing variable settings by picking an importance sampling distribution q on those variables. This representation leads to a pseudo-target Metropolis-Hastings algorithm. In the stationary regime, there exists a way to derive an “exact” simulation from the joint posterior on (parameter,latent). All the remaining/rejected latents are then distributed from the proposal q. What I do not see is how this impacts the next MCMC move since it implies generating a new sample of latent variables. I spoke with Nicolas about this over breakfast: the explanation is that this re-generated set of latent variables can be used in the denominator of the Metropolis-Hastings acceptance probability and is validated as a Gibbs step. (Incidentally, it may be seen as a regeneration event as well.)
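To make the pseudo-target idea concrete, here is a minimal sketch of a Metropolis-Hastings sampler in which the likelihood is replaced by an unbiased importance-sampling estimate over the latent variables. Everything specific in it is my own assumption for illustration, not taken from either talk: the toy model (z ~ N(θ,1), y|z ~ N(z,1)), the N(0,10²) prior, the choice of q as the latent prior, and the function names. The key point is that the estimate attached to the current value is stored and reused rather than refreshed.

```python
import numpy as np


def likelihood_estimate(theta, y, n_latent, rng):
    """Unbiased importance-sampling estimate of p(y | theta).

    Toy model (assumed for illustration): z ~ N(theta, 1), y | z ~ N(z, 1).
    The importance distribution q is the latent prior N(theta, 1), so each
    weight reduces to the density of y given the simulated z.
    """
    z = rng.normal(theta, 1.0, size=n_latent)
    return np.mean(np.exp(-0.5 * (y - z) ** 2) / np.sqrt(2.0 * np.pi))


def log_prior(theta):
    # N(0, 10^2) prior, up to an additive constant (hypothetical choice).
    return -0.5 * (theta / 10.0) ** 2


def pseudo_marginal_mh(y, n_iter=5000, n_latent=20, step=1.5, seed=0):
    """Random-walk Metropolis-Hastings on theta with an estimated target."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    like = likelihood_estimate(theta, y, n_latent, rng)
    chain = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + step * rng.normal()
        like_prop = likelihood_estimate(prop, y, n_latent, rng)
        # A zero estimate (numerical underflow) is always rejected.
        if like_prop > 0.0:
            log_ratio = (np.log(like_prop) + log_prior(prop)
                         - np.log(like) - log_prior(theta))
            if np.log(rng.uniform()) < log_ratio:
                # Keep the estimate that came with the accepted value:
                # the denominator is recycled, never re-simulated.
                theta, like = prop, like_prop
        chain[t] = theta
    return chain


if __name__ == "__main__":
    chain = pseudo_marginal_mh(y=1.0)
    print("posterior mean estimate:", chain[1000:].mean())
```

In this toy case the exact marginal is y|θ ~ N(θ,2), so the output can be checked against the closed-form posterior. Raising n_latent lowers the variance of the estimate at a higher cost per iteration, which is precisely the trade-off behind the question on the number of terms.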
Furthermore, I had a terrific run in the rising sun (at 5am) all the way to Kenilworth where I saw a deer, pheasants and plenty of rabbits. (As well as this sculpture that now appears to me as being a wee sexist…)
Another paper recently arXived by Beaujean and Caldwell elaborated on our population Monte Carlo papers (Cappé et al., 2005, Douc et al., 2007, Wraith et al., 2010) to design a more thorough starting distribution. Interestingly, the authors mention the fact that PMC is an EM-type algorithm to emphasize the importance of the starting distribution, as with “poor proposal, PMC fails as proposal updates lead to a consecutively poorer approximation of the target” (p.2). I had not thought of this possible feature of PMC, which indeed proceeds along integrated EM steps, and thus could converge to a local optimum (if not poorer than the start as the Kullback-Leibler divergence decreases).
The solution proposed in this paper is similar to the one we developed in our AMIS paper. An important part of the simulation is dedicated to the construction of the starting distribution, which is a mixture deduced from multiple Metropolis-Hastings runs. I find the method spends an unnecessarily long time on refining this mixture by culling the number of components: off-the-shelf clustering techniques should be sufficient, esp. if one considers that the value of the target is available at every simulated point. This has been my pet (if idle) theory for a long while: we do not take (enough) advantage of this informative feature in our simulation methods… I also find the Student’s t versus Gaussian kernel debate (p.6) somewhat superfluous: as we showed in Douc et al., 2007, we can process Student’s t distributions so we can as well work with those. And rather worry about the homogeneity assumption this choice implies: working with any elliptically symmetric kernel assumes a local Euclidean structure on the parameter space, for all components, and does not properly model highly curved spaces. Another pet theory of mine. As for picking the necessary number of simulations at each PMC iteration, I would add to the ESS and the survival rate of the components a measure of the Kullback-Leibler divergence, as it should decrease at each iteration (with an infinite number of particles).
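As a rough illustration of the diagnostics mentioned above, both the ESS and a Kullback-Leibler proxy can be computed from the importance weights alone. The sketch below is my own and not the code of the paper: the function name is hypothetical, and I use the normalised perplexity (the exponentiated entropy of the normalised weights, divided by the sample size) as one common stand-in for exp(-KL), which is an assumption about what a suitable measure could be.

```python
import numpy as np


def importance_diagnostics(log_target, log_proposal):
    """Return (ESS, normalised perplexity) of self-normalised weights.

    Inputs are log densities evaluated at the same simulated points; the
    target may be unnormalised since the weights are renormalised anyway.
    """
    log_w = np.asarray(log_target, dtype=float) - np.asarray(log_proposal, dtype=float)
    log_w = log_w - log_w.max()          # stabilise before exponentiating
    w = np.exp(log_w)
    w = w / w.sum()
    ess = 1.0 / np.sum(w ** 2)           # ranges from 1 to N
    positive = w > 0                     # skip underflowed weights in the entropy
    entropy = -np.sum(w[positive] * np.log(w[positive]))
    perplexity = np.exp(entropy) / w.size  # ranges from 1/N to 1
    return ess, perplexity


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 2.0, size=10_000)   # proposal N(0, 2^2)
    log_q = -0.5 * (x / 2.0) ** 2 - np.log(2.0)
    log_pi = -0.5 * x ** 2                   # target N(0, 1), unnormalised
    print(importance_diagnostics(log_pi, log_q))
```

Tracking both numbers across PMC iterations gives a cheap check on whether the proposal updates are actually improving the fit: a perplexity that stalls or drops would be a warning sign of the poor-start behaviour discussed above.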
Another interesting feature is in the comparison with Multinest, the current version of nested sampling, developed by Farhan Feroz. This is the second time I read a paper involving nested sampling in the past two days. While this PMC implementation does better than nested sampling on the examples processed in the paper, the Multinest outcome remains relevant, particularly because it handles multi-modality fairly well. The authors seem to think parallelisation is an issue with nested sampling, while I do not see why: at the most naïve stage, several nested samplers can be run in parallel and the outcomes pooled together.
More exciting news about MCMSki IV!
First things first, the 16 contributed sessions are now all set, having gotten the stamp of approval from the scientific committee! Thanks to everyone who submitted a session proposal. (There were so many proposals that we alas had to reject some, as well as every single talk proposal… Sorry people: we hope to hear about your research advances via your posters!) See the MCMSki IV website for the whole list. Apart from the plenary lectures, and the round table on software held on the second evening, there will be three parallel sessions on the remaining three slots for each day of the conference, which means 25 sessions total!
Second, the “call for posters” is open, simply meaning that anyone wishing to present a poster at MCMSki IV on Monday evening (or Tuesday night if we cannot accommodate all posters within a single evening!) is welcome to do so! This will take place in the conference centre as well (with an open bar to keep up with traditions). To this effect, if you intend to present a poster, (a) tick the box in the registration form and (b) …wait for further instructions on the MCMSki IV website about sending your abstract, as we are trying to find an easy way to store and publish posters there. Simple as AB(C)!
Last, the registration page is now open! So feel free to register at your earliest convenience. The deadline for early bird registration is October 15, 2013; however, hotel rooms are likely to vanish much earlier than that, leaving you on your own to find accommodation in Chamonix (not such a terrible task, actually!)