Archive for split and merge moves

Particle Gibbs for conjugate mixture posteriors

Posted in Books, Statistics, University life with tags , , , , , on September 8, 2015 by xi'an

Alexandre Bouchard-Coté, Arnaud Doucet, and Andrew Roth have arXived a paper “Particle Gibbs Split-Merge Sampling for Bayesian Inference in Mixture Models” that proposes an efficient algorithm to explore the posterior distribution of a mixture, when interpreted as a clustering model. (To clarify the previous sentence, this is a regular plain vanilla mixture model for which they explore the latent variable representation.)

I like very much the paper because it relates to an earlier paper of mine with George Casella and Marty Wells, paper we wrote right after a memorable JSM in Baltimore (during what may have been my last visit to Cornell University as George left for Florida the following year). The starting point of this approach to mixture estimation is that the (true) parameters of a mixture can be (exactly) integrated out when using conjugate priors and a completion step. Namely, the marginal posterior distribution of the latent variables given the data is available in closed form. The latent variables being the component allocations for the observations. The joint posterior is then a product of the prior on the parameters times the prior on the latents times a product of simple (e.g., Gaussian) terms. This equivalently means the marginal likelihoods conditional on the allocations are available in closed form. Looking directly at those marginal likelihoods, a prior distribution on the allocations can be introduced (e.g., the Pitman-Yor process or the finite Dirichlet prior) and, together, they define a closed form target. Albeit complex. As often on a finite state space. In our paper with George and Marty, we proposed using importance sampling to explore the set, using for instance marginal distributions for the allocations, which are uniform in the case of exchangeable priors, but this is not very efficient, as exhibited by our experiments where very few partitions would get most of the weight.

Even a Gibbs sampler on subsets of those indicators restricted to two components cannot be managed directly. The paper thus examines a specially designed particle Gibbs solution that implements a split and merge move on two clusters at a time. Merging or splitting the subset. With intermediate target distributions, SMC style. While this is quite an involved mechanism, that could be deemed as excessive for the problem at hand, as well as inducing extra computing time, experiments in the paper demonstrate the mostly big improvement in efficiency brought by this algorithm.

reversible jump on HMMs

Posted in Books, Mountains, pictures, Statistics, Travel, University life with tags , , , , , , , , on December 19, 2011 by xi'an

Here is an email I received a few weeks ago about a paper written more than a decade ago in Glasgow with Tobias Rydén and Mike Titterington:

Sorry to bother you. I am a PhD student in economics. Recently, I am very interested in your paper “Bayesian inference in hidden Markov models through the reversible jump Markov chain Monte Carlo method”. I would like to use your method in estimating some regime-switching economic model. Unfortunately, I am not exactly understand your paper. Hence, I am writing to ask for your help. My questions are:

  1. A split or merge move is determined at the same time or sequentially? If the moves are determined at that same time, then accepting a split move implies that we can not accept a merge move any more in the same sweep. If the moves are determined sequentially, it means that we can accept a split move first, then accept a merge move in the same sweep. [Answer: First interpretation is correct. Except that the type of move is first selected at random, then only the corresponding move is generated and potentially accepted.]
  2.  In the paper, you discuss how to generate new transition probabilities in a split move in details. However, you did not discuss (probably, I am wrong) how to generate probabilities in each new state (series Zt in your paper).  Could you please tell me how to generate the series Zt? [Answer: check eqn (3).]
  3. My economic model is a multiple series (a vector hidden Markov model), will you refer me to some other papers for the vector model? [Answer: If the observed series is multidimensional, the extension is formally straightforward, if potentially prone to slow mixing and low acceptance rates. If the hidden Markov chain is multidimensional, I have not seen a version of reversible jump in this setting. Maybe an extension of the variational methods described in Ghahramani and Jordan would help.]

to which I replied that the questions showed a deep lack of understanding of what reversible jump is and that the PhD student should first check the literature, for instance the great intro paper by Charlie Geyer in Handbook of Markov chain Monte Carlo and then the original papers by Green (1995) and Richardson and Green (1997).

%d bloggers like this: