**Y**et another question on X validated reminded me of a discussion I had once with Jay Kadane when visiting Carnegie Mellon in Pittsburgh. Namely the fundamentally ill-posed nature of conjugate priors. Indeed, when considering the definition of a conjugate family as being a parameterised family Þ of distributions over the parameter space Θ stable under transform to the posterior distribution, this property is completely dependent (if there is such a notion as completely dependent!) on the dominating measure adopted on the parameter space Θ. Adopted is the word as there is no default, reference, natural, &tc. measure that promotes one specific measure on Θ as being *the* dominating measure. This is a well-known difficulty that also sticks out in most “objective Bayes” problems, as well as with maximum entropy priors. This means for instance that, while the Gamma distributions constitute a conjugate family for a Poisson likelihood, so do the truncated Gamma distributions. And so do the distributions which density (against a Lebesgue measure over an arbitrary subset of (0,∞)) is the product of a Gamma density by an arbitrary function of θ. I readily acknowledge that the standard conjugate priors as introduced in every Bayesian textbook are standard because they facilitate (to a certain extent) posterior computations. But, just like there exist an infinity of MaxEnt priors associated with an infinity of dominating measures, there exist an infinity of conjugate families, once more associated with an infinity of dominating measures. And the fundamental reason is that the sampling model (which induces the shape of the conjugate family) does not provide a measure on the parameter space Θ.

## Archive for conjugate priors

## dominating measure

Posted in Books, pictures, Statistics, Travel, University life with tags Bayesian textbook, Carnegie Mellon University, conjugate priors, cross validated, dominating measure, Jay Kadane, Pittsburgh, posterior distribution on March 21, 2019 by xi'an## I’m getting the point

Posted in Statistics with tags Bayesian statistics, Bayesian textbook, conjugate priors, cross validated, final exam, StackExchange, teaching on February 14, 2019 by xi'an**A** long-winded X validated discussion on the [textbook] mean-variance conjugate posterior for the Normal model left me [mildly] depressed at the point and use of answering questions on this forum. Especially as it came at the same time as a catastrophic outcome for my mathematical statistics exam. Possibly an incentive to quit X validated as one quits smoking, although this is not the first attempt…

## inverse stable priors

Posted in Statistics with tags All the pretty horses, alpha-stable processes, conjugate priors, Gösta Mittag-Leffler, non-informative priors, reference priors, Sofia Kovalevskaya on November 24, 2017 by xi'an**D**exter Cahoy and Joseph Sedransk just arXived a paper on so-called inverse stable priors. The starting point is the supposed defficiency of Gamma conjugate priors, which have explosive behaviour near zero. Albeit remaining proper. (This behaviour eventually vanishes for a large enough sample size.) The alternative involves a transform of alpha-stable random variables, with the consequence that the density of this alternative prior does not have a closed form. Neither does the posterior. When the likelihood can be written as exp(a.θ+b.log θ), modulo a reparameterisation, which covers a wide range of distributions, the posterior can be written in terms of the inverse stable density and of another (intractable) function called the generalized Mittag-Leffler function. (Which connects this post to an earlier post on Sofia Kovaleskaya.) For simulating this posterior, the authors suggest using an accept-reject algorithm based on the prior as proposal, which has the advantage of removing the intractable inverse stable density but the disadvantage of… simulating from the prior! (No mention is made of the acceptance rate.) I am thus reserved as to how appealing this new proposal is, despite “the inverse stable density (…) becoming increasingly popular in several areas of study”. And hence do not foresee a bright future for this class of prior…

## Particle Gibbs for conjugate mixture posteriors

Posted in Books, Statistics, University life with tags component of a mixture, conjugate priors, latent variable, mixtures of distributions, Pitman-Yor process, split and merge moves on September 8, 2015 by xi'an**A**lexandre Bouchard-Coté, Arnaud Doucet, and Andrew Roth have arXived a paper “Particle Gibbs Split-Merge Sampling for Bayesian Inference in Mixture Models” that proposes an efficient algorithm to explore the posterior distribution of a mixture, when interpreted as a clustering model. (To clarify the previous sentence, this is a regular plain vanilla mixture model for which they explore the latent variable representation.)

I like very much the paper because it relates to an earlier paper of mine with George Casella and Marty Wells, paper we wrote right after a memorable JSM in Baltimore (during what may have been my last visit to Cornell University as George left for Florida the following year). The starting point of this approach to mixture estimation is that the (true) parameters of a mixture can be (exactly) integrated out when using conjugate priors and a completion step. Namely, the marginal posterior distribution of the latent variables given the data is available in closed form. The latent variables being the component allocations for the observations. The joint posterior is then a product of the prior on the parameters *times* the prior on the latents times a product of simple (e.g., Gaussian) terms. This equivalently means the marginal likelihoods conditional on the allocations are available in closed form. Looking directly at those marginal likelihoods, a prior distribution on the allocations can be introduced (e.g., the Pitman-Yor process or the finite Dirichlet prior) and, together, they define a closed form target. Albeit complex. As often on a finite state space. In our paper with George and Marty, we proposed using importance sampling to explore the set, using for instance marginal distributions for the allocations, which are uniform in the case of exchangeable priors, but this is not very efficient, as exhibited by our experiments where very few partitions would get most of the weight.

Even a Gibbs sampler on subsets of those indicators restricted to two components cannot be managed directly. The paper thus examines a specially designed particle Gibbs solution that implements a split and merge move on two clusters at a time. Merging or splitting the subset. With intermediate target distributions, SMC style. While this is quite an involved mechanism, that could be deemed as excessive for the problem at hand, as well as inducing extra computing time, experiments in the paper demonstrate the mostly big improvement in efficiency brought by this algorithm.