I had not read this 2017 discussion of Bayesian mixture estimation by Michael Betancourt before I found it mentioned in a recent paper, where he re-explores the issue of identifiability and label switching in finite mixture models, calling (somewhat abusively) degenerate those mixtures whose components all share the same family, e.g., mixtures of Gaussians. Illustrated by Stan code and output. This is rather traditional material, in that the non-identifiability of mixture components has been discussed in many papers, with at least as many solutions proposed to overcome the difficulties of exploring the posterior distribution, including our 2000 JASA paper with Gilles Celeux and Merrilee Hurn. My favourite approaches remain the label-free representations, either as a point process in the parameter space (following an idea of Peter Green) or as a collection of clusters in the latent variable space. I am much less convinced by ordering constraints: while they formally differentiate and therefore identify the individual components of a mixture, they partition the parameter space with no regard for the geometry of the posterior distribution, with potential consequences on MCMC explorations, since the constraints fragment the surface and create barriers for the simulated Markov chains. Plus further difficulties with inferior but attracting modes in identifiable situations.
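For illustration, a minimal numpy sketch of mine (not Betancourt's Stan code) of the exchangeability at the root of label switching: permuting the component labels leaves the mixture likelihood unchanged, hence K! symmetric modes in the posterior under an exchangeable prior.

```python
import numpy as np

def mixture_loglik(y, weights, means, sds):
    # log-likelihood of a univariate Gaussian mixture, summed over observations
    comp = (np.log(weights)
            - 0.5 * np.log(2 * np.pi) - np.log(sds)
            - 0.5 * ((y[:, None] - means) / sds) ** 2)
    return np.logaddexp.reduce(comp, axis=1).sum()

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2, 1, 50), rng.normal(3, 1, 50)])

w = np.array([0.4, 0.6])
mu = np.array([-2.0, 3.0])
sd = np.array([1.0, 1.0])
# swapping the two components gives exactly the same log-likelihood
same = np.isclose(mixture_loglik(y, w, mu, sd),
                  mixture_loglik(y, w[::-1], mu[::-1], sd[::-1]))
```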
Archive for JASA
identifying mixtures
Posted in Books, pictures, Statistics with tags clustering, finite mixtures, identifiability, JASA, label switching, MCMC, STAN on February 27, 2022 by xi'an

ordered allocation sampler
Posted in Books, Statistics with tags Data augmentation, Galaxy, Gibbs sampling, hidden Markov models, JASA, label switching, latent variable models, MCMC, partition function, random partition trees, SMC, statistical methodology on November 29, 2021 by xi'an

Recently, Pierpaolo De Blasi and María Gil-Leyva arXived a proposal for a novel Gibbs sampler for mixture models, both finite and infinite. It connects with Pitman's (1996) theory of species sampling and has interesting features in terms of removing the vexing label switching issue.
“The key idea is to work with the mixture components in the random order of appearance in an exchangeable sequence from the mixing distribution (…) In accordance with the order of appearance, we derive a new Gibbs sampling algorithm that we name the ordered allocation sampler.”
This central idea is thus a reinterpretation of the mixture model as the marginal of the component model when its parameter is distributed as a species sampling variate. An ensuing marginal algorithm is to integrate out the weights and the allocation variables and only consider the non-empty component parameters and the partition function, which are label invariant. This reminded me of the proposal we made in our 2000 JASA paper with Gilles Celeux and Merrilee Hurn (one of my favourite papers!), and of the [first paper in Statistical Methodology] 2004 partitioned importance sampling version with George Casella and Marty Wells. As in the latter, the solution seems to require the prior on the component parameters to be conjugate (as I do not see a way to produce an unbiased estimator of the partition allocation probabilities otherwise).
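As an aside on why conjugacy matters here, a minimal sketch of my own (not the authors' algorithm), assuming a Normal-Normal model with known unit variance: conjugacy lets the component parameters be integrated out, so the allocation probability of a new observation only involves closed-form posterior predictive densities.

```python
import numpy as np

def predictive_logpdf(y, n_k, sum_k, mu0=0.0, tau0=3.0, sigma=1.0):
    # Normal-Normal conjugacy: posterior predictive density of y under a
    # component currently holding n_k points with sum sum_k
    # (prior mean mu0, prior sd tau0, known observation sd sigma)
    prec = 1 / tau0**2 + n_k / sigma**2
    mu_n = (mu0 / tau0**2 + sum_k / sigma**2) / prec
    var = 1 / prec + sigma**2
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (y - mu_n) ** 2 / var

# allocation probabilities for a new point, with two occupied components
# (counts and sums are made-up numbers for illustration)
y_new = 2.8
logp = np.array([np.log(3) + predictive_logpdf(y_new, 3, -6.0),   # data near -2
                 np.log(4) + predictive_logpdf(y_new, 4, 12.0)])  # data near +3
p = np.exp(logp - np.logaddexp.reduce(logp))
```

The point being that each allocation probability is exact, with no need for an unbiased estimator of an intractable integral.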
The ordered allocation sampler considers the posterior distribution of a different object, made of the parameters and of the sequence of allocations to the components for the sample written in a given order, i.e., y¹, y², etc. Hence y¹ always gets associated with component 1, y² with either component 1 or component 2, and so on. For this distribution, the full conditionals are available, including the full posterior on the number m of components, which depends on the data only through the partition sizes and the number m⁺ of non-empty components. (Which relates to the debate as to whether or not m is estimable…) This sequential allocation reminded me as well of an earlier 2007 JRSS paper by Nicolas Chopin, albeit using particles rather than Gibbs and applied to a hidden Markov model. Funny enough, their synthetic dataset univ4 almost resembles the Galaxy dataset (as in the above picture of mine)!
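To give an idea of the order-of-appearance device, a short sketch of mine (with a hypothetical helper name): relabelling an allocation vector so that components are numbered by first appearance forces y¹ into component 1, y² into component 1 or 2, and so on, making label permutations invisible.

```python
import numpy as np

def order_of_appearance(z):
    # canonical relabelling: component labels are assigned in the order they
    # first appear in the sequence, so the first observation is always in
    # component 1, the second in component 1 or 2, etc.
    mapping = {}
    out = []
    for zi in z:
        if zi not in mapping:
            mapping[zi] = len(mapping) + 1
        out.append(mapping[zi])
    return np.array(out)

# two allocation vectors differing only by a permutation of the labels
z1 = [2, 2, 0, 1, 0, 2]
z2 = [0, 0, 1, 2, 1, 0]
# both map to the same canonical allocation [1, 1, 2, 3, 2, 1]
```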
Handbooks [not a book review]
Posted in Books, pictures, Statistics, University life with tags ABC, book reviews, CRC Press, Handbook of Approximate Bayesian computation, handbook of mixture analysis, JASA, Journal of the American Statistical Association, mixtures of distributions on October 26, 2021 by xi'an

causal inference makes it to Stockholm
Posted in Statistics with tags Biometrika, causal inference, Econometrica, econometrics, instrumental variables, JASA, Nobel Prize, Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel on October 12, 2021 by xi'an

Yesterday, Joshua Angrist and Guido Imbens, whose most cited paper is this JASA 1996 article with Don Rubin, were awarded the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel for 2021. It is one of those not-so-rare instances when econometricians get this prize, with causality the motive for their award. I presume this will not see the number of Biometrika submissions involving causal inference go down! (Imbens wrote a book on causal inference with Don Rubin, and is currently editor of Econometrica. And Angrist wrote Mostly Harmless Econometrics, with J.S. Pischke, which I have not read.)
ratio of Gaussians
Posted in Books, Statistics, University life with tags Biometrika, Cauchy distribution, cross validated, David Hinkley, Edgar Fieller, George Marsaglia, JASA, mixtures of distributions, ratio of random variates on April 12, 2021 by xi'an

Following (as usual) an X validated question, I came across two papers of George Marsaglia on the ratio of two arbitrary (i.e., unnormalised and possibly correlated) Normal variates. One was a 1965 JASA paper,
where the density of the ratio X/Y is exhibited, based on the fact that this random variable can always be represented as (a+ε)/(b+ξ), where ε and ξ are iid N(0,1) and a and b are constants. Surprisingly (?), this representation was challenged in a 1969 paper by David Hinkley (corrected in 1970).
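A quick numerical check of this representation (a sketch of mine, with arbitrary seed and sample size): in the special case a = b = 0, the ratio (a+ε)/(b+ξ) reduces to a ratio of two iid standard Normals, i.e., a standard Cauchy variate, with quartiles at −1, 0, and +1.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
a, b = 0.0, 0.0
eps, xi = rng.standard_normal(n), rng.standard_normal(n)
r = (a + eps) / (b + xi)   # Marsaglia's representation of the ratio
# for a = b = 0 this is exactly standard Cauchy: quartiles at -1, 0, +1
q25, q50, q75 = np.quantile(r, [0.25, 0.5, 0.75])
```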
And less surprisingly, the ratio distribution behaves almost like a Cauchy, since its density is

ƒ(z) = exp{−(a²+b²)/2} ⁄ {π(1+z²)} × [1 + q exp{q²/2} ∫₀^q exp{−x²/2} dx],  with q = (b+az)/√(1+z²),
meaning it is a two-component mixture of a Cauchy distribution, with weight exp(−a²/2−b²/2), and of an altogether more complex distribution ƒ². This is remarked by Marsaglia in his second paper, from 2006, although the description of the second component remains vague, besides a possible bimodality. (It could have a mean, actually.) The density ƒ² however resembles (at least graphically) the generalised Normal inverse density I played with, eons ago.
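The density is easy to code and check numerically. A sketch of mine, assuming the standard Hinkley/Marsaglia form ƒ(z) = exp{−(a²+b²)/2}/{π(1+z²)} × [1 + q exp{q²/2} ∫₀^q exp{−x²/2} dx] with q = (b+az)/√(1+z²), where the Cauchy component with weight exp{−(a²+b²)/2} is the leading term of the bracket:

```python
import numpy as np
from math import erf, exp, pi, sqrt

def ratio_density(z, a, b):
    # density of (a+eps)/(b+xi) for eps, xi iid N(0,1):
    # f(z) = e^{-(a^2+b^2)/2}/(pi(1+z^2)) [1 + q e^{q^2/2} int_0^q e^{-x^2/2} dx]
    # with q = (b+az)/sqrt(1+z^2), and
    # int_0^q e^{-x^2/2} dx = sqrt(pi/2) erf(q/sqrt(2))
    q = (b + a * z) / sqrt(1 + z * z)
    bracket = 1 + q * exp(q * q / 2) * sqrt(pi / 2) * erf(q / sqrt(2))
    return exp(-(a * a + b * b) / 2) / (pi * (1 + z * z)) * bracket

# sanity checks: a = b = 0 recovers the standard Cauchy density,
# and the density (with heavy, Cauchy-like tails) integrates to ~1
z = np.linspace(-500, 500, 400_001)
f = np.array([ratio_density(zi, 1.0, 2.0) for zi in z])
total = np.trapz(f, z)
```

The truncation of the grid at ±500 leaves out a small tail mass of order exp{−(a²+b²)/2}/z, which explains the integral falling just short of one.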