Archive for mixtures

Bernoulli mixtures

Posted in pictures, Statistics, University life with tags , , , , , , , on October 30, 2019 by xi'an

An interesting query on (or from) X validated: given a Bernoulli mixture where the weights are known and the probabilities are jointly drawn from a Dirichlet, what is the most efficient from running a Gibbs sampler including the latent variables to running a basic Metropolis-Hastings algorithm based on the mixture representation to running a collapsed Gibbs sampler that only samples the indicator variables… I provided a closed form expression for the collapsed target, but believe that the most efficient solution is based on the mixture representation!

I thought I did make a mistake but I was wrong…

Posted in Books, Kids, Statistics with tags , , , , , , , , , , , , on November 14, 2018 by xi'an

One of my students in my MCMC course at ENSAE seems to specialise into spotting typos in the Monte Carlo Statistical Methods book as he found an issue in every problem he solved! He even went back to a 1991 paper of mine on Inverse Normal distributions, inspired from a discussion with an astronomer, Caroline Soubiran, and my two colleagues, Gilles Celeux and Jean Diebolt. The above derivation from the massive Gradsteyn and Ryzhik (which I discovered thanks to Mary Ellen Bock when arriving in Purdue) is indeed incorrect as the final term should be the square root of 2β rather than 8β. However, this typo does not impact the normalising constant of the density, K(α,μ,τ), unless I am further confused.

infinite mixtures are likely to take a while to simulate

Posted in Books, Statistics with tags , , , , , , , , on February 22, 2018 by xi'an

Another question on X validated got me highly interested for a while, as I had considered myself the problem in the past, until I realised while discussing with Murray Pollock in Warwick that there was no general answer: when a density f is represented as an infinite series decomposition into weighted densities, some weights being negative, is there an efficient way to generate from such a density? One natural approach to the question is to look at the mixture with positive weights, f⁺, since it gives an upper bound on the target density. Simulating from this upper bound f⁺ and accepting the outcome x with probability equal to the negative part over the sum of the positive and negative parts f⁻(x)/f(x) is a valid solution. Except that it is not implementable if

  1.  the positive and negative parts both involve infinite sums with no exploitable feature that can turn them into finite sums or closed form functions,
  2.  the sum of the positive weights is infinite, which is the case when the series of the weights is not absolutely converging.

Even when the method is implementable it may be arbitrarily inefficient in the sense that the probability of acceptance is equal to to the inverse of the sum of the positive weights and that simulating from the bounding mixture in the regular way uses the original weights which may be unrelated in size with the actual importance of the corresponding components in the actual target. Hence, when expressed in this general form, the problem cannot allow for a generic solution.

Obviously, if more is known about the components of the mixture, as for instance the sequence of weights being alternated, there exist specialised methods, as detailed in the section of series representations in Devroye’s (1985) simulation bible. For instance, in the case when positive and negative weight densities can be paired, in the sense that their weighted difference is positive, a latent index variable can be included. But I cannot think of a generic method where the initial positive and negative components are used for simulation, as it may on the opposite be the case that no finite sum difference is everywhere positive.

a Ca’Foscari [first Italian-French statistics seminar]

Posted in Kids, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , on October 26, 2017 by xi'an

Apart from subjecting my [surprisingly large!] audience to three hours of ABC tutorial today, and after running Ponte della la Libertà to Mestre and back in a deep fog, I attended the second part of the 1st Italian-French statistics seminar at Ca’Foscari, Venetiarum Universitas, with talks by Stéfano Tonellato and Roberto Casarin. Stéfano discussed a most interesting if puzzling notion of clustering via Dirichlet process mixtures. Which indeed puzzles me for its dependence on the Dirichlet measure and on the potential for an unlimited number of clusters as the sample size increases. The method offers similarities with an approach from our 2000 JASA paper on running inference on mixtures without proper label switching, in that looking at pairs of allocated observations to clusters is revealing about the [true or pseudo-true] number of clusters. With divergence in using eigenvalues of Laplacians on similarity matrices. But because of the potential for the number of components to diverge I wonder at the robustness of the approach via non-parametric [Bayesian] modelling. Maybe my difficulty stands with the very notion of cluster, which I find poorly defined and mostly in the eyes of the beholder! And Roberto presented a recent work on SURE and VAR models, with a great graphical representation of the estimated connections between factors in a sparse graphical model.

R typos

Posted in Books, Kids, R, Statistics, Travel, University life with tags , , , , , , , , on January 27, 2016 by xi'an

Amster14At MCMskv, Alexander Ly (from Amsterdam) pointed out to me some R programming mistakes I made in the introduction to Metropolis-Hastings algorithms I wrote a few months ago for the Wiley on-line encyclopedia! While the outcome (Monte Carlo posterior) of the corrected version is moderately changed this is nonetheless embarrassing! The example (if not the R code) was a mixture of a Poisson and a Geometric distributions borrowed from our testing as mixture paper. Among other things, I used a flat prior on the mixture weights instead of a Beta(1/2,1/2) prior and a simple log-normal random walk on the mean parameter instead of a more elaborate second order expansion discussed in the text. And I also inverted the probabilities of success and failure for the Geometric density. The new version is now available on arXiv, and hopefully soon on the Wiley site, but one (the?) fact worth mentioning here is that the (right) corrections in the R code first led to overflows, because I was using the Beta random walk Be(εp,ε(1-p)) which major drawback I discussed here a few months ago. With the drag that nearly zero or one values of the weight parameter produced infinite values of the density… Adding 1 (or 1/2) to each parameter of the Beta proposal solved the problem. And led to a posterior on the weight still concentrating on the correct corner of the unit interval. In any case, a big thank you to Alexander for testing the R code and spotting out the several mistakes…