## ABC-SAEM

Posted in Books, Statistics, University life on October 8, 2019 by xi'an

In connection with the recent PhD thesis defence of Juliette Chevallier, in which I took a somewhat virtual part since I was physically in Warwick, I read a paper she wrote with Stéphanie Allassonnière on stochastic approximation versions of the EM algorithm. Computing the MAP estimator can be done via versions of EM adapted to simulated annealing, possibly using MCMC, as for instance in the Monolix software and its MCMC-SAEM algorithm. Here SA stands sometimes for stochastic approximation and sometimes for simulated annealing, the approach being originally developed by Gilles Celeux and Jean Diebolt, then reframed by Marc Lavielle and Eric Moulines [friends and coauthors]. The MCMC step is there because the simulation of the latent variables involves an intractable normalising constant. (Contrary to this paper, Umberto Picchini and Adeline Samson proposed in 2015 a genuine ABC version of this approach, a paper that I thought I had missed, although I now remember discussing it with Adeline at JSM in Seattle. There, ABC is used as a substitute for the conditional distribution of the latent variables given data and parameter, hence as a substitute for the Q step of the (SA)EM algorithm. One more approximation step and one more simulation step and we would reach a form of ABC-Gibbs!) In the version of Chevallier and Allassonnière, there are very few assumptions made on the approximation sequence, except that it must converge with the iteration index to the true distribution (for a fixed observed sample) if convergence of ABC-SAEM is to happen. The paper takes as an illustrative sequence a collection of tempered versions of the true conditionals, but this is quite formal, as I cannot fathom a feasible simulation from the tempered version and not from the untempered one. It is thus much more a version of tempered SAEM than truly connected with ABC (although a genuine ABC-EM version could be envisioned).
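To make the stochastic approximation step concrete, here is a minimal sketch of a tempered SAEM on a toy model of my own choosing, not the model or the algorithm of the paper: latent z_i ~ N(θ,1) and observations y_i | z_i ~ N(z_i,1), so that the exact conditional of z_i is N((y_i+θ)/2, 1/2) and the MLE of θ is the sample mean of the y_i's. The function name `saem_toy`, the temperature schedule T_k = 1 + 5/k and the step sizes are all assumptions made for the illustration. In this Gaussian case the tempered conditional is as easy to simulate as the untempered one, which is precisely why the example remains formal.

```python
import math
import random


def saem_toy(y, n_iter=2000, burn_in=200, temper=5.0, seed=1):
    """Tempered SAEM sketch for z_i ~ N(theta, 1), y_i | z_i ~ N(z_i, 1)."""
    rng = random.Random(seed)
    n = len(y)
    theta = 0.0  # arbitrary starting value
    s = 0.0      # stochastic approximation of E[mean(z) | y, theta]
    for k in range(1, n_iter + 1):
        # Tempered conditional: the exact conditional N((y_i + theta)/2, 1/2)
        # raised to the power 1/T_k is N((y_i + theta)/2, T_k/2), with T_k -> 1.
        temp = 1.0 + temper / k
        sd = math.sqrt(temp / 2.0)
        z_bar = sum(rng.gauss((yi + theta) / 2.0, sd) for yi in y) / n
        # Step size: 1 during burn-in, then decreasing as 1/(k - burn_in).
        gamma = 1.0 if k <= burn_in else 1.0 / (k - burn_in)
        s += gamma * (z_bar - s)
        # M-step: the complete-data MLE of theta is the mean of the z_i's.
        theta = s
    return theta


# Simulated data (hypothetical): marginally y_i ~ N(1.5, 2).
data_rng = random.Random(0)
y = [data_rng.gauss(1.5, math.sqrt(2.0)) for _ in range(200)]
theta_hat = saem_toy(y)
y_bar = sum(y) / len(y)
print(theta_hat, y_bar)
```

The only requirement mirrored from the discussion above is that the approximating sequence (here the tempered conditionals) converges to the true conditional as the iteration index grows; the output should then settle close to the sample mean.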

## reaching transcendence for Gaussian mixtures

Posted in Books, R, Statistics on September 3, 2015 by xi'an

“…likelihood inference is in a fundamental way more complicated than the classical method of moments.”

Carlos Amendola, Mathias Drton, and Bernd Sturmfels arXived a paper this Friday on “maximum likelihood estimates for Gaussian mixtures are transcendental”. By which they mean that trying to solve the five likelihood equations for a two-component Gaussian mixture does not lead to an algebraic function of the data. (When excluding the trivial global maxima spiking at any observation.) This is not highly surprising when considering two observations, 0 and x, from a mixture of N(0,1/2) and N(μ,1/2) because the likelihood equation

$(x-\mu)\exp\{\mu^2\}-x+\mu\exp\{-\mu(2x-\mu)\}=0$

involves both exponential and algebraic terms. While this is not directly impacting (statistical) inference, this result has the computational consequence that the number of critical points ‘and also the maximum number of local maxima, depends on the sample size and increases beyond any bound’, which means that EM faces increasing difficulties in finding a global finite maximum as the sample size increases…
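As a hedged illustration of why multiple critical points matter for EM, the sketch below runs EM from several starting values on a simplified mixture of my own choosing, 0.3 N(μ1,1) + 0.7 N(μ2,1) with known weight and variances, rather than on the five-parameter mixture of the paper. The data, the starting points and the function names are assumptions. Each run increases the likelihood monotonically, but the endpoints may differ from one start to the next.

```python
import math
import random


def log_lik(y, mu1, mu2, p=0.3):
    """Log-likelihood of p N(mu1, 1) + (1 - p) N(mu2, 1), up to a constant."""
    return sum(
        math.log(p * math.exp(-0.5 * (yi - mu1) ** 2)
                 + (1 - p) * math.exp(-0.5 * (yi - mu2) ** 2))
        for yi in y
    )


def em_means(y, mu1, mu2, p=0.3, n_iter=200):
    """EM on the two means only; weight p and unit variances taken as known."""
    trace = [log_lik(y, mu1, mu2, p)]
    for _ in range(n_iter):
        # E-step: posterior probability that each observation
        # comes from the first component.
        r = []
        for yi in y:
            a = p * math.exp(-0.5 * (yi - mu1) ** 2)
            b = (1 - p) * math.exp(-0.5 * (yi - mu2) ** 2)
            r.append(a / (a + b))
        # M-step: responsibility-weighted means.
        mu1 = sum(ri * yi for ri, yi in zip(r, y)) / sum(r)
        mu2 = (sum((1 - ri) * yi for ri, yi in zip(r, y))
               / sum(1 - ri for ri in r))
        trace.append(log_lik(y, mu1, mu2, p))
    return mu1, mu2, trace


# Simulated sample (hypothetical) from 0.3 N(0, 1) + 0.7 N(2.5, 1).
rng = random.Random(42)
y = [rng.gauss(0.0, 1.0) if rng.random() < 0.3 else rng.gauss(2.5, 1.0)
     for _ in range(300)]
starts = [(-1.0, 3.0), (3.0, -1.0), (1.0, 1.5), (4.0, 4.5)]
results = [em_means(y, m1, m2) for m1, m2 in starts]
for (m1, m2), (e1, e2, trace) in zip(starts, results):
    print(f"start=({m1}, {m2}) -> ({e1:.3f}, {e2:.3f}), "
          f"loglik={trace[-1]:.3f}")
```

Comparing the final log-likelihood values across starts is the usual (and imperfect) way of guarding against a poor local maximum.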

## Bangalore workshop [ಬೆಂಗಳೂರು ಕಾರ್ಯಾಗಾರ] and new book

Posted in Books, pictures, R, Statistics, Travel, University life on August 13, 2014 by xi'an

On the last day of the IFCAM workshop in Bangalore, Marc Lavielle from INRIA presented a talk on mixed effects where he illustrated his original computer language Monolix. And mentioned that his CRC Press book on Mixed Effects Models for the Population Approach was out! (Appropriately listed as out on a 14th of July on amazon!) He actually demonstrated the abilities of Monolix live and on diabetes data provided by an earlier speaker from Kolkata, which was a perfect way to initiate a collaboration! Nice cover (which is all I saw from the book at this stage!) that maybe will induce candidates to write a review for CHANCE. Estimation of those mixed effect models relies on stochastic EM algorithms developed by Marc Lavielle and Éric Moulines in the 90’s, as well as MCMC methods.

## speed of R, C, &tc.

Posted in R, Running, Statistics, University life on February 3, 2012 by xi'an

My Paris colleague (and fellow-runner) Aurélien Garivier has produced an interesting comparison of 4 (or 6 if you consider scilab and octave as different from matlab) computer languages in terms of speed for producing the MLE in a hidden Markov model, using EM and the Baum-Welch algorithms. His conclusions are that

• matlab is a lot faster than R and python, especially when vectorization is important: this is why the difference is spectacular on filtering/smoothing, not so much on the creation of the sample;
• octave is a good matlab emulator, if no special attention is paid to execution speed…;
• scilab appears as a credible, efficient alternative to matlab;
• still, C is a lot faster; the inefficiency of matlab in loops is well-known, and clearly shown in the creation of the sample.

(In this implementation, R is “only” three times slower than matlab, so this is not so damning…) All the codes are available and you are free to make suggestions to improve the speed of your favourite language!

## Mixtures in Madrid (2)

Posted in Statistics, Travel, University life on April 13, 2011 by xi'an

Today I gave my first lecture at Universidad Autonoma Madrid. Apart from a shaky start due to my new computer not recognising the videoprojector, I covered EM for mixtures in the one and a half hours of the course. I obviously finished the day with tapas in a nearby bar, vaguely watching Barcelona playing an improbable team to merge with the other patrons… In the second lecture, I hope to illustrate both EM and Gibbs on a simple mixture likelihood surface.

## Correlated Poissons

Posted in Statistics on March 2, 2011 by xi'an

A graduate student came to see me the other day with a bivariate Poisson distribution and a question about using EM in this framework. The problem boils down to adding one correlation parameter and an extra term in the likelihood

$(1-\rho)^{n_1}(1+\lambda\rho)^{n_2}(1+\mu\rho)^{n_3}(1-\lambda\mu\rho)^{n_4}\quad 0\le\rho\le\min(1,\frac{1}{\lambda\mu})$

Both terms involving sums are easy to deal with, using latent variables as in mixture models. The subtractions are trickier, as the negative parts cannot appear in a conditional distribution. Even though the problem can be handled by a direct numerical maximisation or by an almost standard Metropolis-within-Gibbs sampler, my suggestion regarding EM per se was to proceed by conditional EM, one parameter at a time. For instance, when considering $\rho$ conditional on both Poisson parameters, depending on whether $\lambda\mu>1$ or not, one can consider either

$(1-\theta/\lambda\mu)^{n_1}(1+\theta/\mu)^{n_2}(1+\theta/\lambda)^{n_3}(1-\theta)^{n_4}\quad0<\theta<1$

and turn

$(1-\theta/\lambda\mu) \text{ into } (1-\theta+\theta\{1-\frac{1}{\lambda\mu}\})$

thus producing a Beta-like target function in $\theta$ after completion, or turn

$(1-\lambda\mu\rho) \text{ into } (1-\rho+\{1-\lambda\mu\}\rho)$

to produce a Beta-like target function in $\rho$ after completion. In the end, this is a rather pedestrian exercise and I am still frustrated at missing the trick to handle the subtractions directly. It was nonetheless a nice question!
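Here is a minimal sketch of the conditional EM step in $\theta$ for the case $\lambda\mu>1$, with both Poisson parameters held fixed. The exponents `counts`, the values of `lam` and `mu` and the function names are hypothetical. Writing $(1-\theta/\lambda\mu)$ as $(1-\theta)+\theta(1-1/\lambda\mu)$ and expanding each sum as a binomial mixture, the E-step computes the expected number of factors picking their $\theta$ part, and the M-step maximises the resulting Beta-like function $\theta^{z_1+z_2+z_3}(1-\theta)^{n_1-z_1+n_4}$.

```python
import math


def target(theta, n, lam, mu):
    """Log of the theta-dependent term, for 0 < theta < 1 and lam * mu > 1."""
    n1, n2, n3, n4 = n
    return (n1 * math.log(1 - theta / (lam * mu))
            + n2 * math.log(1 + theta / mu)
            + n3 * math.log(1 + theta / lam)
            + n4 * math.log(1 - theta))


def em_theta(n, lam, mu, theta=0.5, n_iter=300):
    """EM in theta alone, with lam and mu held fixed (conditional EM step)."""
    n1, n2, n3, n4 = n
    c = 1 - 1 / (lam * mu)  # positive because lam * mu > 1
    trace = [target(theta, n, lam, mu)]
    for _ in range(n_iter):
        # E-step: expected number of factors contributing their theta part.
        z1 = n1 * theta * c / (1 - theta + theta * c)
        z2 = n2 * (theta / mu) / (1 + theta / mu)
        z3 = n3 * (theta / lam) / (1 + theta / lam)
        # M-step: mode of theta^(z1+z2+z3) * (1-theta)^(n1-z1+n4).
        theta = (z1 + z2 + z3) / (n1 + n4 + z2 + z3)
        trace.append(target(theta, n, lam, mu))
    return theta, trace


counts = (5, 8, 6, 4)  # hypothetical exponents n_1, ..., n_4
theta_hat, trace = em_theta(counts, lam=2.0, mu=1.5)
print(theta_hat)
```

The case $\lambda\mu\le 1$ would be handled in the same way, working directly in $\rho$ with the second decomposition, and the full conditional EM would alternate such steps with updates of $\lambda$ and $\mu$, which are not sketched here.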

## Mixture Estimation and Applications

Posted in Books, Mountains, Statistics, University life on November 24, 2010 by xi'an

We have now completed the edition of the book Mixture Estimation and Applications with Kerrie Mengersen and Mike Titterington, made of contributions from participants in the ICMS workshop on mixtures that took place in Edinburgh last March. The prospective table of contents is given in the full post.