At the MHC 2021 conference today (to which I biked to attend for real!, first time since BayesComp!) I listened to Christophe Biernacki exposing the dangers of EM applied to mixtures in the presence of missing data, namely that the algorithm has a rising probability to reach a degenerate solution, namely a single observation component. Rising in the proportion of missing data. This is not hugely surprising as there is a real (global) mode at this solution. If one observation components are prohibited, they should not be accepted in the EM update. Just as in Bayesian analyses with improper priors, the likelihood should bar single or double observations components… Which of course makes EM harder to implement. Or not?! MCEM, SEM and Gibbs are obviously straightforward to modify in this case.

Judith Rousseau also gave a fascinating talk on the properties of non-parametric mixtures, from a surprisingly light set of conditions for identifiability to posterior consistency . With an interesting use of several priors simultaneously that is a particular case of the cut models. Namely a correct joint distribution that cannot be a posterior, although this does not impact simulation issues. And a nice trick turning a hidden Markov chain into a fully finite hidden Markov chain as it is sufficient to recover a Bernstein von Mises asymptotic. If inefficient. Sylvain LeCorff presented a pseudo-marginal sequential sampler for smoothing, when the transition densities are replaced by unbiased estimators. With connection with approximate Bayesian computation smoothing. This proves harder than I first imagined because of the backward-sampling operations…

[Warning: Due to many CoI, from Nicolas being a former PhD student of mine, to his being a current colleague at CREST, to Omiros being co-deputy-editor for Biometrika, this review will not be part of my CHANCE book reviews.]

My friends Nicolas Chopin and Omiros Papaspiliopoulos wrote in 2020 An Introduction to Sequential Monte Carlo (Springer) that took several years to achieve and which I find remarkably coherent in its unified presentation. Particles filters and more broadly sequential Monte Carlo have expended considerably in the last 25 years and I find it difficult to keep track of the main advances given the expansive and heterogeneous literature. The book is also quite careful in its mathematical treatment of the concepts and, while the Feynman-Kac formalism is somewhat scary, it provides a careful introduction to the sampling techniques relating to state-space models and to their asymptotic validation. As an introduction it does not go to the same depths as Pierre Del Moral’s 2004 book or our 2005 book (Cappé et al.). But it also proposes a unified treatment of the most recent developments, including SMC² and ABC-SMC. There is even a chapter on sequential quasi-Monte Carlo, naturally connected to Mathieu Gerber’s and Nicolas Chopin’s 2015 Read Paper. Another significant feature is the articulation of the practical part around a massive Python package called particles [what else?!]. While the book is intended as a textbook, and has been used as such at ENSAE and in other places, there are only a few exercises per chapter and they are not necessarily manageable (as Exercise 7.1, the unique exercise for the very short Chapter 7.) The style is highly pedagogical, take for instance Chapter 10 on the various particle filters, with a detailed and separate analysis of the input, algorithm, and output of each of these. Examples are only strategically used when comparing methods or illustrating convergence. While the MCMC chapter (Chapter 15) is surprisingly small, it is actually an introducing of the massive chapter on particle MCMC (and a teaser for an incoming Papaspiloulos, Roberts and Tweedie, a slow-cooking dish that has now been baking for quite a while!).

At the Laplace demon’s seminar today (whose cool name I cannot tire of!), Nicolas Chopin gave a webinar with the above equally cool title. And the first slide debunking myths about SMC’s:

The second part of the talk is about a recent arXival Nicolas wrote with his student Hai-Dang DauI missed, about increasing the number of MCMC steps when moving the particles. Called waste-free SMC. Where only one fraction of the particles is updated, but this is enough to create a sort of independence from previous iterations of the SMC. (Hai-Dang Dau and Nicolas Chopin had to taylor their own convergence proof for this modification of the usual SMC. Producing a single-run assessment of the asymptotic variance.)

On the side, I heard about a very neat (if possibly toyish) example on estimating the number of Latin squares:

And the other item of information is that Nicolas’ and Omiros’ book, An Introduction to Sequential Monte Carlo, has now appeared! (Looking forward reading the parts I had not yet read.)

Seminars will be broadcast live on the fourth Thursday of the month from 1-2:15 pm Eastern time (18 GMT+2). Students will meet virtually with the speaker from 2:30-3:30 pm Eastern time. Talks in the fall will focus on Hidden Markov Models, starting on Thursday, September 24, 2020 with Ruth King of the University of Edinburgh.

There is a conference on mixtures (M) and hidden Markov models (H) and clustering (C) taking place in Orsay on June 17-19, next year. Registration is free if compulsory. With about twenty confirmed speakers. (Irrelevant as the following remark is, this is the opportunity to recall the conference on mixtures I organised in Aussois 25 years before! Which website is amazingly still alive at Duke, thanks to Mike West, my co-organiser along with Kathryn Roeder and Gilles Celeux. When checking the abstracts, I found only two presenters common to both conferences, Christophe Biernaki and Jiahua Chen. And alas several names of departed friends.)