In his plenary talk this morning, Arnaud Doucet discussed the application of pseudo-marginal techniques to the latent variable models he has been investigating for many years. And its limiting behaviour towards efficiency, with the idea of introducing correlation in the estimation of the likelihood ratio. Reducing complexity from O(T²) to O(T√T). With the very surprising conclusion that the correlation must go to 1 at a precise rate to get this reduction, since perfect correlation would induce a bias. A massive piece of work, indeed!
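
To make the idea of correlating the likelihood estimates concrete, here is a minimal sketch of a correlated pseudo-marginal Metropolis-Hastings sampler. It is a toy of my own and not the model or code from the talk: the latent model (x_t ~ N(θ,1), y_t | x_t ~ N(x_t,1)), the N(0,10²) prior, and the names `log_lik_hat`, `log_prior` and `correlated_pm` are all assumptions for illustration. The auxiliary normal variates `u` behind the likelihood estimate are refreshed by an autoregressive move with correlation `rho`, which is hand-picked here rather than set at the rate discussed in the talk.

```python
import numpy as np

def log_lik_hat(theta, u, y):
    """Log of an unbiased likelihood estimate built from auxiliary normals u (T x N)."""
    x = theta + u                                   # latent draws x_{t,i} ~ N(theta, 1)
    log_w = -0.5 * (y[:, None] - x) ** 2 - 0.5 * np.log(2 * np.pi)
    m = log_w.max(axis=1, keepdims=True)            # log-mean-exp for numerical stability
    return float(np.sum(m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))))

def log_prior(theta):
    """Assumed N(0, 10^2) prior on theta, up to an additive constant."""
    return -0.5 * theta ** 2 / 100.0

def correlated_pm(y, n_iter, n_particles, rho, step, rng):
    """Random-walk MH on theta with correlated refreshment of the auxiliary variates."""
    theta = 0.0
    u = rng.standard_normal((y.shape[0], n_particles))
    ll = log_lik_hat(theta, u, y)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        theta_prop = theta + step * rng.standard_normal()
        # Autoregressive move, reversible with respect to N(0, I):
        # rho = 0 gives back the standard pseudo-marginal sampler.
        u_prop = rho * u + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(u.shape)
        ll_prop = log_lik_hat(theta_prop, u_prop, y)
        log_ratio = ll_prop - ll + log_prior(theta_prop) - log_prior(theta)
        if np.log(rng.uniform()) < log_ratio:
            theta, u, ll = theta_prop, u_prop, ll_prop
        chain[i] = theta
    return chain
```

Since the proposed and current estimates share most of their randomness when `rho` is close to 1, the noise largely cancels in the ratio, which is the intuition behind using fewer simulations per iteration.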

The next session of the morning was another instance of conflicting talks and I hopped from one room to the next to listen to Hani Doss’s empirical Bayes estimation with intractable constants (where maybe SAME could be of interest), Youssef Marzouk’s transport maps for MCMC, which sounds like an attractive idea provided the construction of the map remains manageable, and Paul Russel’s adaptive importance sampling that somehow sounded connected with our population Monte Carlo approach. (With the additional step of considering transform maps.)

An interesting item of information I got from the final announcements at MCqMC 2016 just before heading to Monash, Melbourne, is that MCqMC 2018 will take place in the city of Rennes, Brittany, on July 2-6. Not only is it a nice location on its own, but it is most conveniently located in space and time to attend ISBA 2018 in Edinburgh the week after! Just moving from one Celtic city to another Celtic city. Along with other planned satellite workshops, this occurrence should make ISBA 2018 more attractive [if need be!] for participants from overseas.

Following [time-wise] the AISTATS 2016 meeting, a machine learning school is organised in Cádiz (as is the tradition for AISTATS meetings in Europe, i.e., in even years). With an impressive [if downright scary] poster! There is no strong statistics component in the programme, apart from a course by Tamara Broderick on non-parametric Bayes, but the list of speakers is impressive and the ten-day school is worth recommending to all interested students. (I remember giving a short course at MLSS 2004 on Berder Island in Brittany, with the immediate reward of running the Auray-Vannes half-marathon that year…) The deadline for applications is March 25, 2016.

A couple of questions on X validated showed the difficulty students have with mixed measures and their density. Actually, my students always react with incredulity to the likelihood of a censored normal sample or to the derivation of a Bayes factor associated with the null (and atomic) hypothesis μ=0…

I attribute this difficulty to a poor understanding of the notion of density and hence to a deficiency in the training in measure theory, since the density f of the distribution F is always relative to a reference measure dμ, i.e.

f(x) = dF/dμ(x)

(Hence Lebesgue’s moustache on the attached poster!) Handling atoms in the distribution requires introducing a dominating measure dμ with atomic components, i.e., usually a sum of the Lebesgue measure and of the counting measure on the appropriate set. Which is not absolutely obvious: while the first question had {0,1} as atoms, the second question introduced atoms on {-θ,θ} and required a change of variable to consider a counting measure on {-1,1}. I found this second question actually of genuine interest and a great toy example for class and exams.
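
As an illustration of a density against such a mixed dominating measure, here is a small sketch of the log-likelihood of a censored normal sample. It is my own toy setting, not taken from either question: X ~ N(μ,1) is observed as Y = max(X, c), so the distribution of Y has an atom at c with mass Φ(c−μ) and a Lebesgue density φ(y−μ) on (c,∞). The function names and the unit variance are assumptions.

```python
import math

def norm_logpdf(x, mu):
    """Log-density of N(mu, 1) with respect to the Lebesgue measure."""
    return -0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi)

def norm_cdf(z):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_loglik(mu, ys, c=0.0):
    """Log-likelihood of Y = max(X, c) against (Lebesgue + Dirac mass at c)."""
    total = 0.0
    for y in ys:
        if y == c:
            total += math.log(norm_cdf(c - mu))   # atomic part: P(X <= c)
        else:
            total += norm_logpdf(y, mu)           # continuous part on (c, infinity)
    return total
```

Each observation contributes either a probability (at the atom) or a Lebesgue density value (above it), and the two can be multiplied within one likelihood only because both are densities against the same dominating measure.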

As I have always been curious about my ancestry, I took a DNA test with 23andMe. While the company no longer provides statistics about potential medical conditions because of a lawsuit, it does return an ancestry analysis of sorts. In my case, my major ancestry composition is Anglo-Irish! (with 39% of my DNA) and northern European (with 32%), while only 19% is Franco-German… In retrospect, not so much of a surprise, not because of my well-known Anglophilia but given that my (known, i.e., at least for the direct ancestral branches) family roots are in Normandy (whose duke invaded Britain in 1066) and Brittany (which was invaded by British Celts fleeing Anglo-Saxons in the 400’s). What’s maybe more surprising to me is that the database contained 23 people identified as 4th degree cousins and a total of 652 relatives… While the potential number of my 4th degree cousins stands in the 10,000’s, and hence there may indeed be a few ending up as 23andMe (mostly American) customers, I am indeed surprised that a .37% coincidence in our genes qualifies for being 4th degree cousins! But given that I only share 3.1% with my great³-grandfather, it actually makes sense that I share about .1% to .4% with such remote cousins. However I wonder at the precision of such an allocation: could those cousins be even more remotely related? Not related at all? [Warning: All the links to 23andMe in this post are part of their referral program.]
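
For the record, the arithmetic behind these percentages can be sketched as follows, under the textbook assumption that each generation halves the expected shared fraction (actual sharing varies around these averages, and the function names are mine). An ancestor five generations up accounts for 1/32, about 3.1%, and full 4th cousins, who share a couple of such ancestors, are expected to share 1/512, about 0.2%.

```python
def expected_share_with_ancestor(generations_up):
    """Expected fraction of autosomal DNA shared with a direct ancestor."""
    return 0.5 ** generations_up

def expected_share_full_cousins(degree):
    """Expected fraction shared by full n-th cousins.

    The two common ancestors sit (degree + 1) generations up, so each
    contributes 0.5 ** (2 * (degree + 1)), and there are two of them.
    """
    return 2 * 0.5 ** (2 * (degree + 1))

ancestor_share = expected_share_with_ancestor(5)   # 1/32
cousin_share = expected_share_full_cousins(4)      # 1/512
```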


This (early) summer, a conference on missing data will be organised in Rennes, Brittany, with the support of the French Statistical Society [SFDS]. (Check the website if interested: Rennes is a mere two hours from Paris by fast train.)

AISTATS 2014 / MLSS tutorial

Here are the slides of the tutorial on ABC methods I gave yesterday at both AISTATS 2014 and MLSS. (I actually gave a tutorial at another MLSS a few years ago, on the pretty island of Berder in Brittany, next to Vannes.) They are definitely similar to previous talks and tutorials I delivered on this topic of ABC algorithms, with only the last part being original (if as yet unpublished). And even then: as Michael Gutmann from the University of Helsinki pointed out to me at the end of my talk, there are similarities between the classification method he presented at MCMSki 4 in Chamonix and our use of random forests. Before my talk, I attended the tutorial of Roderick Murray-Smith from the University of Glasgow, on Machine learning and Human Computer Interaction, which was just stunning in its breadth, range of applications, and mastery of multimedia tools. Making me feel like a perfectly inadequate follower…