## ABC model choice not to be trusted [3]

Posted in R, Statistics on January 31, 2011 by xi'an

On Friday, I received a nice but embarrassing email from Xavier Didelot. He reminded me that I had attended the talk he gave at the model choice workshop in Warwick last May, which, unfortunately but rather unsurprisingly given my short memory span, I had forgotten about! Looking at the slides he attached to his email, I do remember attending the talk and expecting to get back to the results after the meeting. As I went from Warwick to Paris only to leave a day later for Benidorm and the Valencia 9 meeting, in such a hurry that I even forgot my current black notebook, the plans of getting back to the talk were forgotten so completely that even reading the tech report (which has now appeared in Bayesian Analysis) could not revive them!

## R exam

Posted in R, Statistics, University life on January 30, 2011 by xi'an

I spent most of my Saturday perusing R code to check the answers written by my students to the R exam I gave two weeks ago… The outcome is mostly poor, even though some managed to solve a fair part of the long problem. Except for the few hopeless cases who visibly never wrote a single line of R code before the exam, all students have mastered the basics of R programming and graphics, if not of Monte Carlo approximations or of bootstrapping. One of the problems involved the distribution of a disk area and I found that half of the [third year math!] students do not know the $\pi R^2$ formula! Although I had repeatedly told them about the good training in trying to solve Le Monde puzzles (as well as checking my posts about them), only one student found the solution to puzzle #49.

## Marie Curie [2]

Posted in pictures on January 30, 2011 by xi'an

As a coincidence, following the previous post about Marie Curie grants, the French postal services launched this weekend a new stamp in honour of both Marie Curie, who got her (second) Nobel Prize in Chemistry 100 years ago (the first one was in Physics in 1903), and the International Year of Chemistry. Since Pierre and Marie Curie lived in my home town of Sceaux, the stamp was first issued here and I got a postcard with the new stamp (as well as an older stamp of 1967 commemorating Marie Curie’s 100th birthday). Another unfortunate coincidence about the Curies is that Pierre Curie got run over at place Dauphine, where the Université Paris-Dauphine now stands…

## Art brut

Posted in pictures, Travel on January 29, 2011 by xi'an

## Dauphine mathematician in Tunisian government!

Posted in University life on January 29, 2011 by xi'an

Our colleague at Dauphine, Elyès Jouini (who is also the vice-president for research), joined the new Tunisian government as the Minister for economic reforms (Ministre auprès du premier ministre chargé des réformes économiques et sociales et de la coordination avec les ministères concernés). A praiseworthy if challenging move! All the best, Elyès!

## ABC model choice not to be trusted [2]

Posted in R, Statistics on January 28, 2011 by xi'an

As we were completing our arXiv summary about ABC model choice, we were helpfully pointed to a recent CRiSM tech. report by X. Didelot, R. Everitt, A. Johansen and D. Lawson on Likelihood-free estimation of model evidence. This paper is closely related to our study of the performance of the ABC approximation to the Bayes factor, deriving in particular the limiting behaviour for the ratio,

$B_{12}(x) = \dfrac{g_1(x)}{g_2(x)}\,B^S_{12}(x).$

However, Didelot et al. reach the opposite conclusion from ours, namely that the problem can be solved by a sufficiency argument. Their point is that, when comparing models within exponential families (which is the natural realm for sufficient statistics), it is always possible to build an encompassing model with a sufficient statistic that remains sufficient across models. This construction of Didelot et al. is correct from a mathematical perspective, as seen for instance in the Poisson versus geometric example we first mentioned in Grelaud et al. (2009): adding

$\prod_{i=1}^n x_i!$

to the sum of the observables into a large sufficient statistic produces a ratio $g_1/g_2$ that is equal to 1.
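The Poisson versus geometric claim can be checked numerically on a toy sample. The sketch below is only an illustration, not the computation from either paper: it takes $g_i(x)$ to be the density of the sample given the statistic under model $i$, i.e. $f_i(x|\theta)/f_i^S(S(x)|\theta)$, and obtains the denominator by brute-force enumeration of all samples sharing the same statistic. The function names, the sample `x_obs` and the parameter values are hypothetical choices; the geometric distribution is assumed to live on {0, 1, 2, ...}.

```python
from itertools import product
from math import exp, factorial, prod


def poisson_lik(x, lam):
    """Joint Poisson(lam) likelihood of the sample x."""
    return prod(exp(-lam) * lam**xi / factorial(xi) for xi in x)


def geometric_lik(x, p):
    """Joint geometric likelihood on {0, 1, 2, ...}: P(X = k) = p * (1 - p)**k."""
    return prod(p * (1 - p)**xi for xi in x)


def g_ratio(x, stat, lam=1.3, p=0.4):
    """Return g_1(x) / g_2(x), with g_i(x) = f_i(x) / P_i(stat = stat(x)).

    `stat` must determine the sum of the sample, so that every sample with
    the same statistic has all its entries in 0..sum(x).
    """
    n, s = len(x), sum(x)
    target = stat(x)
    # Brute-force list of all samples of size n sharing the observed statistic.
    same_stat = [y for y in product(range(s + 1), repeat=n) if stat(y) == target]
    g1 = poisson_lik(x, lam) / sum(poisson_lik(y, lam) for y in same_stat)
    g2 = geometric_lik(x, p) / sum(geometric_lik(y, p) for y in same_stat)
    return g1 / g2


def stat_sum(x):
    """Sum of the observables, sufficient within each model separately."""
    return sum(x)


def stat_sum_and_factorials(x):
    """Sum of the observables augmented with the product of the factorials."""
    return (sum(x), prod(factorial(xi) for xi in x))


x_obs = (0, 1, 3)  # hypothetical toy sample
ratio_sum = g_ratio(x_obs, stat_sum)
ratio_big = g_ratio(x_obs, stat_sum_and_factorials)
print(ratio_sum, ratio_big)
```

With the sum alone, the ratio differs from 1 and does not depend on the parameter values plugged in (as expected from sufficiency within each model), while the augmented statistic brings it back to 1 up to floating-point error.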

Nonetheless, we do not think this encompassing property has a direct impact on the performance of ABC model choice. In practice, complex models do not enjoy sufficient statistics (if only because the overwhelming majority of them are not exponential families, with the notable exception of Gibbs random fields where the above agreement graph is derived). There is therefore a strict loss of information in using ABC model choice, due to the reliance on both insufficient statistics and non-zero tolerances. Looking at what happens in the limiting case when one is relying on a common sufficient statistic is a formal study that sheds light on the potentially huge discrepancy between the ABC-based Bayes factor and the true Bayes factor. This is why we consider that finding a solution in this formal case, while a valuable extension of the Gibbs random fields case, does not directly help towards the understanding of the discrepancy found in non-exponential complex models.

## ABC model choice not to be trusted

Posted in Mountains, R, Statistics, University life on January 27, 2011 by xi'an

This may sound like a paradoxical title given my recent production in this area of ABC approximations, especially after the disputes with Alan Templeton, but I have come to the conclusion that ABC approximations to the Bayes factor are not to be trusted. When working one afternoon in Park City with Jean-Michel and Natesh Pillai (drinking tea in front of a fake log-fire!), we looked at the limiting behaviour of the Bayes factor constructed by an ABC algorithm, i.e. by approximating posterior probabilities for the models from the frequencies of acceptances of simulations from those models (assuming the use of a common summary statistic to define the distance to the observations). Rather obviously (a posteriori!), we ended up with the true Bayes factor based on the distributions of the summary statistics under both models!
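For readers who want to see the acceptance-frequency construction in action, here is a minimal sketch of ABC model choice between a Poisson and a geometric model, using the sum of the observations as the common summary statistic and a zero tolerance. Everything specific in it is an assumption made for illustration and not taken from the post: the Exp(1) prior on the Poisson mean, the uniform prior on the geometric probability, equal prior weights on the two models, the toy sample, and the helper names `rpois`, `rgeom` and `abc_model_choice`.

```python
import math
import random


def rpois(lam, rng):
    """Poisson(lam) draw by Knuth's multiplication method (fine for small lam)."""
    limit, k, prod_u = math.exp(-lam), 0, rng.random()
    while prod_u > limit:
        k += 1
        prod_u *= rng.random()
    return k


def rgeom(p, rng):
    """Geometric draw on {0, 1, 2, ...} by inversion, P(X = k) = p * (1 - p)**k."""
    if p >= 1.0:
        return 0
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.floor(math.log(u) / math.log1p(-p)))


def abc_model_choice(x_obs, n_sims, rng):
    """Estimate posterior model probabilities from ABC acceptance frequencies.

    Model 0 is Poisson with an (assumed) Exp(1) prior on its mean, model 1 is
    geometric with an (assumed) uniform prior on p. A simulation is accepted
    when its sum matches the observed sum exactly (zero tolerance).
    """
    n, s_obs = len(x_obs), sum(x_obs)
    accepted = [0, 0]
    for _ in range(n_sims):
        m = rng.randrange(2)  # equal prior weights on both models
        if m == 0:
            lam = rng.expovariate(1.0)
            y = [rpois(lam, rng) for _ in range(n)]
        else:
            p = 1.0 - rng.random()  # uniform on (0, 1]
            y = [rgeom(p, rng) for _ in range(n)]
        if sum(y) == s_obs:
            accepted[m] += 1
    total = sum(accepted)
    if total == 0:
        raise ValueError("no simulation accepted; increase n_sims")
    return [a / total for a in accepted]


rng = random.Random(2011)
probs = abc_model_choice((0, 1, 3), 20000, rng)
print(probs, probs[0] / probs[1])  # frequencies and the implied ABC Bayes factor
```

Since acceptance only looks at the sum, the frequencies can only target the posterior probabilities given that sum, which is precisely the limiting behaviour discussed in this post.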