## ABC model choice not to be trusted

Posted in Mountains, R, Statistics, University life on January 27, 2011 by xi'an

This may sound like a paradoxical title given my recent production in this area of ABC approximations, especially after the disputes with Alan Templeton, but I have come to the conclusion that ABC approximations to the Bayes factor are not to be trusted. When working one afternoon in Park City with Jean-Michel and Natesh Pillai (drinking tea in front of a fake log-fire!), we looked at the limiting behaviour of the Bayes factor constructed by an ABC algorithm, i.e., by approximating the posterior probabilities of the models with the acceptance frequencies of simulations from those models (assuming a common summary statistic is used to define the distance to the observations). Rather obviously (a posteriori!), we ended up with the true Bayes factor based on the distributions of the summary statistics under both models!
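The acceptance-frequency construction described above can be sketched in a few lines of Python. This is only a minimal illustration under assumptions that are not in the post: a Poisson-versus-geometric comparison, the sum of the observations as common summary statistic, an exponential prior on the Poisson mean, a uniform prior on the geometric success probability, and equal prior model weights. The function names (`poisson_sum`, `geometric_sum`, `abc_model_choice`) are hypothetical.

```python
import math
import random


def poisson_sum(rng, lam, n):
    """Sum of n i.i.d. Poisson(lam) draws, simulated as one Poisson(n * lam) draw (Knuth's method)."""
    limit = math.exp(-n * lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def geometric_sum(rng, p, n):
    """Sum of n i.i.d. geometric draws (number of failures before the first success)."""
    if p >= 1.0:
        return 0
    total = 0
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        total += int(math.log(u) / math.log1p(-p))
    return total


def abc_model_choice(s_obs, models, n_sims, tol, rng):
    """Approximate posterior model probabilities by ABC acceptance frequencies.

    `models` is a list of (draw_prior, simulate_summary) pairs; the model index
    is drawn uniformly, which encodes equal prior model probabilities.
    """
    accepted = [0] * len(models)
    for _ in range(n_sims):
        m = rng.randrange(len(models))
        draw_prior, simulate_summary = models[m]
        theta = draw_prior(rng)
        s_sim = simulate_summary(rng, theta)
        if abs(s_sim - s_obs) <= tol:  # same summary and distance for all models
            accepted[m] += 1
    total = sum(accepted)
    if total == 0:
        raise ValueError("no simulation accepted; increase tol or n_sims")
    return [a / total for a in accepted]


# Illustrative setting (assumed): 5 observations whose sum is 5.
n_obs = 5
models = [
    (lambda rng: rng.expovariate(1.0), lambda rng, lam: poisson_sum(rng, lam, n_obs)),
    (lambda rng: 1.0 - rng.random(), lambda rng, p: geometric_sum(rng, p, n_obs)),
]
probs = abc_model_choice(s_obs=5, models=models, n_sims=20000, tol=0, rng=random.Random(1))
print(probs)
```

Whatever the tolerance, the frequencies returned here only see the data through the summary statistic, so their ratio targets a Bayes factor for the summary statistic, not necessarily the one for the full data.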

## Another ABC paper

Posted in Statistics on July 24, 2010 by xi'an

“One aim is to extend the approach of Sisson et al. (2007) to provide an algorithm that is robust to implement.”

C.C. Drovandi & A.N. Pettitt

A paper by Drovandi and Pettitt appeared in the Early View section of Biometrics. It uses a combination of particles and MCMC moves to adapt to the true target, with an acceptance probability

$\min\left\{1,\dfrac{\pi(\theta^*)q(\theta_c|\theta^*)}{\pi(\theta_c)q(\theta^*|\theta_c)}\right\}$

where $\theta^*$ is the proposed value and $\theta_c$ is the current value (picked at random from the particle population), while $q$ is a proposal kernel used to simulate the proposed value. The algorithm is adaptive in that the previous population of particles is used to choose the proposal $q$, as well as the tolerance level $\epsilon_t$. Although the method is valid as a particle system applied in the ABC setting, I have difficulty gauging the level of novelty of the method (then applied to a model of Riley et al., 2003, J. Theoretical Biology). Learning from previous particle populations to build a better kernel $q$ is indeed a constant feature in SMC methods, from Sisson et al.’s ABC-PRC (2007) to Beaumont et al.’s ABC-PMC (2009). (Note that Drovandi and Pettitt mistakenly believe the ABC-PRC method to include partial rejection control, as argued in this earlier post.) The paper also advances the idea of adapting the tolerance on-line as an $\alpha$ quantile of the previous particle population, but this is the same idea as in Del Moral et al.’s ABC-SMC. The only strong methodological difference, as far as I can tell, is that the MCMC steps are repeated “numerous times” in the current paper, instead of once as in the earlier papers. This however partly cancels the appeal of an $O(N)$ order method versus the $O(N^2)$ order PMC and SMC methods. An interesting remark made in the paper is that more advances are needed in cases when simulating the pseudo-observations is highly costly, as in Ising models. However, replacing exact simulation [as we did in the model choice paper] with a Gibbs sampler cannot be that detrimental.
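To fix ideas, the two ingredients discussed above (a Metropolis-Hastings move inside an ABC particle system, and a tolerance set as an $\alpha$ quantile of the previous population) can be sketched as follows. This is a generic illustration and not the authors' implementation: the function names, the convention that `log_q(x, given)` returns $\log q(x|\text{given})$, the choice of the distances as the quantity whose quantile is taken, and the empirical quantile rule are all assumptions. The acceptance ratio is the usual Metropolis-Hastings one, with the prior at the current value in the denominator.

```python
import math
import random


def mh_ratio(log_prior, log_q, theta_star, theta_c):
    """min{1, pi(t*) q(t_c | t*) / (pi(t_c) q(t* | t_c))}, computed on the log scale.

    Assumes theta_c lies in the support of the prior.
    """
    log_r = (log_prior(theta_star) + log_q(theta_c, theta_star)
             - log_prior(theta_c) - log_q(theta_star, theta_c))
    return math.exp(min(0.0, log_r))


def abc_mcmc_move(theta_c, dist_c, eps, log_prior, log_q, propose,
                  simulate_distance, rng):
    """One ABC-MCMC move for a single particle.

    The proposal is kept only if its pseudo-data fall within the tolerance
    eps AND the Metropolis-Hastings test succeeds; otherwise the particle
    stays at its current value.
    """
    theta_star = propose(rng, theta_c)
    dist_star = simulate_distance(rng, theta_star)
    if dist_star <= eps and rng.random() < mh_ratio(log_prior, log_q,
                                                    theta_star, theta_c):
        return theta_star, dist_star
    return theta_c, dist_c


def next_tolerance(distances, alpha):
    """Empirical alpha quantile of the previous population's distances."""
    ordered = sorted(distances)
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[k]
```

Repeating `abc_mcmc_move` several times per particle, as the paper does, multiplies the number of pseudo-data simulations accordingly, which is where the extra cost mentioned above comes from.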

## ABC model choice

Posted in Statistics on January 24, 2010 by xi'an

I was re-reading the recently arXived paper by Toni and Stumpf on ABC-based model choice and, besides noticing that their Gibbs random field example (§3.2) is the same as ours (§3.1), down to the prior choice, this led me to wonder about the choice of the ABC distance in those settings. On the one hand, the statistical perspective is to compare the predictive performances of different models and hence use the same distance for all models. On the other hand, the ABC perspective implies using different summary statistics for different models, hence using different distances… In a “true” model there is no issue because we end up comparing (marginal) likelihoods, but in an approximation like ABC, where we first replace the data with a summary statistic and then replace the distribution of that summary statistic with an indicator of proximity, we end up with paradoxes like this one, where we compare pseudo-distributions of objects of different dimensions. (Toni and Stumpf made the choice in their paper of pulling all summary statistics together into an overall distance, while in ours we had the special Gibbs property of a sufficient statistic across models…)
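To make the contrast explicit, here is a sketch in notation that is mine and not taken from either paper: $y$ denotes the data, $z$ the pseudo-data, $\eta_i$ the summary statistic attached to model $i$, $\rho_i$ the corresponding distance, $\epsilon$ the tolerance, and $m_i$ the marginal likelihood of model $i$. The exact Bayes factor compares two densities evaluated at the same point $y$,

$B_{12}(y)=\dfrac{m_1(y)}{m_2(y)}=\dfrac{\int\pi_1(\theta_1)f_1(y|\theta_1)\,\text{d}\theta_1}{\int\pi_2(\theta_2)f_2(y|\theta_2)\,\text{d}\theta_2}$

while, under equal prior model weights, the ratio of ABC acceptance frequencies estimates

$\dfrac{P_1\left(\rho_1\{\eta_1(z),\eta_1(y)\}\le\epsilon\right)}{P_2\left(\rho_2\{\eta_2(z),\eta_2(y)\}\le\epsilon\right)}$

where $P_i$ is the prior predictive distribution of $z$ under model $i$. When $\eta_1$ and $\eta_2$ differ, possibly in dimension, the numerator and the denominator are probabilities of events defined on different spaces, which is the source of the paradox; with a common $\eta$ and $\rho$ the two events at least coincide.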