## Semi-automatic ABC

Last Thursday Paul Fearnhead and Dennis Prangle posted on arXiv a paper proposing an original approach to ABC. I read it rather quickly, so I may have missed some points in the paper, but my overall feeling is one of proximity to Richard Wilkinson's exact ABC on an approximate target. The interesting input of the paper is that ABC is considered from a purely inferential viewpoint and calibrated for estimation purposes.

Indeed, Fearnhead and Prangle do not follow the “traditional” perspective of looking at ABC as a converging approximation to the true posterior density. Like Richard Wilkinson, they instead take a randomised/noisy version of the summary statistics and derive a calibrated version of ABC, i.e. an algorithm that gives proper predictions, the jinx being that they are proper for the posterior given this randomised version of the summary statistics. This is therefore a tautological argument of sorts that I will call tautology #1. The interesting aspect of this switch of perspective is that the kernel K used in the acceptance probability

$\displaystyle{ K((s-s_\text{obs})/h)}$

does not have to act as an estimate of the true sampling density, since it appears in the (randomised) pseudo-model. (Everything collapses to the true model when the bandwidth h goes to zero.) The Monte Carlo error is taken into account through the average acceptance probability, which collapses to zero when h goes to zero, making this limiting choice suboptimal!
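To fix ideas, here is a minimal rejection-ABC sketch with a smooth acceptance kernel K((s-s_obs)/h), on a toy normal model of my own choosing (none of the names, constants, or the model come from the paper): as h shrinks, the sample concentrates on the true posterior but the acceptance rate collapses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (not from the paper): theta ~ N(0,1), y | theta ~ N(theta,1), n obs
n = 20
y_obs = rng.normal(1.0, 1.0, n)
s_obs = y_obs.mean()                      # summary statistic

def K(u):
    """Gaussian kernel, scaled so K(0)=1; any density shape works here."""
    return np.exp(-0.5 * u**2)

def abc_kernel(h, n_sims=50_000):
    """ABC sample: accept a prior draw with probability K((s - s_obs)/h)."""
    theta = rng.normal(0.0, 1.0, n_sims)              # prior draws
    s = rng.normal(theta, 1.0 / np.sqrt(n))           # sampling dist. of the mean
    accept = rng.random(n_sims) < K((s - s_obs) / h)  # smooth acceptance
    return theta[accept]

for h in (1.0, 0.3, 0.1):
    post = abc_kernel(h)
    print(f"h={h:.1f}  acc. rate={post.size / 50_000:.3f}  "
          f"post. mean={post.mean():.3f}")
```

The printout illustrates the trade-off discussed above: the smaller h, the closer the ABC posterior mean gets to the exact posterior mean, at the price of a vanishing acceptance rate.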

What I would call tautology #2 stems from the comparison of ABC posteriors via a loss function

$(\theta_0-\hat\theta)^\text{T} A (\theta_0-\hat\theta)$

that ends up with the “best” asymptotic summary statistic being

$\mathbb{E}[\theta|y_\text{obs}].$

This follows from the choice of the loss function rather than from an intrinsic criterion… Now, using the posterior expectation as the summary statistic does make sense! Especially when the calibration constraint implies that the ABC approximation has the same posterior mean as the true (randomised) posterior. Unfortunately it is parameterisation dependent and unlikely to be available in settings where ABC is necessary. In the semi-automatic implementation, the authors suggest using a pilot run of ABC to approximate the above statistic. I wonder at the cost, since a simulation experiment must be repeated for each simulated dataset (or sufficient statistic). The simplification in the paper follows from a linear regression on the parameters, thus linking the approach with Beaumont, Zhang and Balding (2002, Genetics).
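The linear-regression simplification can be sketched as follows, again on a toy normal model of my own making (the feature choice and all names are my assumptions, not the paper's): simulate (θ, y) pairs from the prior and the model, regress θ on a few data features, and use the fitted value as an approximation to E[θ|y], i.e. as the new summary statistic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model (my choice, not the paper's): theta ~ N(0,1), y | theta ~ N(theta,1)
n = 20
y_obs = rng.normal(1.0, 1.0, n)

# Pilot run: simulate (theta_i, y_i) pairs from prior x model
m = 10_000
theta = rng.normal(0.0, 1.0, m)
y = rng.normal(theta[:, None], 1.0, (m, n))

# Candidate features f(y); here intercept, sample mean, and sample variance
F = np.column_stack([np.ones(m), y.mean(axis=1), y.var(axis=1)])
beta, *_ = np.linalg.lstsq(F, theta, rcond=None)

def summary(data):
    """Fitted value beta^T f(y), approximating E[theta | y]."""
    f = np.array([1.0, data.mean(), data.var()])
    return f @ beta

s_obs = summary(y_obs)
print("semi-automatic summary at y_obs:", s_obs)
```

A second ABC round would then compare `summary(simulated data)` against `s_obs`. In this conjugate toy example the regression essentially recovers the shrinkage weight n/(n+1) on the sample mean, which is the exact posterior expectation.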

Using the same evaluation via a posterior loss, the authors show that the “optimal” kernel is uniform over a region

$x^\text{T} A x < c$

where c is chosen so that this ellipsoid has volume one. A significant remark is that the error evaluated by Fearnhead and Prangle is

$\text{tr}(A\Sigma) + h^2 \mathbb{E}_K[x^\text{T}Ax] + \dfrac{C_0}{h^d}$

which means that, due to the Monte Carlo error, the “optimal” value of h is not zero but akin to a non-parametric optimal rate in 2/(2+d). There should thus be a way to link this decision-theoretic approach with that of Ratmann et al., since the latter take h to be part of the parameter vector.
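The non-zero optimum is immediate from the error decomposition above: dropping the constant tr(AΣ) term and writing the h-dependent part as b·h² + C₀/h^d (with placeholder constants b and C₀ of my own choosing), differentiation gives h* = (d·C₀/(2b))^{1/(d+2)}, which a quick numerical check confirms.

```python
import numpy as np

# h-dependent part of the error bound, with placeholder constants
b, C0, d = 1.0, 1.0, 3

def g(h):
    return b * h**2 + C0 / h**d

# Analytic minimiser from g'(h) = 2*b*h - d*C0*h**(-d-1) = 0
h_star = (d * C0 / (2 * b)) ** (1.0 / (d + 2))

# Numerical check on a fine grid
hs = np.linspace(0.1, 3.0, 100_000)
h_num = hs[np.argmin(g(hs))]
print(h_star, h_num)  # the two minimisers should agree closely
```

Since h* > 0 for any positive constants, letting h go to zero is never optimal once the Monte Carlo cost C₀/h^d is accounted for.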

### 9 Responses to “Semi-automatic ABC”

1. I read this paper recently and enjoyed it. As you noted, the optimal summary statistics are the (Bayesian) estimators of the parameters. I am aware that you know the data cloning method suggested by Subhash Lele. Now, my question is: can we use the data cloning method to obtain ML estimators of the parameters and then use them as summary statistics?

• The SAME algorithm (also known as prior feedback, data cloning, MCMC maximum likelihood, multiple imputation Metropolis EM, &tc.) can be used to represent the MLE as the limit of a sequence of Bayes estimators against replicas of the data. Now, to compute those Bayes estimators w/o the likelihood function is a bit of getting oneself in a fine pickle, isn’t it?! I simply do not see how you can implement the first step of the suggestion: computing a Bayes estimator for a k-replicate of the original data. If this is feasible, the whole Bayesian analysis of the model is feasible and the ABC shortcut is then superfluous. Please tell me which part I am missing.

• Ahh, yes. You’re right. I was mistaken in my thinking. Thanks for your answer. I was thinking that I could contribute to the discussion by noting the use of data cloning as an alternative approach to obtain summary statistics. You brought me round!

• In a formal way, the MLE or the Bayes [posterior expectation] estimator could be the “best” summary statistic, were it available. This is one of the core ideas of the paper, actually. Paul Fearnhead and Dennis Prangle use an ABC proxy to the genuine Bayes estimator as a new summary for a second ABC round.

2. […] a revised version of their semi-automatic ABC paper. Compared with the earlier version commented on that post, the paper makes a better case for the ABC algorithm, when considered there from a purely […]

3. Dennis Prangle Says: