## Bayesian indirect inference

The paper with the above title, by Chris Drovandi and Tony Pettitt, was presented by Chris Drovandi at a seminar in Warwick last week (when I was not there yet), and the talk made me aware of it. It is mostly a review of existing works on ABC and indirect inference, which was already considered (as an alternative) in Fearnhead’s and Prangle’s Read Paper, with simulations to illustrate the differences. In particular, it seems to draw from the recent preprint by Gleim and Pigorsch (a preprint that I need to read now!), which proposes a classification of indirect inference versions of ABC.

Indirect inference and its connections with ABC have been on my radar for quite a while, even though I never went further than thinking about it, as it was developed by colleagues (and former teachers) at CREST, Christian Gouriéroux, Alain Monfort, and Eric Renault in the early 1990s. Since it relies on a somewhat arbitrary auxiliary model, indirect inference is rather delicate to incorporate within a Bayesian framework. In an ABC setting, indirect inference provides summary statistics (as estimators or scores) and possibly a distance. In their comparison, Drovandi and Pettitt analyse the impact of increasing the pseudo sample size in the simulated data. Rather unsurprisingly, the performance of ABC when comparing true data of size n with synthetic data of size m>n is not great. However, there exists another way of reducing the variance in the synthetic data, namely by repeating simulations of samples of size n and averaging the indicators for proximity, resulting in a frequency rather than a 0-1 estimator. See e.g. Del Moral et al. (2009). In this sense, increasing the computing power reduces the variability of the ABC approximation. (And I thus fail to see the full relevance of Result 1.)
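To make the frequency estimator concrete, here is a minimal sketch under loudly stated assumptions: a toy N(θ, 1) generating model, a uniform prior on (-5, 5), and the sample mean as the indirect-inference summary (it is the MLE of the mean of a Gaussian auxiliary model). None of these choices, nor the function names, come from the paper; they only illustrate replacing the 0-1 acceptance indicator with the fraction of replicated datasets of size n that fall within a tolerance of the observed summary.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Draw n observations from the toy model N(theta, 1) (an assumption)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_frequency_weight(theta, s_obs, n, eps, replicates, rng):
    """Fraction of `replicates` simulated datasets of size n whose summary
    lies within eps of the observed summary (a frequency, not a 0-1 value)."""
    hits = 0
    for _ in range(replicates):
        s_sim = statistics.fmean(simulate(theta, n, rng))
        hits += abs(s_sim - s_obs) <= eps
    return hits / replicates


def abc_posterior_mean(y, eps=0.2, replicates=10, n_prior=1000, seed=0):
    """Weighted ABC estimate of the posterior mean of theta, drawing theta
    from a uniform(-5, 5) prior (hypothetical choice for illustration)."""
    rng = random.Random(seed)
    s_obs = statistics.fmean(y)
    n = len(y)
    thetas = [rng.uniform(-5.0, 5.0) for _ in range(n_prior)]
    weights = [
        abc_frequency_weight(t, s_obs, n, eps, replicates, rng) for t in thetas
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("no simulation fell within eps; increase eps or n_prior")
    return sum(w * t for w, t in zip(weights, thetas)) / total
```

Setting `replicates=1` recovers the usual accept/reject indicator; larger values spend more computing power per parameter value to reduce the variability of each weight, while the synthetic sample size stays equal to n.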

The authors make several uniqueness assumptions that I find somewhat unclear. While assuming that the MLE for the auxiliary model is unique could make sense (Assumption 2), I do not understand the corresponding indexing of this estimator (of the auxiliary parameter) on the generating (model) parameter θ. It should only depend on the generated/simulated data x. The notion of a noisy mapping is just confusing to me. The assumption that the auxiliary score function at the auxiliary MLE for the observed data and for a simulated dataset (Assumption 3) is unique proceeds from the same spirit. I however fail to see why it matters so much. If the auxiliary MLE is the result of a numerical optimisation algorithm, the numerical algorithm may return local modes. This only adds to the approximation effect of the ABC-I schemes. Given that the paper does not produce convergence results for those schemes, unless the auxiliary model contains the genuine model, such theoretical assumptions do not feel that necessary. The paper uses normal mixtures as an auxiliary model: the multimodality of this model should not be such a hindrance (and reordering is transparent, i.e. does not “reduce the flexibility of the auxiliary model”, and does not “increase the difficulty of implementation”, as stated p.16).

The paper concludes from a numerical study that the Bayesian indirect inference of Gallant and McCulloch (2009) is superior. This approach simply replaces the true likelihood with the maximal auxiliary model likelihood estimated from a simulated dataset. (This is somewhat similar to our use of the empirical likelihood in the PNAS paper.) This conclusion is however moderated by the cautionary provision that “the auxiliary model [should] describe the data well”. As for empirical likelihood, I would suggest resorting to this Bayesian indirect inference as a benchmark, providing a quick if possibly dirty reference against which to test more elaborate ABC schemes. Or other approximations, like empirical likelihood or Wood’s synthetic likelihood.
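As a rough illustration of such a benchmark, the sketch below follows one reading of the description above: for each parameter value, simulate a dataset, fit the auxiliary model by maximum likelihood, and use the auxiliary likelihood of the observed data at that fitted value in place of the true likelihood. Everything specific here is an assumption for illustration and not taken from Gallant and McCulloch (2009) or from the paper: the toy generating model (θ plus centred exponential noise), the Gaussian auxiliary model, the uniform prior, the simulated sample size `m`, and the use of importance weighting from the prior instead of MCMC.

```python
import math
import random
import statistics


def simulate(theta, m, rng):
    """Toy generating model (an assumption): theta plus centred Exp(1) noise."""
    return [theta + rng.expovariate(1.0) - 1.0 for _ in range(m)]


def fit_auxiliary(x):
    """MLE of the Gaussian auxiliary model: (mean, standard deviation)."""
    mu = statistics.fmean(x)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    return mu, sigma


def auxiliary_loglik(y, mu, sigma):
    """Gaussian auxiliary log-likelihood of the observed data y."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - 0.5 * ((v - mu) / sigma) ** 2 for v in y)


def bil_posterior_mean(y, m=200, n_prior=1000, seed=0):
    """Posterior mean of theta when the auxiliary likelihood, evaluated at
    the auxiliary MLE of a simulated dataset, stands in for the likelihood."""
    rng = random.Random(seed)
    thetas = [rng.uniform(-5.0, 5.0) for _ in range(n_prior)]  # prior draws
    logw = []
    for theta in thetas:
        mu, sigma = fit_auxiliary(simulate(theta, m, rng))
        logw.append(auxiliary_loglik(y, mu, sigma))
    top = max(logw)  # subtract the maximum before exponentiating, for stability
    w = [math.exp(lw - top) for lw in logw]
    return sum(wi * t for wi, t in zip(w, thetas)) / sum(w)
```

The output of `bil_posterior_mean` could then serve as the quick reference value against which a more elaborate ABC run on the same data is compared, keeping in mind that its quality depends on how well the auxiliary model describes the data.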

### 2 Responses to “Bayesian indirect inference”

1. Pierre Jacob Says: