ABC with indirect summary statistics
After reading Drovandi and Pettitt’s Bayesian Indirect Inference, I checked (on the plane to Birmingham) Gleim and Pigorsch’s earlier Approximate Bayesian Computation with indirect summary statistics. The setting is indeed quite similar to Drovandi and Pettitt’s, with a description of three ways of connecting indirect inference with ABC, albeit with a different range of illustrations. The preprint states very clearly its assumption that the generating model is a particular case of the auxiliary model, which sounds anticlimactic, since the auxiliary model is used precisely because the original one is mostly out of reach! That was certainly the original motivation for using indirect inference.
The part of the paper I find the most intriguing is the argument that the indirect approach leads to sufficient summary statistics, in the sense that they “are sufficient for the parameters of the auxiliary model and (…) sufficiency carries over to the model of interest” (p.31). Looking at the details in the Appendix, I find the argument lacking: what is shown to be a (sufficient) statistic is the likelihood as a functional, which seems both a tautology and irrelevant, because this functional differs from the likelihood evaluated at the (auxiliary) MLE, which is the summary statistic used in the end.
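To spell the distinction out (in my notation, not the paper’s): what the Appendix establishes as sufficient is the whole auxiliary likelihood functional,

\[ y \;\longmapsto\; L_{\mathrm{aux}}(\,\cdot\,;\,y), \]

which is trivially a “sufficient statistic” since it encodes the auxiliary likelihood itself, whereas the summary statistic actually entering the ABC distance is the finite-dimensional auxiliary MLE,

\[ \hat\varphi(y) \;=\; \arg\max_{\varphi} L_{\mathrm{aux}}(\varphi;\,y), \]

and sufficiency of the former does not carry over to the latter.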
“…we expand the square root of an innovation density h in a Hermite expansion and truncate the infinite polynomial at some integer K which, together with other tuning parameters of the SNP density, has to be determined through a model selection criterion (such as BIC). Now we take the leading term of the Hermite expansion to follow a Gaussian GARCH model.”
As in Drovandi and Pettitt, the performance of the ABC-IS schemes is tested on a toy example, a very basic iid exponential sample with a conjugate prior and a gamma model as auxiliary. The authors use a standard ABC based on the first two moments as their benchmark; however, they do not calibrate those moments within the distance and end up with a poor performance of ABC (in a setting where a sufficient statistic is available!). The best choice in this experiment appears to be the solution based on the score (see the first sketch below), although the variances of the distances are not included in the comparison tables.

The second implementation considered in the paper is a rather daunting continuous-time non-Gaussian Ornstein-Uhlenbeck stochastic volatility model à la Barndorff-Nielsen and Shephard (2001). The construction of the semi-nonparametric (why not semi-parametric?) auxiliary model is quite involved as well, as illustrated by the quote above (and by the second sketch below). The approach does provide an answer, namely posterior ABC-IS distributions on all parameters of the original model, which raises the question of validating this answer in terms of the original posterior. Handling the several approximation processes (the auxiliary model step and the ABC step) simultaneously would help in this regard.
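For concreteness, here is a minimal sketch of the score-based ABC-IS scheme on the exponential toy example, with a gamma auxiliary model. This is my own illustrative code, not the authors’: the prior hyperparameters, sample sizes, and acceptance quantile are all assumptions, and the plain Euclidean norm of the score is used where the paper weights it by the auxiliary information matrix.

import numpy as np
from scipy.special import digamma
from scipy.stats import gamma as gamma_dist
from scipy.optimize import minimize

# Score-based ABC-IS sketch: iid Exp(lam) data, conjugate Gamma(a0, b0)
# prior on lam, and a Gamma(shape, rate) auxiliary model fitted by ML.
rng = np.random.default_rng(0)
n, lam_true = 50, 2.0
y_obs = rng.exponential(scale=1.0 / lam_true, size=n)

def gamma_mle(x):
    # Auxiliary gamma MLE (shape, rate) via log-parametrised optimisation.
    nll = lambda p: -gamma_dist.logpdf(x, np.exp(p[0]),
                                       scale=np.exp(-p[1])).sum()
    return np.exp(minimize(nll, np.zeros(2), method="Nelder-Mead").x)

def gamma_score(x, shape, rate):
    # Score of the gamma auxiliary log-likelihood at (shape, rate).
    return np.array([len(x) * (np.log(rate) - digamma(shape)) + np.log(x).sum(),
                     len(x) * shape / rate - x.sum()])

shape_hat, rate_hat = gamma_mle(y_obs)  # score at y_obs is (numerically) zero

# Rejection ABC: keep the prior draws whose simulated data yield an
# auxiliary score, evaluated at the observed MLE, closest to zero.
a0, b0, N = 1.0, 1.0, 50_000
lams = rng.gamma(a0, 1.0 / b0, size=N)
dists = np.array([np.linalg.norm(
    gamma_score(rng.exponential(scale=1.0 / lam, size=n), shape_hat, rate_hat))
    for lam in lams])
keep = lams[dists <= np.quantile(dists, 0.01)]

print(f"ABC-IS posterior mean ~ {keep.mean():.3f}; "
      f"exact posterior mean = {(a0 + n) / (b0 + y_obs.sum()):.3f}")

The conjugacy is what makes the toy useful as a check: the exact Gamma(a0 + n, b0 + Σy) posterior is available for comparison.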
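And to unpack the quoted SNP construction a little: squaring a truncated Hermite expansion of the square root of the innovation density h against a Gaussian leading term yields a density in closed form. The sketch below is again mine, with made-up coefficients and a standardised N(0,1) leading term in place of the Gaussian GARCH term of the paper.

import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermeval
from scipy.stats import norm

# Truncated SNP innovation density: sqrt(h) is expanded in probabilists'
# Hermite polynomials and truncated at K, giving
#     h(z) = P_K(z)^2 phi(z) / sum_k c_k^2 k!,
# where phi is the N(0,1) pdf and the denominator follows from the
# orthogonality of the Hermite polynomials under phi.

def snp_density(z, coeffs):
    # SNP density with Hermite coefficients coeffs = (c_0, ..., c_K).
    poly = hermeval(z, coeffs)                        # P_K(z)
    const = sum(c**2 * factorial(k) for k, c in enumerate(coeffs))
    return poly**2 * norm.pdf(z) / const              # integrates to one

# K = 2 with made-up coefficients: a symmetric, heavier-tailed innovation.
z = np.linspace(-4.0, 4.0, 9)
print(snp_density(z, [1.0, 0.0, 0.3]))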
Comment (February 3, 2014 at 12:55 am):
Actually, even if the true model were incorporated in the auxiliary one, ABC II methods would still not produce sufficient statistics (in general, by a simple dimensionality argument on the summary statistic). Thus the assumption is not particularly useful in the context of ABC II methods, since even if it were true, ABC II would not converge to the true posterior (as the ABC tolerance goes to zero). This gives me a slightly uneasy feeling about ABC II. In contrast, the Bayesian indirect likelihood (BIL) method I review and investigate further in my paper does appear to converge to the true posterior (as the sample size of the simulated data goes to infinity) if this assumption is met. But of course this assumption will not in general be satisfied in practice, which then gives an uneasy feeling about BIL. I have a simple example on a slide of the talk I have now given at Warwick, Linz, and Reading (and I plan to incorporate all of this in a revision of my paper).