ABC with composite score functions
My friends Erlis Ruli, Nicola Sartori and Laura Ventura from Università degli Studi di Padova have just arXived a new paper entitled Approximate Bayesian Computation with composite score functions. While the paper provides a survey of composite likelihood methods, its core idea is to use the score function (of the composite likelihood) as the summary statistic, evaluated at the maximum composite likelihood estimate for the observed data. In the specific (but unrealistic) case of an exponential family, an ABC based on the score is asymptotically (i.e., as the tolerance ε goes to zero) exact. The choice of the composite likelihood thus induces a natural summary statistic and, as in our empirical likelihood paper, where we also use the score of a composite likelihood, the composite likelihoods that are available for computation are usually quite a few, thus leading to an automated choice of a summary statistic.
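To make the idea concrete, here is a minimal sketch of a rejection ABC that uses a score as its summary statistic. It is my own toy illustration and not the authors' code: I assume an i.i.d. Gaussian model with θ = (μ, σ), use the full likelihood as a stand-in for a composite likelihood, pick uniform priors arbitrarily, and set the tolerance ε as a quantile of the simulated distances. The function names (`score`, `abc_score`) are hypothetical. Since the score of the observed data vanishes at its own estimate, the distance reduces to the size of the score of the simulated data at that estimate.

```python
import numpy as np

def score(theta_hat, y):
    """Gradient in (mu, sigma) of the Gaussian log-likelihood of y, at theta_hat."""
    mu, sigma = theta_hat
    r = y - mu
    d_mu = r.sum() / sigma**2
    d_sigma = -len(y) / sigma + (r**2).sum() / sigma**3
    return np.array([d_mu, d_sigma])

def abc_score(y_obs, n_sims, keep_frac, rng):
    """Rejection ABC with the score at the observed-data estimate as summary."""
    n = len(y_obs)
    # Maximum likelihood estimate from the observed data (std uses ddof=0).
    theta_hat = np.array([y_obs.mean(), y_obs.std()])
    # Assumed priors, chosen only for illustration.
    mu = rng.uniform(-5.0, 5.0, n_sims)
    sigma = rng.uniform(0.1, 5.0, n_sims)
    dist = np.empty(n_sims)
    for i in range(n_sims):
        y_sim = rng.normal(mu[i], sigma[i], n)
        # The observed score is zero at theta_hat, so the (unscaled) sum of
        # absolute differences is just the L1 norm of the simulated score.
        dist[i] = np.abs(score(theta_hat, y_sim)).sum()
    eps = np.quantile(dist, keep_frac)  # tolerance set as a distance quantile
    keep = dist <= eps
    return np.column_stack([mu[keep], sigma[keep]]), eps

rng = np.random.default_rng(0)
y_obs = rng.normal(1.0, 2.0, 50)
post, eps = abc_score(y_obs, n_sims=20000, keep_frac=0.01, rng=rng)
print(post.mean(axis=0), eps)
```

The accepted pairs in `post` form the ABC sample; shrinking `keep_frac` plays the role of letting ε go to zero, at the cost of fewer accepted draws.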
An interesting (common) feature in most examples found in this paper is that comparisons are made between ABC using the (truly) sufficient statistic and ABC based on the pairwise score function, which essentially relies on the very same statistics. So the difference, when there is a difference, pertains to the choice of a different combination of the summary statistics or, somewhat equivalently, to the choice of a different distance function. One of the examples starts from our MA(2) toy example in the 2012 survey in Statistics and Computing. The composite likelihood is then based on the consecutive triplet marginal densities. As shown by the picture below, the composite version improves to some extent upon the original ABC solution using three autocorrelations.
One refinement of the proposed method I would suggest concerns the distance used in the paper, namely the sum of the absolute differences between the statistics. Indeed, this sum is not scaled at all, neither for regular ABC nor for composite ABC, while the composite likelihood perspective provides, in addition to the score, a natural metric through the matrix A(θ) [defined on page 12]. So I would suggest comparing the performances of the methods using this rescaling instead since, in my opinion and in contrast with a remark on page 13, it is relevant in some (many?) settings where the amount of information brought by the composite model varies widely from one parameter to the next.
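As a rough illustration of what such a rescaling could look like, the sketch below compares the unscaled sum of absolute score components with a quadratic-form distance sᵀW⁻¹s. Everything here is an assumption of mine: the i.i.d. Gaussian toy model, the hypothetical function names, and above all the choice of W, which I take to be a Monte Carlo estimate of the variance of the score at the estimate. That is one possible scaling matrix and not necessarily the A(θ) of the paper.

```python
import numpy as np

def score(theta, y):
    """Gradient in (mu, sigma) of the Gaussian log-likelihood of y, at theta."""
    mu, sigma = theta
    r = y - mu
    return np.array([r.sum() / sigma**2,
                     -len(y) / sigma + (r**2).sum() / sigma**3])

def score_covariance(theta, n, n_rep, rng):
    """Monte Carlo estimate of the variance of the score for data simulated at theta."""
    mu, sigma = theta
    draws = np.array([score(theta, rng.normal(mu, sigma, n))
                      for _ in range(n_rep)])
    return np.cov(draws, rowvar=False)

def l1_distance(s):
    """Unscaled distance: sum of absolute score components."""
    return float(np.abs(s).sum())

def scaled_distance(s, W):
    """Quadratic-form distance sqrt(s' W^{-1} s) with scaling matrix W."""
    return float(np.sqrt(s @ np.linalg.solve(W, s)))

rng = np.random.default_rng(1)
y_obs = rng.normal(1.0, 2.0, 50)
theta_hat = np.array([y_obs.mean(), y_obs.std()])  # estimate from observed data

# Assumed scaling matrix: score variance estimated by simulation at theta_hat.
W = score_covariance(theta_hat, len(y_obs), 2000, rng)

# Score of one pseudo-dataset, simulated at an arbitrary parameter value.
y_sim = rng.normal(0.5, 2.5, len(y_obs))
s_sim = score(theta_hat, y_sim)
print(l1_distance(s_sim), scaled_distance(s_sim, W))
```

In this toy model the two score components have variances that differ by a factor of about two, so the rescaling is mild; the point of the suggestion is for models where the components live on very different scales.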
December 12, 2013 at 6:13 pm
There is also this paper that uses the scores of an auxiliary model as summary statistics in ABC: http://ect-pigorsch.mee.uni-bonn.de/data/research/papers/Approximate_Bayesian_Computation_with_Indirect_Summary_Statistics.pdf
December 12, 2013 at 11:06 am
Christian, thanks for the suggestion. In theory, for ε small enough, the unscaled composite score will also be fine. In practice, however, we believe you are right, although the rescaling should be done with J(θ), which is the variance of the composite score, and not with A(θ).
By the way, when J(θ) is unavailable, since we only need it evaluated at the composite MLE, we can easily estimate it by Monte Carlo.
December 12, 2013 at 10:46 am
I recall at ABCinRome a talk by Gael Martin, also using the score function as a summary statistic for ABC: slides at https://docs.google.com/file/d/0B9xD_hvlHiyeakRsazZ4ZmhKLWc/edit
December 12, 2013 at 11:13 am
Yes, I remember Gael Martin’s talk. We also had a poster at ABCinRome on this topic: poster_ABC2013.pdf
December 12, 2013 at 12:27 pm
Thanks Umberto, I am also working with Gael on this approach.