Richard Wilkinson arXived a paper on accelerated ABC during MCMSki 4, paper that I almost missed when quickly perusing the daily list. This is another illustration of the “invasion of Gaussian processes” in ABC settings. Maybe under the influence of machine learning.
The paper starts with a link to the synthetic likelihood approximation of Wood (2010, Nature), as in Richard Everitt’s talk last week. Richard (W.) presents the generalised ABC as a kernel-based acceptance probability, using a kernel π(y|x), when y is the observed data and x=x(θ) the simulated one. He proposes a Gaussian process modelling for the log-likelihood (at the observed data y), with a quadratic (in θ) mean and Matérn covariance matrix. Hence the connection with Wood’s synthetic likelihood. Another connection is with Nicolas’ talk on QMC(MC): the θ’s are chosen following a Sobol sequence “in order to minimize the number of design points”. Which requires a reparameterisation to [0,1]p… I find this “uniform” exploration of the whole parameter space delicate to envision in complex parameter spaces and realistic problems, since the likelihood is highly concentrated on a tiny subregion of the original [0,1]p. Not mentioning the issue of the spurious mass on the boundaries of the hypercube possibly induced by the change of variable. The sequential algorithm of Richard also attempts at eliminating implausible zones of the parameter space. i.e. zones where the likelihood is essentially zero. My worries with this interesting notion are that (a) the early Gaussian process approximations may be poor and hence exclude zones they should not; (b) all Gaussian process approximations at all iterations must be saved; (c) the Sobol sequences apply to the whole [0,1]p at each iteration but the non-implausible region shrinks at each iteration, which induces a growing inefficiency in the algorithm. The Sobol sequence should be restricted to the previous non-implausible zone.
Overall, an interesting proposal that would need more prodding to understand whether or not it is robust to poor initialisation and complex structures. And a proposal belonging to the estimated likelihood branch of ABC, which makes use of the final Gaussian process approximation to run an MCM algorithm. Without returning to pseudo-data simulation, replacing it with log-likelihood simulation.
“These algorithms sample space randomly and naively and do not learn from previous simulations”
The above criticism is moderated in a footnote about ABC-SMC using the “current parameter value to determine which move to make next [but] parameters visited in previous iterations are not taken into account”. I still find it excessive in that SMC algorithms and in particular ABC-SMC algorithms are completely free to use the whole past to build the new proposal. This was clearly enunciated in our earlier population Monte Carlo papers. For instance, the complete collection of past particles can be recycled by weights computing thru our AMIS algorithm, as illustrated by Jukka Corander in one genetics application.