Inference in epidemic models w/o likelihoods

“We discuss situations in which we think simulation-based inference may be preferable to likelihood-based inference.” McKinley, Cook, and Deardon, IJB

I only became aware last week of the paper Inference in epidemic models without likelihoods by McKinley, Cook and Deardon, published in the International Journal of Biostatistics in 2009. (Anyone can access the paper by becoming a guest, i.e. by providing some information.) The paper is essentially a simulation experiment comparing ABC-MCMC and ABC-SMC with regular data augmentation MCMC. The authors experiment with the tolerance level and with the choice of metric and of summary statistics, in a model built on exponential inter-event times. The setting is interesting, in particular because it applies to highly dangerous diseases like Ebola fever (for which there is no known treatment and which is responsible for an 88% decline in observed chimpanzee populations since 2003!). The conclusions are overall not highly surprising: repeating simulations of the data points given one simulated parameter does not seem to contribute [much] to an improved approximation of the posterior by the ABC sample; the tolerance level does not seem to be highly influential; the choice of the summary statistics and of the calibration factors is important; and ABC-SMC outperforms ABC-MCMC (MCMC remaining the reference). Slightly more surprising is the conclusion that the choice of the distance/metric influences the outcome. (I failed to find in the paper strong arguments supporting the above sentence stolen from the abstract.)
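To make the roles of the tolerance, the metric, the summary statistic and the number of repeated simulations concrete, here is a minimal ABC rejection sketch on a toy exponential inter-event model. It is not the authors' epidemic model or algorithm: the Gamma(2, 1) prior, the mean inter-event time as summary, the absolute difference as distance, and all function names are assumptions made for illustration only. When `repeats` exceeds one, a parameter is accepted with probability equal to the fraction of its simulated datasets that fall within the tolerance, which is one possible way of using repeated simulations.

```python
import random
import statistics


def simulate_inter_event_times(rate, n, rng):
    """Draw n exponential inter-event times with the given rate."""
    return [rng.expovariate(rate) for _ in range(n)]


def summary(times):
    """Summary statistic: mean inter-event time (an illustrative choice)."""
    return statistics.fmean(times)


def abc_rejection(observed, n_draws, tolerance, rng, repeats=1):
    """Toy ABC rejection sampler for the rate of exponential inter-event times.

    Each prior draw is accepted with probability equal to the fraction of
    its `repeats` simulated datasets whose summary lies within `tolerance`
    of the observed summary (absolute-difference distance).
    """
    s_obs = summary(observed)
    n = len(observed)
    accepted = []
    for _ in range(n_draws):
        # Hypothetical prior: Gamma with shape 2 and scale 1.
        rate = rng.gammavariate(2.0, 1.0)
        hits = sum(
            abs(summary(simulate_inter_event_times(rate, n, rng)) - s_obs) <= tolerance
            for _ in range(repeats)
        )
        if rng.random() < hits / repeats:
            accepted.append(rate)
    return accepted


rng = random.Random(1)
# Pseudo-observed data simulated with a "true" rate of 2.0 (an assumption).
observed = simulate_inter_event_times(2.0, 50, rng)
sample = abc_rejection(observed, n_draws=5000, tolerance=0.05, rng=rng)
print(len(sample), statistics.fmean(sample))
```

Rerunning with a different `tolerance`, a different `summary`, or `repeats=10` gives a rough feel for which of these choices moves the accepted sample the most in this toy setting.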

“There are always doubts that the estimated posterior really does correspond to the true posterior.” McKinley, Cook, and Deardon, IJB

On the “negative” side, this paper is missing the recent literature both on the nonparametric aspects of ABC and on the more adaptive [PMC] features of ABC-SMC, as developed in our Biometrika ABC-PMC paper and in Del Moral, Doucet and Jasra. (Again, this is not a criticism in that the paper got published in early 2009.) I think that using past simulations to build the proposal and the next tolerance and, why not!, the relevant statistics, would further the improvement brought by sequential methods. (The authors were aware of the correction of Sisson et al., and used instead the version of Toni et al. They also mention the arXived notes of Mark Beaumont, which started our own investigation of ABC-PRC.) The comparison experiment is based on a single dataset, with fixed random walk variances for the MCMC algorithms, while the prior used in the simulation seems to me to be highly peaked around the true value (gamma rates of 0.1). Some of the ABC scenarios do produce estimates that are rather far away from the references given by MCMC; take for instance CABC-MCMC when the tolerance ε is 10 and R is 100.
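As a rough illustration of what "using past simulations to build the proposal and the next tolerance" could look like, the sketch below derives both from the previous population of particles. The proposal variance is set to twice the weighted empirical variance of the previous population, as in the ABC-PMC scheme; taking the next tolerance as a quantile of the previously accepted distances is an assumption chosen here for simplicity, not something prescribed by the paper under review. Both function names are hypothetical.

```python
def next_tolerance(distances, quantile=0.5):
    """Next tolerance as an empirical quantile of past accepted distances.

    The quantile rule is an illustrative assumption.
    """
    ordered = sorted(distances)
    index = int(quantile * (len(ordered) - 1))
    return ordered[index]


def proposal_variance(particles, weights):
    """Twice the weighted empirical variance of the previous population."""
    total = sum(weights)
    mean = sum(w * x for x, w in zip(particles, weights)) / total
    variance = sum(w * (x - mean) ** 2 for x, w in zip(particles, weights)) / total
    return 2.0 * variance


# Toy previous population: accepted distances, particles and their weights.
past_distances = [0.4, 0.1, 0.3, 0.2, 0.5]
tolerance = next_tolerance(past_distances)
variance = proposal_variance([1.0, 3.0], [1.0, 1.0])
print(tolerance, variance)
```

With these two quantities recomputed at every iteration, neither the sequence of tolerances nor the random walk scale has to be fixed in advance.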
