Archive for Ebola virus

Ebola virus [and Mr. Bayes]

Posted in Statistics, Travel, University life on August 12, 2014 by xi'an

Just like after the disappearance of Malaysia Airlines flight 370, the current Ebola virus outbreak makes me feel we are sorely missing an emergency statistical force to react to urgent issues… It would indeed be quite valuable to have a team of statisticians at the ready to quantify risks and posterior probabilities and to avoid media approximations. Situations calling for this reactive force abound. A few days ago I was reading about the unknown number of missing pro-West activists in Eastern Ukraine. Maybe statistical societies could join forces to set up such an emergency team?! Its goals would be somewhat different from those of the great Statistics without Borders.

As a side remark, the above phylogeny is taken from Dudas and Rambaut’s recent paper in PLOS reassessing the family tree of the current Ebola virus(es) acting in Guinea. The tree is found using MrBayes, which assigns a posterior probability of 1 to this filiation! The authors also conclude “that the rooting of this clade using the very divergent other ebolavirus species is very problematic.”

Inference in epidemic models w/o likelihoods

Posted in Statistics on July 21, 2010 by xi'an

“We discuss situations in which we think simulation-based inference may be preferable to likelihood-based inference.” McKinley, Cook, and Deardon, IJB

I only became aware last week of the paper Inference in epidemic models without likelihoods by McKinley, Cook and Deardon, published in the International Journal of Biostatistics in 2009. (Anyone can access the paper by becoming a guest, i.e., by providing some information.) The paper is essentially a simulation experiment comparing ABC-MCMC and ABC-SMC with regular data augmentation MCMC. The authors experiment with the tolerance level, the choice of metric, and the choice of summary statistics, in a model with exponential inter-event times. The setting is interesting, in particular because it applies to highly dangerous diseases like the Ebola fever (for which there is no known treatment and which is responsible for an 88% decline in observed chimpanzee populations since 2003!). The conclusions are overall not highly surprising, namely that repeating simulations of the data points given one simulated parameter does not seem to contribute [much] to an improved approximation of the posterior by the ABC sample, that the tolerance level does not seem to be highly influential, that the choices of the summary statistics and of the calibration factors are important, and that ABC-SMC outperforms ABC-MCMC (MCMC remaining the reference). Slightly more surprising is the conclusion that the choice of the distance/metric influences the outcome. (I failed to find in the paper strong arguments supporting the above sentence, stolen from the abstract.)
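To make the roles of the tolerance and of the summary statistic concrete, here is a minimal ABC rejection sketch on a toy exponential inter-event model. It is not the authors' epidemic model or their ABC-MCMC/ABC-SMC code: the Gamma(2, 0.5) prior, the sample mean as summary statistic, the absolute difference as distance, and the function names are all assumptions made for illustration.

```python
import random
import statistics


def simulate_inter_event_times(rate, n, rng):
    """Draw n exponential inter-event times with the given rate."""
    return [rng.expovariate(rate) for _ in range(n)]


def abc_rejection(observed, n_draws, tolerance, rng,
                  prior_shape=2.0, prior_scale=0.5):
    """Keep prior draws whose simulated summary lands within `tolerance`
    of the observed summary (here, the mean inter-event time)."""
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        # Hypothetical Gamma prior on the rate (shape, scale), not from the paper.
        rate = rng.gammavariate(prior_shape, prior_scale)
        simulated = simulate_inter_event_times(rate, len(observed), rng)
        # Distance between summaries: absolute difference of the means.
        if abs(statistics.fmean(simulated) - obs_summary) <= tolerance:
            accepted.append(rate)
    return accepted


rng = random.Random(0)
# Pseudo-observed data generated with a "true" rate of 1 (toy choice).
observed = simulate_inter_event_times(1.0, 50, rng)

for eps in (1.0, 0.1):
    sample = abc_rejection(observed, 5000, eps, rng)
    print(f"tolerance={eps}: kept {len(sample)} of 5000 prior draws")
```

Swapping the mean for another summary, or the absolute difference for another distance, only touches the two lines inside the loop, which makes it easy to see how much each of those choices moves the accepted sample.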

“There are always doubts that the estimated posterior really does correspond to the true posterior.” McKinley, Cook, and Deardon, IJB

On the “negative” side, this paper is missing the recent literature both on the nonparametric aspects of ABC and on the more adaptive [PMC] features of ABC-SMC, as developed in our Biometrika ABC-PMC paper and in Del Moral, Doucet and Jasra. (Again, this is not a criticism, since the paper got published in early 2009.) I think that using past simulations to build the proposal, the next tolerance and, why not!, the relevant statistics would further the improvement brought by sequential methods. (The authors were aware of the correction of Sisson et al., and used instead the version of Toni et al. They also mention the arXived notes of Mark Beaumont, which started our prodding into ABC-PRC.) The comparison experiment is based on a single dataset, with fixed random walk variances for the MCMC algorithms, while the prior used in the simulation seems to me to be highly peaked around the true value (gamma rates of 0.1). Some of the ABC scenarios do produce estimates that are rather far away from the references given by MCMC; take for instance the CABC-MCMC when the tolerance ε is 10 and R is 100.
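As a rough illustration of what using past simulations to build the proposal and the next tolerance could look like, here is a population Monte Carlo style ABC sketch on a toy exponential model. The Gaussian kernel with twice the weighted variance of the previous population and the importance weights follow the usual ABC-PMC recipe; taking the next tolerance as a quantile of the previously accepted distances is one possible adaptive rule assumed here, and the Gamma(2, 0.5) prior, the mean summary, and all function names are hypothetical.

```python
import math
import random
import statistics


def distance(rate, obs_mean, n, rng):
    """Simulate n exponential times and compare their mean to the observed mean."""
    sim = [rng.expovariate(rate) for _ in range(n)]
    return abs(statistics.fmean(sim) - obs_mean)


def gamma_pdf(x, shape, scale):
    """Density of the (hypothetical) Gamma prior on the rate."""
    if x <= 0:
        return 0.0
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def abc_pmc(observed, n_particles, n_rounds, rng, shape=2.0, scale=0.5, quantile=0.5):
    obs_mean, n = statistics.fmean(observed), len(observed)
    # Initial population: prior draws, all kept, with uniform weights.
    particles = [rng.gammavariate(shape, scale) for _ in range(n_particles)]
    dists = [distance(p, obs_mean, n, rng) for p in particles]
    weights = [1.0 / n_particles] * n_particles
    tolerances = []
    for _ in range(n_rounds):
        # Assumed adaptive rule: next tolerance = quantile of past distances.
        eps = sorted(dists)[int(quantile * (n_particles - 1))]
        # Kernel scale built from the past population: twice its weighted variance.
        mean = sum(w * p for p, w in zip(particles, weights))
        var = sum(w * (p - mean) ** 2 for p, w in zip(particles, weights))
        tau = math.sqrt(max(2.0 * var, 1e-12))
        new_particles, new_dists, new_weights = [], [], []
        while len(new_particles) < n_particles:
            centre = rng.choices(particles, weights=weights)[0]
            candidate = rng.gauss(centre, tau)
            if candidate <= 0:
                continue  # outside the prior support
            d = distance(candidate, obs_mean, n, rng)
            if d <= eps:
                # Importance weight: prior density over the mixture proposal density.
                mixture = sum(w * normal_pdf(candidate, p, tau)
                              for p, w in zip(particles, weights))
                new_particles.append(candidate)
                new_dists.append(d)
                new_weights.append(gamma_pdf(candidate, shape, scale) / mixture)
        total = sum(new_weights)
        weights = [w / total for w in new_weights]
        particles, dists = new_particles, new_dists
        tolerances.append(eps)
    return particles, weights, tolerances


rng = random.Random(0)
observed = [rng.expovariate(1.0) for _ in range(50)]  # toy data, "true" rate 1
particles, weights, tolerances = abc_pmc(observed, 200, 4, rng)
print("tolerances:", [round(e, 3) for e in tolerances])
print("weighted mean rate:", round(sum(w * p for p, w in zip(particles, weights)), 3))
```

Because every accepted distance in a round is below that round's tolerance, the quantile rule produces a non-increasing tolerance sequence without any tuning by hand, which is the kind of adaptivity the fixed schedules in the paper lack.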