Archive for Approximate Bayesian computation

learning optimal summary statistics

Posted in Books, pictures, Statistics on July 27, 2022 by xi'an

“Despite the pursuit of the holy grail of sufficient statistics, most applications will have to settle for the weakest concept of optimal statistics.”

Quiz #1: How does Bayes sufficiency [which preserves the posterior density] differ from sufficiency [which preserves the likelihood function]?

Quiz #2: How does Fisher-information sufficiency [which preserves the information matrix] differ from standard sufficiency [which preserves the likelihood function]?
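
For the record, the textbook definitions behind the brackets (the quizzes being about when these notions coincide or differ):
sufficiency: f(x|θ) = g{t(x)|θ} h(x) for all θ and x [factorisation theorem]
Bayes sufficiency: π(θ|x) = π(θ|t(x)) for all x and all priors π
Fisher-information sufficiency: the Fisher information based on t(X) equals the Fisher information based on X, for all θ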

Read a recent arXival by Till Hoffmann and Jukka-Pekka Onnela that I frankly found most puzzling… Maybe because the Norman train I was travelling on was particularly noisy.

The argument in the paper is to find a summary statistic that minimises the [empirical] expected posterior entropy, which equivalently means minimising the expected Kullback-Leibler divergence to the full posterior. And maximising the mutual information between the parameters θ and the summaries t(.). And maximising the expected surprise. Which obviously requires breaking the sample into iid components and hence considering the gain brought by a specific transform of a single observation. The paper also contains a long comparison with other criteria for choosing summaries.
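
In symbols (my own shorthand, writing T = t(X), H for entropy and I for mutual information), the equivalences are standard identities:

E_x[ H{π(θ|t(x))} ] = H(θ|T) = H(θ) - I(θ;T)

E_x[ KL{π(θ|x) || π(θ|t(x))} ] = H(θ|T) - H(θ|X)

E_x[ KL{π(θ|t(x)) || π(θ)} ] = I(θ;T)

and since neither H(θ) nor H(θ|X) involves t(.), minimising the expected posterior entropy, minimising the expected Kullback-Leibler divergence to the full posterior, maximising the mutual information, and maximising the expected surprise are all one and the same programme.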

“Minimizing the posterior entropy would discard the sufficient statistic t such that the posterior is equal to the prior–we have not learned anything from the data.”

Furthermore, the expected aspect of the criterion takes us away from a proper Bayes analysis (and exhibits artifacts such as the one above), which somehow makes me question the relevance of comparing entropies under different distributions. It took me a long while to realise that the collection of summaries was set by the user and quite limited. Like a neural network representation of the posterior mean. And that the intractable posterior is further approximated by a closed-form function of the parameter θ and of the summary t(.). Using there a neural density estimator. Or a mixture density network.
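
In practice this means jointly fitting a neural summary t(.) and a conditional density estimator q(θ|t). A minimal PyTorch sketch of that generic recipe (my own toy simulator and network sizes, not the authors' code) would look like

# minimal sketch: learn a neural summary t_psi and a mixture density network
# q_phi(theta | t) by minimising the empirical average of -log q_phi(theta | t_psi(x)),
# whose minimum over q_phi is the expected posterior entropy H(theta | T)
import torch
import torch.nn as nn
n_sim, n_obs, n_comp = 5000, 10, 5               # simulations, sample size, mixture components
theta = torch.randn(n_sim, 1)                    # toy prior theta ~ N(0,1)
x = theta + torch.randn(n_sim, n_obs)            # toy model x_i | theta ~ N(theta,1)
summary = nn.Sequential(nn.Linear(n_obs, 32), nn.ReLU(), nn.Linear(32, 1))   # t_psi(x)
mdn = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 3 * n_comp))  # q_phi(theta|t)
opt = torch.optim.Adam(list(summary.parameters()) + list(mdn.parameters()), lr=1e-3)
for epoch in range(200):
    t = summary(x)                                          # candidate summary statistics
    w_logits, mu, log_sig = mdn(t).chunk(3, dim=-1)         # mixture weights, means, log-scales
    comp = torch.distributions.Normal(mu, log_sig.exp())
    log_q = torch.logsumexp(torch.log_softmax(w_logits, dim=-1) + comp.log_prob(theta), dim=-1)
    loss = -log_q.mean()                                    # empirical expected posterior entropy (upper bound)
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())                                          # entropy of the learned approximate posterior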

potato tomato [w/o ABC]

Posted in Books, pictures, Statistics on July 8, 2022 by xi'an

“The default parameter of KaKs_Calculator was set to estimate the Ka/Ks values, which means that the Ka/Ks value was the average of the output from 15 available algorithms comprising 7 original approximate methods and one maximum likelihood method.”

In their analysis of the phylogenetic evolution of the potato species, the Nature authors resort to a multiple analysis (à la EJ!) in the above sense, by averaging several results. I remain puzzled by an approach that treats all methods on an equal basis, without trying to ascertain precision and bias by cross-validation or other tools. (Approximate Bayesian computation was not used as one of the methods.)

Recent Advances in Approximate Bayesian Inference [YSE, 15.2.22]

Posted in Statistics, University life on May 11, 2022 by xi'an


On June 15, the Young Statisticians Europe initiative is organising an on-line seminar on approximate Bayesian inference. With talks by

starting at 7:00 PT / 10:00 EST / 16:00 CET. The registration form is available here.

Concentration and robustness of discrepancy-based ABC [One World ABC ‘minar, 28 April]

Posted in Statistics, University life on April 15, 2022 by xi'an

Our next speaker at the One World ABC Seminar will be Pierre Alquier, who will talk about “Concentration and robustness of discrepancy-based ABC”, on Thursday April 28, at 9.30am UK time, with the abstract reported below.

Approximate Bayesian Computation (ABC) typically employs summary statistics to measure the discrepancy between the observed data and the synthetic data generated from each proposed value of the parameter of interest. However, finding good summary statistics (that are close to sufficiency) is non-trivial for most of the models for which ABC is needed. In this paper, we investigate the properties of ABC based on integral probability semi-metrics, including MMD and Wasserstein distances. We exhibit conditions ensuring the contraction of the approximate posterior. Moreover, we prove that MMD with an adequate kernel leads to very strong robustness properties.
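
For illustration, a toy version of such discrepancy-based ABC with an MMD discrepancy (my own sketch, unrelated to the paper's experiments):

# toy ABC: simulate pseudo-data for each prior draw and keep the draws whose
# Gaussian-kernel MMD to the observed sample falls below a quantile threshold
import numpy as np
rng = np.random.default_rng(0)
def mmd2(x, y, bw=1.0):
    # (biased) squared MMD between 1-D samples x and y with a Gaussian kernel
    k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * bw ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
n, n_sim = 100, 10000
y_obs = rng.normal(2.0, 1.0, n)                     # observed data, true location 2
theta = rng.normal(0.0, 5.0, n_sim)                 # prior draws for the location parameter
dist = np.array([mmd2(rng.normal(th, 1.0, n), y_obs) for th in theta])
accepted = theta[dist < np.quantile(dist, 0.01)]    # keep the 1% closest draws
print(accepted.mean(), accepted.std())              # crude ABC posterior summary

Swapping mmd2 for a Wasserstein distance (in one dimension, e.g., the average absolute difference between sorted samples) gives the other instance covered by the talk.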

likelihood-free nested sampling

Posted in Books, Statistics on April 11, 2022 by xi'an

Last week, I came by chance across a paper by Jan Mikelson and Mustafa Khammash on a likelihood-free version of nested sampling (a popular keyword on the ‘Og!), published in 2020 in PLoS Comput Biol. The setup is a parameterised hidden state-space model, which allows for an approximation of the (observed) likelihood function L(θ|y) by means of a particle filter. An immediate issue with this proposal is that a new filter needs to be produced for each new value of the parameter θ, which makes it enormously expensive. It then gets more bizarre as the [Monte Carlo] distribution of the particle filter approximation L̂(θ|y) is agglomerated with the original prior π(θ) into a joint “prior” [despite depending on the observed y] and nested sampling is conducted with level sets of the form

L̂(θ|y) > ε.

Actually, if the Monte Carlo error were null, that is, if the number of particles were infinite, then

L̂(θ|y) = L(θ|y)

and this would indeed be the original nested sampler. Simulation from the restricted region is done by constructing an extra density estimator of the constrained distribution (in θ)…
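
To fix ideas, here is a bare-bones Python sketch of the scheme (entirely my own toy: a crude Monte Carlo average stands in for the particle filter, and plain rejection from the prior replaces the authors' density-estimation step for sampling within the level set):

# nested sampling run on a *noisy* likelihood estimate
import numpy as np
rng = np.random.default_rng(1)
y = rng.normal(1.0, 1.0, 20)                     # observed data from a toy model
def loglik_hat(theta, m=50):
    # unbiased MC estimate of L(theta|y) for a toy model with a shared latent
    # variance inflation u ~ Exp(1), i.e. y_j | theta, u ~ N(theta, 1+u)
    u = rng.exponential(1.0, m)[:, None]
    dens = np.exp(-(y[None, :] - theta) ** 2 / (2 * (1 + u))) / np.sqrt(2 * np.pi * (1 + u))
    return np.log(dens.prod(axis=1).mean())
n_live, n_iter = 100, 300
live = rng.normal(0.0, 3.0, n_live)              # live points from the N(0, 3^2) prior
ll = np.array([loglik_hat(t) for t in live])     # their noisy log-likelihood estimates
logZ, X_prev = -np.inf, 1.0                      # evidence and prior-volume accumulators
for i in range(n_iter):
    worst = ll.argmin()
    X_new = np.exp(-(i + 1) / n_live)            # deterministic volume shrinkage
    logZ = np.logaddexp(logZ, np.log(X_prev - X_new) + ll[worst])
    threshold = ll[worst]
    while True:                                  # replace the worst point above the level set
        cand = rng.normal(0.0, 3.0)
        cand_ll = loglik_hat(cand)
        if cand_ll > threshold:
            break
    live[worst], ll[worst] = cand, cand_ll
    X_prev = X_new
logZ = np.logaddexp(logZ, np.log(X_prev) + np.log(np.exp(ll - ll.max()).mean()) + ll.max())
print("log evidence estimate:", logZ)

Note that, as in the paper, the level set is defined on the noisy estimate L̂ itself, which is what makes the joint “prior” depend on the realised Monte Carlo noise (and on y).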

“We have shown how using a Monte Carlo estimate over the livepoints not only results in an unbiased estimator of the Bayesian evidence Z, but also allows us to derive a formulation for a lower bound on the achievable variance in each iteration (…)”

As shown by the above quote, the authors insist on the unbiasedness of the particle approximation, but since nested sampling does not produce an unbiased estimator of the evidence Z, the point is somewhat moot. (I am also rather surprised by the reported lack of computing time benefit in running ABC-SMC.)
