## Archive for ABC

## ABC intro for Astrophysics

Posted in Books, Kids, Mountains, R, Running, Statistics, University life with tags ABC, Approximate Bayesian computation, Autrans, Bayesian foundations, Bayesian methodology, Book, computational astrophysics, review, Statistics for Astrophysics, summer course, survey, Vercors on October 15, 2018 by xi'an

**T**oday I received in the mail a copy of the short book published by EDP Sciences after the courses we gave last year at the astrophysics summer school in Autrans. It contains a quick introduction to ABC extracted from my notes (which I still hope to turn into a book!), as well as a longer coverage of Bayesian foundations and computations by David Stenning and David van Dyk.

## simulated summary statistics [in the sky]

Posted in Statistics with tags ABC, approximate likelihood, Bayes factor, computer-simulated model, cosmology, cosmostats, de-biasing, urbi et orbi on October 10, 2018 by xi'an

**T**hinking it was related to ABC, although in the end it is not!, I recently read a baffling cosmology paper by Jeffrey and Abdalla. The data **d** there denotes an observed (summary) statistic, while the summary statistic is a transform of the parameter, μ(θ), which calibrates the distribution of the data. With nuisance parameters. More intriguing to me is the sentence that the correct likelihood of **d** is indexed by a simulated version of μ(θ), μ'(θ), rather than by μ(θ). Which seems to assume that the pseudo- or simulated data can be produced for the same value of the parameter as the observed data. The rest of the paper remains incomprehensible to me, as I do not understand how the simulated versions are simulated.

“…the corrected likelihood is more than a factor of exp(30) more probable than the uncorrected. This is further validation of the corrected likelihood; the model (i.e. the corrected likelihood) shows a better goodness-of-fit.”

The authors further resort to Bayes factors to compare corrected and uncorrected versions of the likelihoods, which leads (see quote) to picking the corrected version. But are they comparable as such, given that the corrected version involves simulations that are treated as supplementary data? As noted by the authors, the Bayes factor unsurprisingly goes to one as the number M of simulations grows to infinity, as supported by the graph below.

## Implicit maximum likelihood estimates

Posted in Statistics with tags ABC, Approximate Bayesian computation, GANs, Hyvärinen score, Kullback-Leibler divergence, likelihood-free methods, maximum likelihood estimation, NIPS 2018, Peter Diggle, untractable normalizing constant, Wasserstein distance on October 9, 2018 by xi'an

**A**n 'Og's reader pointed me to this paper by Li and Malik, which made it to arXiv after not making it to NIPS. While the NIPS reviews were not particularly informative and strongly discordant, the authors point out in the comments that they are available for the sake of promoting discussion. (As made clear in earlier posts, I am quite supportive of this attitude! *Disclaimer: I was not involved in an evaluation of this paper, neither for NIPS nor for another conference or journal!!*) Although the paper does not seem to mention ABC in the setting of implicit likelihoods and generative models, there is a reference to the early (1984) paper by Peter Diggle and Richard Gratton that is often seen as the ancestor of ABC methods. The authors point out numerous issues with solutions proposed for parameter estimation in such implicit models. For instance, for GANs, they signal that "minimizing the Jensen-Shannon divergence or the Wasserstein distance between the empirical data distribution and the model distribution does not necessarily minimize the same between the true data distribution and the model distribution." (Not mentioning the particular difficulty with Bayesian GANs.) Their own solution is the implicit maximum likelihood estimator, which picks the value of the parameter θ bringing a simulated sample the closest to the observed sample. Closest in the sense of the Euclidean distance between both samples. Or between the minimum of several simulated samples and the observed sample. (The modelling seems to imply the availability of n>1 observed samples.)
They advocate using a stochastic gradient descent approach for finding the optimal parameter θ, which presupposes that the dependence between θ and the simulated samples is somewhat differentiable. (And this does not account for using a min, which would make differentiation close to impossible.) The paper then meanders into a lengthy discussion as to whether maximising the likelihood makes sense, with a rather naïve view on why using the empirical distribution in a Kullback-Leibler divergence does not make sense! In my opinion, what does not make sense is considering the finite-sample approximation to the Kullback-Leibler divergence with the true distribution.
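As I understand it, the implicit maximum likelihood estimator can be sketched as follows, on a toy Gaussian simulator of my own devising; the grid search, the sorting of samples (to make the Euclidean distance invariant to the ordering of the draws), and all names are my assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta, m, rng):
    # implicit model: easy to simulate from, no closed-form likelihood assumed
    return rng.normal(loc=theta, scale=1.0, size=m)

def imle(observed, thetas, m=200, k=10):
    """Return the candidate theta whose simulated sample comes closest to the
    observed sample in Euclidean distance, keeping the minimum over k draws."""
    obs = np.sort(observed)  # sorting makes the distance permutation-invariant
    best_theta, best_dist = None, np.inf
    for theta in thetas:
        d = min(np.linalg.norm(np.sort(simulator(theta, m, rng)) - obs)
                for _ in range(k))
        if d < best_dist:
            best_theta, best_dist = theta, d
    return best_theta

obs = rng.normal(loc=2.0, scale=1.0, size=200)
theta_hat = imle(obs, thetas=np.linspace(-5.0, 5.0, 201))
```

In this cartoon the grid search stands in for the stochastic gradient descent of the paper, which of course requires the simulator output to be (approximately) differentiable in θ.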

## optimal proposal for ABC

Posted in Statistics with tags ABC, ABC-PMC, ABC-SMC, adaptive importance sampling, Bayesian Analysis, computational astrophysics, effective sample size, Ewan Cameron, kernel density estimator, Kullback-Leibler divergence, mixtures of distributions on October 8, 2018 by xi'an

**A**s pointed out by Ewan Cameron in a recent c'Og'ment, Justin Alsing, Benjamin Wandelt, and Stephen Feeney have arXived last August a paper where they discuss an optimal proposal density for ABC-SMC and ABC-PMC. Optimality being understood as maximising the effective sample size.

“Previous studies have sought kernels that are optimal in the (…) Kullback-Leibler divergence between the proposal KDE and the target density.”

The effective sample size for ABC-SMC is actually the regular ESS multiplied by the fraction of accepted simulations. Which surprisingly converges to the ratio

**E**[q(θ)/π(θ)|**D**]/**E**[π(θ)/q(θ)|**D**]

under the (true) posterior. (Here q(θ) is the importance density and π(θ) the prior density.) When optimised in q, this usually produces an implicit equation whose solution is a form of geometric mean between posterior and prior. The paper looks at approximate ways to find this optimum. Especially at an upper bound on q. Something I do not understand from the simulations is that the starting point seems to be the plain geometric mean between posterior and prior, in a setting where the posterior is supposedly unavailable… Actually the paper is silent on how the optimum can be approximated in practice, for the very reason I just mentioned. Apart from using a non-parametric or mixture estimate of the posterior after each SMC iteration, which may prove extremely costly when processed through the optimisation steps. However, an interesting side outcome of these simulations is that the above geometric mean does much better than the posterior itself when considering the effective sample size.
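To make the criterion concrete, here is a minimal sketch (a toy setting of my own, not the authors' code) of the ABC-SMC effective sample size as the usual importance-sampling ESS of the weights π(θ)/q(θ), multiplied by the acceptance fraction; when the proposal is the prior, the weights are constant and the plain ESS part equals the Monte Carlo sample size:

```python
import numpy as np

rng = np.random.default_rng(1)

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def ess_abc(theta, q_pdf, prior_pdf, accepted):
    # usual importance-sampling ESS of the weights prior/proposal,
    # multiplied by the fraction of ABC-accepted simulations
    w = prior_pdf(theta) / q_pdf(theta)
    return (w.sum() ** 2 / (w ** 2).sum()) * accepted.mean()

# toy ABC setting: y ~ N(theta, 1) with n_obs observations, prior N(0, 3),
# acceptance based on the distance between simulated and observed means
obs_mean, n_obs, eps, N = 1.5, 20, 0.2, 5000
prior_pdf = lambda t: normal_pdf(t, 0.0, 3.0)

theta = rng.normal(0.0, 3.0, size=N)                 # proposal = prior
sim_means = rng.normal(theta, 1.0 / np.sqrt(n_obs))  # simulated summary statistic
accepted = np.abs(sim_means - obs_mean) < eps
ess_prior = ess_abc(theta, prior_pdf, prior_pdf, accepted)
```

Comparing `ess_prior` against the same quantity under other proposals q (say, a posterior approximation, or a density interpolating between prior and posterior) is then a matter of swapping the sampler and `q_pdf` in the last four lines.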

## ABC in print

Posted in Books, pictures, Statistics, University life with tags ABC, Approximate Bayesian computation, CRC Press, handbook, Handbook of Approximate Bayesian computation, handbook of mixture analysis, likelihood-free methods, Mark Beaumont, Scott Sisson, Yanan Fan on September 5, 2018 by xi'an

**T**he CRC Press Handbook of ABC is now out, after a rather long delay [the first version of our model choice chapter was written in 2015!] due to some late contributors. Which is why I did not spot it at JSM 2018. As announced a few weeks ago, our Handbook of Mixture Analysis is soon to be published as well. (Not that I necessarily advocate the individual purchase of these costly volumes!, especially given that most chapters are available on-line.)

## ABC for vampires

Posted in Books, pictures, Statistics, University life with tags ABC, ABCpy, Bhattacharya distance, likelihood-free methods, platelet, Python on September 4, 2018 by xi'an

**R**itabrata Dutta (Warwick), along with coauthors including Antonietta Mira, published last week a paper in Frontiers in Physiology about using ABC for deriving the posterior distribution of the parameters of a dynamic blood (platelet) deposition model constructed by Bastien Chopard, the second author. While based on only five parameters, the model does not enjoy a closed-form likelihood, and even the simulation of a new platelet deposit takes about 10 minutes. The paper uses the simulated annealing ABC version, due to Albert, Künsch, and Scheidegger (2014), which relies on a sequence of Metropolis kernels associated with a decreasing sequence of tolerances, and claims better efficiency at reaching a stable solution. It also relies on ABCpy, a Python package written by Ritabrata Dutta, for various aspects of the ABC analysis. One feature of interest is the use of 24 summary statistics to conduct the inference on the 5 model parameters, a ratio of 24 to 5 that could possibly be improved by a variable selection tool such as random forests. Which would also avoid the choice of a specific loss function, here the Bhattacharya distance (which sounds like an entropy distance in the normal case).
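A minimal sketch of the tolerance-annealing idea, namely a Metropolis kernel run at each level of a decreasing tolerance sequence, on a toy Gaussian model of my own; this is only a cartoon of the approach, not the Albert, Künsch and Scheidegger algorithm (nor ABCpy code), and all names and the rejection-based initialisation are my assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def sabc(obs_stat, sim_stat, prior_sample, prior_pdf, eps_schedule,
         step=0.5, sweeps=200):
    # initialise by plain rejection at the loosest tolerance
    theta = prior_sample()
    while abs(sim_stat(theta) - obs_stat) >= eps_schedule[0]:
        theta = prior_sample()
    trace = []
    for eps in eps_schedule:          # tolerance decreases along the schedule
        for _ in range(sweeps):       # symmetric random-walk Metropolis kernel
            prop = theta + step * rng.normal()
            # accept when the simulated statistic falls within the current
            # tolerance and the prior ratio passes the Metropolis test
            if (abs(sim_stat(prop) - obs_stat) < eps
                    and rng.uniform() < prior_pdf(prop) / prior_pdf(theta)):
                theta = prop
            trace.append(theta)
    return np.array(trace)

# toy model: statistic = sample mean of 50 N(theta, 1) observations
sim_stat = lambda th: rng.normal(th, 1.0 / np.sqrt(50))
prior_pdf = lambda th: np.exp(-0.5 * (th / 5.0) ** 2)   # N(0, 5), unnormalised
trace = sabc(2.0, sim_stat, lambda: rng.normal(0.0, 5.0), prior_pdf,
             eps_schedule=[2.0, 1.0, 0.5, 0.25, 0.1])
```

The samples from the final, tightest tolerance level then serve as the ABC posterior draws, which is where the claimed efficiency at reaching a stable solution would show up relative to a fixed-tolerance chain.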

## asymptotic properties of ABC now appeared

Posted in Books, Statistics, University life with tags ABC, ABC convergence, Approximate Bayesian computation, approximate Bayesian inference, Biometrika, intractable likelihood, summary statistics on September 1, 2018 by xi'an

**O**ur paper with David Frazier, Gael Martin and Judith Rousseau has appeared in print in Biometrika, Volume 105, Issue 3, 1 September 2018, Pages 593–607, almost exactly two years after it was submitted. I am quite pleased with the final version, though, and grateful for the editorial input, as the paper clearly characterises the connection between the tolerance level ε and the convergence rate of the summary statistic to its parameter-identifying asymptotic mean. Asymptotic in the sample size, that is.