Archive for summary statistics

Asymptotics of ABC when summaries converge at heterogeneous rates

Posted in pictures, Statistics, University life on November 21, 2023 by xi'an

We just posted a new arXival, jointly with Caroline Lawless, Judith Rousseau, and Robin Ryder. This is a significant component of Caroline's PhD thesis in Oxford, on which we started working during the first COVID lockdown. In this paper, we extend our results with David Frazier, Gael Martin (with both of whom I'll soon be reunited!), and Judith, published in Biometrika in 2018, to the more challenging case where different components of the summary statistic vector converge to their respective means at different rates, with some possibly not converging at all. While this sounds impossible (!), we do prove consistency of the ABC posterior under such heterogeneous rates.

Wentao Li and Paul Fearnhead (also in Biometrika, in 2018) reduce the curse of dimensionality of the summary statistics by showing, in the specific case of asymptotically normal summary statistics concentrating at the same rate, that a local linear post-processing step leads to a significant improvement in the theoretical behaviour of the ABC posterior. Given this focus on reducing the impact of the dimension of the summary statistics, it is important to study the efficiency of the post-processing step in a context where the summary statistics are not as well behaved. Perhaps surprisingly, we show that the significant improvement due to local linear post-processing persists even when the summary statistics have heterogeneous behaviour. Most interestingly, the number of summary statistics which converge at the fast rate has no impact on the rate of posterior concentration or on the shape of the ABC posterior (provided it exceeds the dimension of the parameter).
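
For readers who want to see what the local linear post-processing step amounts to in practice, here is a minimal sketch of rejection ABC followed by a local linear regression adjustment on a toy Gaussian location model; the model, summaries, tolerance, and sample sizes are illustrative assumptions of mine, not the setting of either paper.

import numpy as np

rng = np.random.default_rng(0)

# toy data: n observations from N(theta0, 1), with two summaries
theta0, n = 1.5, 100
y_obs = rng.normal(theta0, 1.0, size=n)
s_obs = np.array([y_obs.mean(), np.median(y_obs)])

# reference table of (parameter, summaries) pairs simulated from the prior
N = 20000
theta = rng.normal(0.0, 3.0, size=N)
sims = rng.normal(theta[:, None], 1.0, size=(N, n))
s = np.column_stack([sims.mean(axis=1), np.median(sims, axis=1)])

# plain rejection ABC: keep the 1% of draws whose summaries are closest
dist = np.linalg.norm(s - s_obs, axis=1)
keep = dist <= np.quantile(dist, 0.01)
theta_acc, s_acc = theta[keep], s[keep]

# local linear post-processing: regress theta on the summaries among the
# accepted draws and shift each draw to the observed summary value
X = np.column_stack([np.ones(keep.sum()), s_acc - s_obs])
beta, *_ = np.linalg.lstsq(X, theta_acc, rcond=None)
theta_adj = theta_acc - (s_acc - s_obs) @ beta[1:]

print("rejection ABC posterior mean:", theta_acc.mean())
print("adjusted ABC posterior mean: ", theta_adj.mean())

The adjustment simply regresses the accepted parameter draws on their summaries and shifts them to the observed summary value, in the spirit of the local linear post-processing discussed above.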

is it necessary to learn summary statistics? [One World ABC seminar]

Posted in Books, pictures, Statistics, University life on November 19, 2023 by xi'an

Next week, on 30 November, at 9am (UK time), Yanzhi Chen (Cambridge) will give a One World ABC webinar on “Is It Necessary to Learn Summary Statistics for Likelihood-free Inference?”, a PMLR paper joint with Michael Gutmann and Adrian Weller:

Likelihood-free inference (LFI) is a set of techniques for inference in implicit statistical models. A longstanding question in LFI has been how to design or learn good summary statistics of data, but this might now seem unnecessary due to the advent of recent end-to-end (i.e. neural network-based) LFI methods. In this work, we rethink this question with a new method for learning summary statistics. We show that learning sufficient statistics may be easier than direct posterior inference, as the former problem can be reduced to a set of low-dimensional, easy-to-solve learning problems. This suggests us to explicitly decouple summary statistics learning from posterior inference in LFI. Experiments on five inference tasks with different data types validate our hypothesis.


learning ABC summaries with autoencoders [webinar]

Posted in Statistics, University life on October 12, 2023 by xi'an

The next One World ABC seminar will take place this Thursday, September 28, at 9am UK time and on-line.

Speaker: Carlo Albert, Swiss Federal Institute of Aquatic Science and Technology
Title: Learning summary statistics for Bayesian inference with Autoencoders
Abstract: In order for ABC to give accurate results and be efficient, we need summary statistics that retain most of the parameter-related information and cancel out most of the noise, respectively. For many scientific applications, we need strictly more summary statistics than model parameters to reach a satisfactory approximation of the posterior. Therefore, we propose to use a latent representation of deep neural networks based on Autoencoders as summary statistics. To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information on the noise that has been used to generate the training data. We validate the approach empirically on two types of stochastic models, one being a member of the exponential family, the other one not.
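
To fix ideas on the construction described in the abstract, here is a minimal sketch of an autoencoder whose decoder is also fed the noise used to simulate the data, so that the latent code only needs to carry parameter-related information; the toy location model, network sizes, and training details are assumptions of mine, not the speaker's implementation.

import torch
import torch.nn as nn

n_obs, latent_dim = 50, 2

# encoder maps a simulated data set to a low-dimensional latent code,
# decoder reconstructs the data from the code *and* the simulation noise
encoder = nn.Sequential(nn.Linear(n_obs, 64), nn.ReLU(), nn.Linear(64, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim + n_obs, 64), nn.ReLU(), nn.Linear(64, n_obs))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

def simulate(batch):
    # toy model: x_i = theta + noise_i, with theta ~ N(0, 3^2)
    theta = 3.0 * torch.randn(batch, 1)
    noise = torch.randn(batch, n_obs)
    return theta + noise, noise

for step in range(2000):
    x, noise = simulate(256)
    s = encoder(x)                                  # candidate summary statistics
    x_hat = decoder(torch.cat([s, noise], dim=1))   # decoder sees the noise explicitly
    loss = ((x - x_hat) ** 2).mean()                # reconstruction error
    opt.zero_grad()
    loss.backward()
    opt.step()

# after training, encoder(x) serves as a low-dimensional candidate summary for ABC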

a versatile alternative to ABC

Posted in Books, Statistics on July 25, 2023 by xi'an

“We introduce the Fixed Landscape Inference MethOd, a new likelihood-free inference method for continuous state-space stochastic models. It applies deterministic gradient-based optimization algorithms to obtain a point estimate of the parameters, minimizing the difference between the data and some simulations according to some prescribed summary statistics. In this sense, it is analogous to Approximate Bayesian Computation (ABC). Like ABC, it can also provide an approximation of the distribution of the parameters.”

I quickly read this arXival by Monard et al. that is presented as an alternative to ABC, while outside a Bayesian setup. The central concept is that a deterministic gradient descent provides an optimal parameter value when replacing the likelihood with a distance between the observed data and synthetic data simulated under the current value of the parameter (in the descent). In order to operate the descent, the synthetic data are assumed to be available as a deterministic transform of the parameter value and of a vector of basic random objects, e.g. Uniforms. In order to make the target function differentiable, this Uniform vector is kept fixed for the entire gradient descent. A puzzling aspect of the paper is that it seems to compare the (empirical) distribution of the resulting estimator with a posterior distribution, unless the comparison is with the (empirical) distribution of the Bayes estimators. The variability due to the choice of the fixed vector of basic random objects does not seem to be taken into account either. Furthermore, the method is presented as able to handle several models at once, which I find difficult to fathom as (a) the random vectors behind each model necessarily vary and (b) there is no apparent penalisation for complexity.
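
To make the mechanism concrete, here is a minimal sketch of the fixed-landscape idea on a toy location-scale model: the basic random draws are frozen once, so the summary-based distance becomes a deterministic function of the parameter that a standard optimiser can minimise. The model, summaries, and choice of optimiser are assumptions of mine, not the authors' implementation.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# observed data from a toy location-scale model: y_i = mu + sigma * z_i
mu0, sigma0, n = 2.0, 1.5, 200
y_obs = mu0 + sigma0 * rng.standard_normal(n)
s_obs = np.array([y_obs.mean(), y_obs.std()])

# the basic random objects are drawn once and kept fixed
z_fixed = rng.standard_normal(n)

def objective(theta):
    mu, log_sigma = theta
    y_sim = mu + np.exp(log_sigma) * z_fixed        # deterministic transform of theta
    s_sim = np.array([y_sim.mean(), y_sim.std()])
    return np.sum((s_sim - s_obs) ** 2)             # distance between summary statistics

# BFGS with numerical gradients here; a fuller implementation would use
# automatic differentiation of the simulator
res = minimize(objective, x0=np.array([0.0, 0.0]), method="BFGS")
print("point estimate of (mu, sigma):", res.x[0], np.exp(res.x[1]))

Re-running the optimisation with a different frozen vector z_fixed gives a different point estimate, which is precisely the source of variability mentioned above.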

dynamic mixtures and frequentist ABC

Posted in Statistics on November 30, 2022 by xi'an

This early morning in NYC, I spotted this new arXival by Marco Bee (whom I know from the time he was writing his PhD with my late friend Bernhard Flury) and found he has been working for a while on ABC-related problems. The mixture model he considers therein is a form of mixture of experts, where the weights of the mixture components are not constant but are themselves functions of the entry, taking values in (0,1). This model was introduced by Frigessi, Haug and Rue in 2002 and is often used as a benchmark for ABC methods, since it is missing its normalising constant, as in e.g.

f(x) \propto p(x) f_1(x) + (1-p(x)) f_2(x)

even with all entries being standard pdfs and cdfs. Rather than using a (costly) numerical approximation of the “constant” (as a function of all unknown parameters involved), Marco follows the approximate maximum likelihood approach of my Warwick colleagues, Javier Rubio [now at UCL] and Adam Johansen. It is based on the [SAME] remark that, under a uniform prior and using an approximation to the actual likelihood, the MAP estimator is also the MLE for that approximation. The approximation is ABC-esque in that a pseudo-sample is generated from the true model (attached to a simulation of the parameter) and the pair is accepted if the pseudo-sample stands close enough to the observed sample. The paper proposes to use the Cramér-von Mises distance, which only involves ranks. Given this “posterior” sample, an approximation of the posterior density is constructed and then numerically optimised. From a frequentist viewpoint, a direct estimate of the mode would be preferable. From my Bayesian perspective, this sounds like a step backwards, given that, once a posterior sample is available, reconnecting with an approximate MLE does not sound highly compelling.
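
For concreteness, here is a minimal sketch of the approximate-MLE-via-ABC recipe as I understand it, on a toy dynamic mixture with an intractable normalising constant; the component densities, weight function, prior range, and tuning constants are illustrative assumptions of mine rather than the paper's setup.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def g(x, mu_w):
    # unnormalised dynamic-mixture density p(x) f1(x) + (1 - p(x)) f2(x)
    p = stats.cauchy.cdf(x, loc=mu_w, scale=1.0)     # weight function with values in (0,1)
    f1 = stats.lognorm.pdf(x, s=1.0)                 # light-tailed component
    f2 = stats.pareto.pdf(x, b=2.0)                  # heavy-tailed component
    return p * f1 + (1.0 - p) * f2

def simulate(mu_w, size):
    # rejection sampler: propose from the equal mixture of f1 and f2, whose
    # unnormalised sum dominates g, and accept with probability g / (f1 + f2)
    out = []
    while len(out) < size:
        comp = rng.integers(2, size=size)
        x = np.where(comp == 0,
                     stats.lognorm.rvs(s=1.0, size=size, random_state=rng),
                     stats.pareto.rvs(b=2.0, size=size, random_state=rng))
        accept = rng.uniform(size=size) < g(x, mu_w) / (stats.lognorm.pdf(x, s=1.0)
                                                        + stats.pareto.pdf(x, b=2.0))
        out.extend(x[accept])
    return np.array(out[:size])

# "observed" sample, then ABC under a uniform prior with the Cramér-von Mises distance
x_obs = simulate(2.0, 300)
prior_draws = rng.uniform(0.0, 5.0, size=1000)
dists = np.array([stats.cramervonmises_2samp(simulate(m, 300), x_obs).statistic
                  for m in prior_draws])
keep = prior_draws[dists <= np.quantile(dists, 0.05)]

# approximate MLE = mode of a kernel density estimate of the accepted draws
kde = stats.gaussian_kde(keep)
grid = np.linspace(0.0, 5.0, 500)
print("approximate MLE of the weight location:", grid[np.argmax(kde(grid))])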