Archive for Approximate Bayesian computation

ABC with path signatures [One World ABC seminar, 2/2/23]

Posted in Books, pictures, Running, Statistics, Travel, University life on January 29, 2023 by xi'an

The next One World ABC seminar is by Joel Dyer (Oxford) at 1:30pm (UK time) on 02 February.

Title: Approximate Bayesian Computation with Path Signatures

Abstract: Simulation models often lack tractable likelihood functions, making likelihood-free inference methods indispensable. Approximate Bayesian computation (ABC) generates likelihood-free posterior samples by comparing simulated and observed data through some distance measure, but existing approaches are often poorly suited to time series simulators, for example due to an independent and identically distributed data assumption. In this talk, we will discuss our work on the use of path signatures in ABC as a means of handling the sequential nature of time series data of different kinds. We will begin by discussing popular approaches to ABC and how they may be extended to time series simulators. We will then introduce path signatures, and discuss how signatures naturally lead to two instances of ABC for time series simulators. Finally, we will demonstrate that the resulting signature-based ABC procedures can produce competitive Bayesian parameter inference for simulators generating univariate, multivariate, irregularly spaced, and even non-Euclidean sequences.

Reference: J. Dyer, P. Cannon, S. M. Schmon (2022). Approximate Bayesian Computation with Path Signatures. arXiv preprint arXiv:2106.12555.
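
For the curious, here is a minimal sketch (my own illustration, not the speaker's code) of the basic recipe: compute a truncated signature of each path and run plain rejection ABC on the Euclidean distance between signatures. The helpers simulate and prior_sample are hypothetical stand-ins for a time-series simulator and its prior; dedicated libraries such as iisignature or esig compute signatures to higher depths.

```python
import numpy as np

def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path,
    given as an (n_steps, d) array."""
    inc = np.diff(path, axis=0)                  # segment increments, (n-1, d)
    level1 = inc.sum(axis=0)                     # level 1: total increments
    # level 2: S^{ij} = int (X^i_t - X^i_0) dX^j_t, exact for linear segments
    start = np.cumsum(np.vstack([np.zeros(path.shape[1]), inc[:-1]]), axis=0)
    level2 = start.T @ inc + 0.5 * inc.T @ inc   # (d, d) matrix of iterated integrals
    return np.concatenate([level1, level2.ravel()])

def abc_rejection(observed, simulate, prior_sample, n_sims=10_000, keep_frac=0.01):
    """Rejection ABC with signatures as summaries (a sketch; simulate(theta)
    must return a path array of the same shape as `observed`)."""
    s_obs = signature_depth2(observed)
    thetas = np.array([prior_sample() for _ in range(n_sims)])
    dists = np.array([np.linalg.norm(signature_depth2(simulate(th)) - s_obs)
                      for th in thetas])
    return thetas[dists <= np.quantile(dists, keep_frac)]  # keep the closest draws
```

The appeal of the signature here is that it is a graded summary of a path, insensitive to time reparametrisation, so the same distance applies to univariate, multivariate, or irregularly sampled series without any iid assumption.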

Adversarial Bayesian Simulation [One World ABC’minar]

Posted in Statistics on November 15, 2022 by xi'an

The next One World ABC webinar will take place on 24 November, at 1:30pm (UK time, GMT), and will be presented by Yi Yuexi Wang (University of Chicago) on “Adversarial Bayesian Simulation”, available on arXiv. [The link to the webinar is available to those who have registered.]

In the absence of explicit or tractable likelihoods, Bayesians often resort to approximate Bayesian computation (ABC) for inference. In this talk, we will cover two summary-free ABC approaches, both inspired by adversarial learning. The first one adopts a classification-based KL estimator to quantify the discrepancy between real and simulated datasets. We consider the traditional accept/reject kernel as well as an exponential weighting scheme which does not require the ABC acceptance threshold. In the second paper, we develop a Bayesian GAN (B-GAN) sampler that directly targets the posterior by solving an adversarial optimization problem. B-GAN is driven by a deterministic mapping learned on the ABC reference by conditional GANs. Once the mapping has been trained, iid posterior samples are obtained by filtering noise at a negligible additional cost. We propose two post-processing local refinements using (1) data-driven proposals with importance reweighting, and (2) variational Bayes. For both methods, we support our findings with frequentist-Bayesian theoretical results and highly competitive performance in empirical analysis. (Joint work with Veronika Rockova)
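
As a rough illustration of the first ingredient (my sketch, not the authors' implementation), the discrepancy between real and simulated datasets can be estimated with a probabilistic classifier, since its fitted log-odds approximate the log density ratio appearing in the KL divergence:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def classifier_kl(x_obs, x_sim):
    """Estimate KL(p_obs || p_sim) from two samples: train a classifier to
    separate them, then average the estimated log density ratio over the
    observed sample. Any probabilistic classifier could replace the
    logistic regression."""
    X = np.vstack([x_obs, x_sim])
    y = np.concatenate([np.ones(len(x_obs)), np.zeros(len(x_sim))])
    clf = LogisticRegression(max_iter=1_000).fit(X, y)
    log_odds = clf.decision_function(x_obs)      # log P(obs|x) / P(sim|x)
    log_odds -= np.log(len(x_obs) / len(x_sim))  # remove class-size bias
    return log_odds.mean()                       # Monte Carlo KL estimate
```

In the ABC scheme sketched in the abstract, this estimate would either be compared with an acceptance threshold or turned into exponential weights for the simulated parameters, dispensing with the threshold altogether.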

Frontiers in Machine Learning and Economics: Methods and Applications

Posted in Statistics on October 7, 2022 by xi'an

learning optimal summary statistics

Posted in Books, pictures, Statistics on July 27, 2022 by xi'an

“Despite the pursuit of the holy grail of sufficient statistics, most applications will have to settle for the weakest concept of optimal statistics.”

Quiz #1: How does Bayes sufficiency [which preserves the posterior density] differ from sufficiency [which preserves the likelihood function]?

Quiz #2: How does Fisher-information sufficiency [which preserves the information matrix] differ from standard sufficiency [which preserves the likelihood function]?

Read a recent arXival by Till Hoffmann and Jukka-Pekka Onnela that I frankly found most puzzling… Maybe due to the Norman train where I was travelling being particularly noisy.

The argument in the paper is to find a summary statistic that minimises the [empirical] expected posterior entropy, which equivalently means minimising the expected Kullback-Leibler divergence to the full posterior. And maximising the mutual information between the parameters θ and the summaries t(·). And maximising the expected surprise. Which obviously requires breaking the sample into iid components and hence considering the gain brought by a specific transform of a single observation. The paper also contains a long comparison with other criteria for choosing summaries.
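
In symbols, the stated equivalences follow from standard information identities (my notation, not necessarily the paper's):

```latex
\mathbb{E}_{t}\big[H(\theta \mid t)\big] = H(\theta) - I(\theta; t),
\qquad
\mathbb{E}_{x}\Big[\mathrm{KL}\big(p(\theta \mid x)\,\big\|\,p(\theta \mid t(x))\big)\Big]
  = I(\theta; x) - I(\theta; t(x)).
```

Since H(θ) and I(θ;x) do not depend on the choice of t, minimising the expected posterior entropy, minimising the expected Kullback-Leibler divergence to the full posterior, and maximising the mutual information all single out the same summaries.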

“Minimizing the posterior entropy would discard the sufficient statistic t such that the posterior is equal to the prior–we have not learned anything from the data.”

Furthermore, the expected aspect of the criterion takes us away from a proper Bayes analysis (and exhibits artifacts such as the one above), which somehow makes me question the relevance of comparing entropies under different distributions. It took me a long while to realise that the collection of summaries was set by the user and quite limited. Like a neural network representation of the posterior mean. And the intractable posterior is further approximated by a closed-form function of the parameter θ and of the summary t(·). Using there a neural density estimator. Or a mixture density network.
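
For concreteness, here is a minimal sketch (in PyTorch, hypothetical, not the authors' code) of such a mixture density network: it maps a summary t(x) to the weights, means, and scales of a Gaussian mixture over a scalar θ, and the training loss, namely the average negative log-density over simulated (θ, t) pairs, is precisely a Monte Carlo estimate of the expected posterior entropy.

```python
import torch
import torch.nn as nn

class MDN(nn.Module):
    """Mixture density network q(theta | t): summary in, K-component
    Gaussian mixture over a scalar parameter out."""
    def __init__(self, summary_dim, n_components=3, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(summary_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3 * n_components),  # logits, means, log-scales
        )

    def log_prob(self, t, theta):
        logits, mu, log_sd = self.net(t).chunk(3, dim=-1)
        log_w = torch.log_softmax(logits, dim=-1)
        comp = torch.distributions.Normal(mu, log_sd.exp())
        # mixture log-density: logsumexp of weighted component log-densities
        return torch.logsumexp(log_w + comp.log_prob(theta.unsqueeze(-1)), dim=-1)

def train(mdn, summaries, thetas, epochs=200, lr=1e-3):
    """Minimise the Monte Carlo estimate of the expected posterior entropy,
    i.e. the average of -log q(theta | t) over simulated pairs."""
    opt = torch.optim.Adam(mdn.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = -mdn.log_prob(summaries, thetas).mean()
        loss.backward()
        opt.step()
    return mdn
```

Comparing candidate summaries then amounts to comparing the achieved losses, which is where the dependence on the user-chosen collection of summaries bites.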

potato tomato [w/o ABC]

Posted in Books, pictures, Statistics on July 8, 2022 by xi'an

“The default parameter of KaKs_Calculator was set to estimate the Ka/Ks values, which means that the Ka/Ks value was the average of the output from 15 available algorithms comprising 7 original approximate methods and one maximum likelihood method.”

In their analysis of the phylogenetic evolution of the potato species, the Nature authors resort to a multiple analysis (à la EJ!) in the above sense, by averaging several results. I remain puzzled by the approach that treats all methods on an equal basis, without trying to ascertain precision and bias by X validation or other tools. (Approximate Bayesian computation was not used as one of the methods.)
