Archive for Bayesian learning

step-dads with Bayesian design [One World ABC’minar, 21 March]

Posted in Books, Statistics, University life on March 18, 2024 by xi’an

The next One World ABC seminar is taking place (on-line, requiring pre-registration) on Thursday 21 March, 9:00am UK time, with Desi Ivanova (University of Oxford), speaking about Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design:

We develop a semi-amortized, policy-based, approach to Bayesian experimental design (BED) called Step-wise Deep Adaptive Design (Step-DAD). Like existing, fully amortized, policy-based BED approaches, Step-DAD trains a design policy upfront before the experiment. However, rather than keeping this policy fixed, Step-DAD periodically updates it as data is gathered, refining it to the particular experimental instance. This allows it to improve both the adaptability and the robustness of the design strategy compared with existing approaches.

(Which reminded me of George’s book on design in 2008.)
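To make the semi-amortized idea more concrete, here is a minimal sketch of such a loop, entirely my own toy construction rather than the authors’ code: the model is conjugate Normal so that the closed-form posterior variance can stand in for a (negative) expected-information-gain objective, the design policy is pre-trained by simulation from the prior, and, instead of staying frozen, it is refreshed with a few gradient steps at the current posterior each time a real observation comes in.

import math
import torch

sigma = 0.5   # observation noise std
T = 5         # number of experiments per run
policy = torch.nn.Sequential(          # maps (posterior mean, log variance) to a design
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

def rollout_variance(policy, mean0=0.0, var0=1.0, n_mc=256):
    """Average final posterior variance under the policy (lower = more informative)."""
    mean = torch.full((n_mc,), float(mean0))
    var = torch.full((n_mc,), float(var0))
    theta = mean0 + var0 ** 0.5 * torch.randn(n_mc)   # parameters drawn from the current prior
    for _ in range(T):
        state = torch.stack([mean, var.log()], dim=-1)
        x = torch.sin(policy(state).squeeze(-1))      # design feature sin(d)
        y = theta * x + sigma * torch.randn(n_mc)     # simulated observation
        prec = 1.0 / var + x ** 2 / sigma ** 2        # conjugate Gaussian update
        mean = (mean / var + x * y / sigma ** 2) / prec
        var = 1.0 / prec
    return var.mean()

def train(policy, steps, lr=1e-2, **kw):
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        rollout_variance(policy, **kw).backward()
        opt.step()

train(policy, steps=500)        # fully amortized pre-training, before any real data

mean, var, true_theta = 0.0, 1.0, 0.7    # deployment on one (simulated) experiment
for t in range(T):
    if t > 0:
        # the semi-amortized "step": refresh the policy at the current posterior
        # (for simplicity the refresh re-optimises a full-length rollout)
        train(policy, steps=50, mean0=mean, var0=var)
    state = torch.tensor([[mean, math.log(var)]])
    x = float(torch.sin(policy(state)))                  # chosen design feature
    y = true_theta * x + sigma * float(torch.randn(()))  # incoming observation
    prec = 1.0 / var + x ** 2 / sigma ** 2
    mean, var = (mean / var + x * y / sigma ** 2) / prec, 1.0 / prec
print(mean, var)    # posterior after the adaptive experiment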

is it necessary to learn summary statistics? [One World ABC seminar]

Posted in Books, pictures, Statistics, University life on November 19, 2023 by xi’an

Next week, on 30 November, at 9am (UK time), Yanzhi Chen (Cambridge) will give a One World ABC webinar on “Is It Necessary to Learn Summary Statistics for Likelihood-free Inference?”, a PMLR paper joint with Michael Gutmann and Adrian Weller:

Likelihood-free inference (LFI) is a set of techniques for inference in implicit statistical models. A longstanding question in LFI has been how to design or learn good summary statistics of data, but this might now seem unnecessary due to the advent of recent end-to-end (i.e. neural network-based) LFI methods. In this work, we rethink this question with a new method for learning summary statistics. We show that learning sufficient statistics may be easier than direct posterior inference, as the former problem can be reduced to a set of low-dimensional, easy-to-solve learning problems. This suggests explicitly decoupling summary statistics learning from posterior inference in LFI. Experiments on five inference tasks with different data types validate our hypothesis.
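To see what such decoupling can look like in practice, here is a deliberately crude stand-in for the paper’s method (a linear, semi-automatic-ABC-style regression, chosen for brevity and not taken from the paper): the statistics are learned first, through one low-dimensional least-squares problem per parameter, and only then handed over to a plain ABC rejection sampler.

import numpy as np

rng = np.random.default_rng(0)
n_obs, n_train, n_abc = 50, 20_000, 100_000

def simulate(mu, log_sigma):
    """Draw iid Normal samples, one row of n_obs observations per parameter value."""
    mu, sd = np.atleast_1d(mu), np.exp(np.atleast_1d(log_sigma))
    return rng.normal(mu[:, None], sd[:, None], size=(mu.size, n_obs))

# (1) summary-statistic learning, decoupled from inference: one linear
#     least-squares regression per parameter, on sorted (exchangeable) data
mu_tr, ls_tr = rng.normal(0, 2, n_train), rng.normal(0, 0.5, n_train)
X = np.hstack([np.sort(simulate(mu_tr, ls_tr), axis=1), np.ones((n_train, 1))])
beta, *_ = np.linalg.lstsq(X, np.column_stack([mu_tr, ls_tr]), rcond=None)

def summarise(x):
    x = np.sort(np.atleast_2d(x), axis=1)
    return np.hstack([x, np.ones((x.shape[0], 1))]) @ beta

# (2) posterior inference by plain ABC rejection with the learned statistics
x_obs = rng.normal(1.0, 1.5, (1, n_obs))            # "observed" data
s_obs = summarise(x_obs)
mu_p, ls_p = rng.normal(0, 2, n_abc), rng.normal(0, 0.5, n_abc)
dist = np.linalg.norm(summarise(simulate(mu_p, ls_p)) - s_obs, axis=1)
keep = dist <= np.quantile(dist, 0.01)               # keep the closest 1%
print(mu_p[keep].mean(), np.exp(ls_p[keep]).mean())  # crude ABC posterior means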

Bayesian learning

Posted in Statistics on May 4, 2023 by xi’an

“…many well-known learning-algorithms, such as those used in optimization, deep learning, and machine learning in general, can now be derived directly following the above scheme using a single algorithm”

The One World ABC webinar today was delivered by Emtiyaz Khan (RIKEN), about the Bayesian Learning Rule, following Khan and Rue’s 2021 arXival on Bayesian learning. (It had a great intro featuring a video of the speaker’s daughter learning about the purpose of a ukulele in her first year!) The paper argues for a Bayesian interpretation/version of gradient descent algorithms, starting from Zellner’s (1988, the year I first met him!) identity that the posterior is the solution to

\min_q \mathbb{E}_q[\ell(\theta,x)] + \mathrm{KL}(q\,\|\,\pi)

when ℓ is the negative log-likelihood and π the prior. This identity can be generalised to an arbitrary loss function (also dependent on the data) replacing the negative log-likelihood, with the posterior chosen within an exponential family, just as in variational Bayes, ending up with a posterior adapted to this target (in the KL sense). The optimal hyperparameter or pseudo-hyperparameter of this approximation can be recovered by some gradient algorithm, recovering as well stochastic gradient descent and Newton’s method. While constructing a prior out of a loss function would have pleased the late Herman Rubin, this is not what happens here, but rather an approach to deriving a generalised Bayes distribution within a parametric family, including mixtures of Gaussians. At some point in the talk, the uncertainty endemic to the Bayesian approach seeped back into the picture, but since most of the intuition came from machine learning, I was somewhat lost as to the nature of this uncertainty.
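As a quick sanity check of Zellner’s identity, on a toy Normal-Normal model of my own choosing rather than an example from the talk, one can minimise the above objective over a Gaussian family q = N(m,s²) by gradient descent and verify that the optimum coincides with the exact conjugate posterior:

import torch

torch.manual_seed(0)
sigma, n = 0.8, 30
x = 1.2 + sigma * torch.randn(n)              # data from N(theta=1.2, sigma^2)

m = torch.zeros((), requires_grad=True)       # variational mean
log_s = torch.zeros((), requires_grad=True)   # variational log standard deviation
opt = torch.optim.Adam([m, log_s], lr=0.05)
for _ in range(2000):
    s2 = torch.exp(2 * log_s)
    # E_q[-log p(x|theta)] in closed form, up to an additive constant
    nll = (((x - m) ** 2).sum() + n * s2) / (2 * sigma ** 2)
    # KL( N(m, s2) || N(0, 1) ), the prior being standard Normal
    kl = 0.5 * (s2 + m ** 2 - 1 - torch.log(s2))
    opt.zero_grad(); (nll + kl).backward(); opt.step()

post_prec = 1 + n / sigma ** 2                            # exact conjugate posterior
print(float(m), float(x.sum() / sigma ** 2 / post_prec))  # means should agree
print(float(torch.exp(2 * log_s)), 1 / post_prec)         # variances should agree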

the Bayesian learning rule [One World ABC’minar, 27 April]

Posted in Books, Statistics, University life on April 24, 2023 by xi’an

The next One World ABC seminar is taking place (on-line, requiring pre-registration) on 27 April, 9:30am UK time, with Mohammad Emtiyaz Khan (RIKEN-AIP, Tokyo) speaking about the Bayesian learning rule:

We show that many machine-learning algorithms are specific instances of a single algorithm called the Bayesian learning rule. The rule, derived from Bayesian principles, yields a wide range of algorithms from fields such as optimization, deep learning, and graphical models. This includes classical algorithms such as ridge regression, Newton’s method, and the Kalman filter, as well as modern deep-learning algorithms such as stochastic-gradient descent, RMSprop, and Dropout. The key idea in deriving such algorithms is to approximate the posterior using candidate distributions estimated by using natural gradients. Different candidate distributions result in different algorithms and further approximations to natural gradients give rise to variants of those algorithms. Our work not only unifies, generalizes, and improves existing algorithms, but also helps us design new ones.
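As one hedged illustration of how the choice of candidate distribution selects the algorithm (my own simplified reading of the rule on a quadratic toy loss, not the speaker’s code): with a Gaussian candidate N(m, S⁻¹) and delta approximations of the expected gradient and Hessian, the natural-parameter updates turn into a Newton-type iteration.

import numpy as np

A = np.array([[3.0, 0.5], [0.5, 1.0]])        # l(theta) = 0.5 theta'A theta - b'theta
b = np.array([1.0, -2.0])
grad = lambda th: A @ th - b                   # delta approximation: E_q[grad l] ~ grad l(m)
hess = lambda th: A                            # delta approximation: E_q[hess l] ~ hess l(m)

m, S, rho = np.zeros(2), np.eye(2), 1.0        # candidate N(m, S^{-1}); rho = 1 gives Newton
for _ in range(10):
    S = (1 - rho) * S + rho * hess(m)          # natural-gradient update of the precision
    m = m - rho * np.linalg.solve(S, grad(m))  # preconditioned (Newton-like) step for the mean
print(m, np.linalg.solve(A, b))                # both equal the minimiser A^{-1} b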

mini-Bayes in Nature [and Paris-Saclay]

Posted in Books, Running, Statistics, University life on February 7, 2023 by xi’an