Archive for intractable likelihood

MCMC without evaluating the target [aatB-mMC joint seminar, 24 April]

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , on April 11, 2024 by xi'an

On 24 April 2024, Guanyang Wang (Rutgers University, visiting ESSEC) will give a joint All about that Bayes – mostly Monte Carlo seminar on

MCMC when you do not want to evaluate the target distribution

In sampling tasks, it is common for target distributions to be known up to a normalizing constant. However, in many situations, evaluating even the unnormalized distribution can be costly or infeasible. This issue arises in scenarios such as sampling from the Bayesian posterior for large datasets and the ‘doubly intractable’ distributions. We provide a way to unify various MCMC algorithms, including several minibatch MCMC algorithms and the exchange algorithm. This framework not only simplifies the theoretical analysis of existing algorithms but also creates new algorithms. Similar frameworks exist in the literature, but they concentrate on different objectives.

The talk takes place at 4pm CEST, in room 8 at PariSanté Campus, Paris 15.

Bayesian model averaging with exact inference of likelihood- free scoring rule posteriors [23/01/2024, PariSanté campus]

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , on January 16, 2024 by xi'an

A special “All about that Bayes” seminar in Paris (PariSanté campus, 23/01, 16:00-17:00) next week by my Warwick collegue and friend Rito:

Bayesian Model Averaging with exact inference of likelihood- free Scoring Rule Posteriors

Ritabrata Dutta, University of Warwick

A novel application of Bayesian Model Averaging to generative models parameterized with neural networks (GNN) characterized by intractable likelihoods is presented. We leverage a likelihood-free generalized Bayesian inference approach with Scoring Rules. To tackle the challenge of model selection in neural networks, we adopt a continuous shrinkage prior, specifically the horseshoe prior. We introduce an innovative blocked sampling scheme, offering compatibility with both the Boomerang Sampler (a type of piecewise deterministic Markov process sampler) for exact but slower inference and with Stochastic Gradient Langevin Dynamics (SGLD) for faster yet biased posterior inference. This approach serves as a versatile tool bridging the gap between intractable likelihoods and robust Bayesian model selection within the generative modelling framework.

a football post?!

Posted in Statistics with tags , , , , , , , , , , , , , , on June 22, 2022 by xi'an

I am not interested in football, neither as a player (a primary school trauma when I was the last being picked!) or as a fan, contrary to my dad (who was a football referee in his youth) and my kids, but Gareth Roberts (University of Warwick) and Jeff Rosenthal wrote a paper on football draws for the (FIFA) World Cup, infamously playing in Qatar by the end of the year, which Gareth presented in a Warwick seminar.

For this tournament, there are 32 teams, first playing against opponent teams supposedly drawn from a uniform distribution over all draw assignments, within 8 groups of 4 teams, with constraints like 1-2 EU teams per group, 0-1 from the other regions. As done at the moment and on TV, the tournament is filled one team at time by drawing from Pot 1, then Pot 2, then Pot 3, & Pot 4. &tc.. Applying the constraints one draw at a time, conditional on the past draws and the constraints, rather obviously creates non-uniformity! Uniformity would be achievable by rejection sampling (with a success probability of 1/540!) But this is not televisesque enough…

A debiasing solution is found by using several balls for each team in the right proportion, correcting for the sequential draws. Still impractical when requiring 10¹⁴ balls…!

The fun in their paper is that the problem can be formulated as a particle filter, estimating the right probabilities by randomising the number of balls [hidden randomness] and estimating the probability for team j to be included by a few thousands draws. With some stratified sampling on the side to minimise randomness. Removing the need for the (intractable?) distribution is thus achieved by retrospective sampling, as in pseudo-marginal MCMC. Alternatively, one could swap pairs of teams by a simplistic MCMC algorithm, with no worry about stationarity and the possibility of on-screen draws. (Jeff devised a Java applet to simulate an actual draw.) Obviously, it is still a far stretch that this proposal will be implemented for the next World Cup. If so, I will watch it!

mining gold [ABC in PNAS]

Posted in Books, Statistics with tags , , , , , , , , , , , on March 13, 2020 by xi'an

Johann Brehmer and co-authors have just published a paper in PNAS entitled “Mining gold from implicit models to improve likelihood-free inference”. (Besides the pun about mining gold, the paper also involves techniques named RASCAL and SCANDAL, respectively! For Ratio And SCore Approximate Likelihood ratio and SCore-Augmented Neural Density Approximates Likelihood.) This setup is not ABC per se in that their simulator is used both to generate training data and construct a tractable surrogate model. Exploiting Geyer’s (1994) classification trick of expressing the likelihood ratio as the optimal classification ratio when facing two equal-size samples from one density and the other.

“For all these inference strategies, the augmented data is particularly powerful for enhancing the power of simulation-based inference for small changes in the parameter θ.”

Brehmer et al. argue that “the most important novel contribution that differentiates our work from the existing methods is the observation that additional information can be extracted from the simulator, and the development of loss functions that allow us to use this “augmented” data to more efficiently learn surrogates for the likelihood function.” Rather than starting from a statistical model, they also seem to use a scientific simulator made of multiple layers of latent variables z, where

x=F⁰(u⁰,z¹,θ), z¹=G¹(u¹,z²), z²=G¹(u²,z³), …

although they also call the marginal of x, p(x|θ), an (intractable) likelihood.

“The integral of the log is not the log of the integral!”

The central notion behind the improvement is a form of Rao-Blackwellisation, exploiting the simulated z‘s. Joint score functions and joint likelihood ratios are then available. Ignoring biases, the authors demonstrate that the closest approximation to the joint likelihood ratio and the joint score function that only depends on x is the actual likelihood ratio and the actual score function, respectively. Which sounds like an older EM result, except that the roles of estimate and target quantity are somehow inverted: one is approximating the marginal with the joint, while the marginal is the “best” approximation of the joint. But in the implementation of the method, an estimate of the (observed and intractable) likelihood ratio is indeed produced towards minimising an empirical loss based on two simulated samples. Learning this estimate ê(x) then allows one to use it for the actual data. It however requires fitting a new ê(x) for each pair of parameters. Providing as well an estimator of the likelihood p(x|θ). (Hence the SCANDAL!!!) A second type of approximation of the likelihood starts from the approximate value of the likelihood p(x|θ⁰) at a fixed value θ⁰ and expands it locally as an exponential family shift, with the score t(x|θ⁰) as sufficient statistic.

I find the paper definitely interesting even though it requires the representation of the (true) likelihood as a marginalisation over multiple layers of latent variables z. And does not provide an evaluation of the error involved in the process when the model is misspecified. As a minor supplementary appeal of the paper, the use of an asymmetric Galton quincunx to illustrate an intractable array of latent variables will certainly induce me to exploit it in projects and courses!

[Disclaimer: I was not involved in the PNAS editorial process at any point!]

in Bristol for the day

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , on February 28, 2020 by xi'an

I am in Bristol for the day, giving a seminar at the Department of Statistics where I had not been for quite a while (and not since the Department has moved to a beautifully renovated building). The talk is on ABC-Gibbs, whose revision is on the verge of being resubmitted. (I also hope Greta will let me board my plane tonight…)