## probabilistic programming au collège [de France]

Posted in Statistics with tags , , , , , , on June 24, 2022 by xi'an

## mining gold [ABC in PNAS]

Posted in Books, Statistics with tags , , , , , , , , , , , on March 13, 2020 by xi'an

Johann Brehmer and co-authors have just published a paper in PNAS entitled “Mining gold from implicit models to improve likelihood-free inference”. (Besides the pun about mining gold, the paper also involves techniques named RASCAL and SCANDAL, respectively! For Ratio And SCore Approximate Likelihood ratio and SCore-Augmented Neural Density Approximates Likelihood.) This setup is not ABC per se in that their simulator is used both to generate training data and construct a tractable surrogate model. Exploiting Geyer’s (1994) classification trick of expressing the likelihood ratio as the optimal classification ratio when facing two equal-size samples from one density and the other.

“For all these inference strategies, the augmented data is particularly powerful for enhancing the power of simulation-based inference for small changes in the parameter θ.”

Brehmer et al. argue that “the most important novel contribution that differentiates our work from the existing methods is the observation that additional information can be extracted from the simulator, and the development of loss functions that allow us to use this “augmented” data to more efficiently learn surrogates for the likelihood function.” Rather than starting from a statistical model, they also seem to use a scientific simulator made of multiple layers of latent variables z, where

x=F⁰(u⁰,z¹,θ), z¹=G¹(u¹,z²), z²=G¹(u²,z³), …

although they also call the marginal of x, p(x|θ), an (intractable) likelihood.

“The integral of the log is not the log of the integral!”

The central notion behind the improvement is a form of Rao-Blackwellisation, exploiting the simulated z‘s. Joint score functions and joint likelihood ratios are then available. Ignoring biases, the authors demonstrate that the closest approximation to the joint likelihood ratio and the joint score function that only depends on x is the actual likelihood ratio and the actual score function, respectively. Which sounds like an older EM result, except that the roles of estimate and target quantity are somehow inverted: one is approximating the marginal with the joint, while the marginal is the “best” approximation of the joint. But in the implementation of the method, an estimate of the (observed and intractable) likelihood ratio is indeed produced towards minimising an empirical loss based on two simulated samples. Learning this estimate ê(x) then allows one to use it for the actual data. It however requires fitting a new ê(x) for each pair of parameters. Providing as well an estimator of the likelihood p(x|θ). (Hence the SCANDAL!!!) A second type of approximation of the likelihood starts from the approximate value of the likelihood p(x|θ⁰) at a fixed value θ⁰ and expands it locally as an exponential family shift, with the score t(x|θ⁰) as sufficient statistic.

I find the paper definitely interesting even though it requires the representation of the (true) likelihood as a marginalisation over multiple layers of latent variables z. And does not provide an evaluation of the error involved in the process when the model is misspecified. As a minor supplementary appeal of the paper, the use of an asymmetric Galton quincunx to illustrate an intractable array of latent variables will certainly induce me to exploit it in projects and courses!

[Disclaimer: I was not involved in the PNAS editorial process at any point!]

## a quincunx on NBC

Posted in Books, Kids, pictures, Statistics with tags , , , , , , , , , , on December 3, 2017 by xi'an

Through Five-Thirty-Eight, I became aware of a TV game call The Wall [so appropriate for Trumpian times!] that is essentially based on Galton’s quincunx! A huge [15m!] high version of Galton’s quincunx, with seven possible starting positions instead of one, which kills the whole point of the apparatus which is to demonstrate by simulation the proximity of the Binomial distribution to the limiting Normal (density) curve.

But the TV game has obvious no interest in the CLT, or in the Beta binomial posterior, only in a visible sequence of binary events that turn out increasing or decreasing the money “earned” by the player, the highest sums being unsurprisingly less likely. The only decision made by the player is to pick one of the seven starting points (meaning the outcome should behave like a weighted sum of seven Normals with drifted means depending on the probabilities of choosing these starting points). I found one blog entry analysing an “idiot” strategy of playing the game, but not the entire game. (Except for this entry on the older Plinko.) And Five-Thirty-Eight surprisingly does not get into the optimal strategies to play this game (maybe because there is none!). Five-Thirty-Eight also reproduces the apocryphal quote of Laplace not requiring this [God] hypothesis.

[Note: When looking for a picture of the Quincunx, I also found this desktop version! Which “allows you to visualize the order embedded in the chaos of randomness”, nothing less. And has even obtain a patent for this “visual aid that demonstrates [sic] a random walk and generates [re-sic] a bell curve distribution”…]