
ABCDE for approximate Bayesian conditional density estimation

Posted in Books, pictures, Statistics, Travel, University life on February 26, 2018 by xi'an

Another arXived paper I surprisingly (?) missed, by George Papamakarios and Iain Murray, on an ABCDE (my acronym!) substitute for ABC in generative models. The paper was reviewed [with reviews made available!] and accepted by NIPS 2016. (Most obviously, I was not one of the reviewers!)

“Conventional ABC algorithms such as the above suffer from three drawbacks. First, they only represent the parameter posterior as a set of (possibly weighted or correlated) samples [for which] it is not obvious how to perform some other computations using samples, such as combining posteriors from two separate analyses. Second, the parameter samples do not come from the correct Bayesian posterior (…) Third, as the ε-tolerance is reduced, it can become impractical to simulate the model enough times to match the observed data even once [when] simulations are expensive to perform”

The above criticisms are a wee bit overly harsh as, well…, Monte Carlo approximations remain a solution worth considering for all Bayesian purposes! And the approximation in ABC [replacing the data with a ball] is simply replaced here with another approximation, of the true posterior by a mixture, both requiring repeated [and likely expensive] simulations. The alternative proposed in the paper consists in iteratively simulating from pseudo-predictives towards learning better pseudo-posteriors, which are then used as new proposals at the next iteration, modulo an importance sampling correction. The approximation to the posterior chosen therein is a mixture density network, namely a mixture distribution whose parameters are the outputs of neural networks fed with the simulated pseudo-observations. Which the authors claim [p.4] requires no tuning. (Still, there are several aspects to tune, from the number of components to the hyper-parameter λ [p.11, eqn (35)], to the structure of the neural network [20 tanh? 50 tanh?], to the number of iterations, to the amount of X checking. As usual in NIPS papers, it is difficult to assess how arbitrary the choices made in the experiments are. Unless one starts experimenting with the codes provided.) All in all, I find the paper nonetheless exciting enough (!) to now start a summer student project on it at Dauphine, in the hope of checking the performances of ABCDE on different models, as well as of comparing this ABC implementation with a synthetic likelihood version.
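To make the notion of a mixture density network a bit more concrete, here is a minimal NumPy sketch of the forward pass only, for a scalar parameter θ: a single hidden layer of tanh units maps a pseudo-observation x to the weights, means and standard deviations of a Gaussian mixture q(θ|x). This is not the authors' code: the function names (`init_mdn`, `mdn_params`, `mdn_log_density`), the random initialisation, and the restriction to scalar θ are all assumptions made for illustration, and neither the training step nor the importance sampling correction is included.

```python
import numpy as np

def init_mdn(n_in, n_hidden, n_comp, rng):
    """Small random weights for a one-hidden-layer MDN (illustrative choice)."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        # three output blocks: mixture logits, means, log standard deviations
        "W2": rng.normal(0.0, 0.1, (n_hidden, 3 * n_comp)),
        "b2": np.zeros(3 * n_comp),
    }

def mdn_params(net, x):
    """Map pseudo-observations x of shape (n, n_in) to mixture parameters."""
    h = np.tanh(x @ net["W1"] + net["b1"])           # hidden tanh layer
    out = h @ net["W2"] + net["b2"]
    logits, means, log_sds = np.split(out, 3, axis=1)
    logits = logits - logits.max(axis=1, keepdims=True)  # stable softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights, means, np.exp(log_sds)

def mdn_log_density(net, x, theta):
    """log q(theta | x) for scalar theta of shape (n,)."""
    w, mu, sd = mdn_params(net, x)
    z = (theta[:, None] - mu) / sd
    log_comp = np.log(w) - 0.5 * z**2 - np.log(sd) - 0.5 * np.log(2 * np.pi)
    m = log_comp.max(axis=1, keepdims=True)          # log-sum-exp over components
    return (m + np.log(np.exp(log_comp - m).sum(axis=1, keepdims=True)))[:, 0]

# Usage: an untrained network with 20 tanh units and 3 components,
# evaluated at one (made-up) two-dimensional pseudo-observation.
rng = np.random.default_rng(0)
net = init_mdn(n_in=2, n_hidden=20, n_comp=3, rng=rng)
x_obs = np.array([[0.3, -1.2]])
weights, means, sds = mdn_params(net, x_obs)
```

In an actual fit, the weights in `net` would be learned by maximising the average of `mdn_log_density` over simulated (θ, x) pairs, which is where the number of components and the network structure mentioned above come in as tuning choices.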

As an addendum, let me point out the very pertinent analysis of this paper by Dennis Prangle, 18 months ago!