Archive for ELFI

ABC in Svalbard [#1]

Posted in Books, Mountains, pictures, R, Running, Statistics, Travel, University life on April 13, 2021 by xi'an

It started a bit awkwardly for me as I ran late, having accidentally switched to UK time the previous evening (despite a record-breaking biking time to the University!). Then the welcome desk could not find the key to the webinar room and I ended up following the first session from my office, by myself (and my teapot)… until we managed to reunite in the said room (with an air-quality detector!).

Software sessions are rather difficult to follow on-line and I wonder what the ideal virtual version should be. We could borrow from the teaching experience newly gained over the past year, when we had to engage students without being able to roam the computer lab and look at their screens to nudge them into coding. It is however unrealistic to run a full computer lab, unless a few “guinea pigs” were selected in advance to show their progress, or lack thereof, during the session. In any case, thanks to the speakers who made the presentations of

  1. BSL(R)
  2. ELFI (Python)
  3. ABCpy (Python)

this morning/evening. (Just taking the opportunity to point out the publication of the latest version of DIYABC!).

Florence Forbes’ talk on using mixtures of experts was quite alluring (and generated online discussions during the break, recovering some of the fun of real conferences), esp. given my longtime interest in normalising flows and mixtures of regressions (and more to come as part of our biweekly reading group!). Louis talked about gaining efficiency by not resampling the entire data in large network models. Edwin Fong brought martingales and infinite-dimensional distributions to the rescue, generalising Pólya urns! And Justin Alsing discussed the advantages of estimating the likelihood rather than estimating the posterior, which sounds counterintuitive. With a return to mixtures as approximations, using instead normalising flows. With the worth-repeating message that ABC marginalises over nuisance parameters so easily! And a nice perspective on ABayesian decision, which does not occur that often in the ABC literature. Cecilia Viscardi made a link between likelihood estimation and large deviations à la Sanov, the rare event being associated with the larger distances, albeit dependent on a primary choice of the tolerance. Michael Gutmann presented an intriguing optimisation Monte Carlo approach from his AISTATS 2020 paper, the simulated parameter being defined by a fiducial inversion, reweighted by the prior times a Jacobian term, which struck me as a wee bit odd, i.e., using two distributions on θ. And Rito concluded the day by seeking approximate sufficient statistics, constructing exponential families whose components are themselves parameterised as neural networks with neural parameter ω. Leading to an unnormalised model because of the energy function, hence to the use of inference techniques on ω that do not require the normalising constant, like Gutmann & Hyvärinen (2012). And to using the (pseudo-)sufficient statistic as ABC summary statistic. Which still requires an exchange MCMC step within ABC.
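To make the worth-repeating nuisance-marginalisation point concrete, here is a small self-contained sketch (a toy Gaussian model of my own, not taken from any of the talks): the nuisance scale σ is simply drawn from its prior inside the simulator and never enters the acceptance step, so the accepted draws target the marginal posterior of θ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy illustration (my own example): data are Gaussian with mean theta
# (parameter of interest) and standard deviation sigma (nuisance). ABC
# marginalises over sigma by drawing it from its prior inside the
# simulator and never conditioning on it.

def simulate(theta, n=50):
    sigma = rng.lognormal(0.0, 0.5)          # nuisance drawn from its prior
    return rng.normal(theta, sigma, size=n)

y_obs = rng.normal(1.0, 1.2, size=50)        # pseudo-observed data
s_obs = np.array([y_obs.mean(), y_obs.std()])

accepted = []
for _ in range(20_000):
    theta = rng.uniform(-5.0, 5.0)           # prior draw for the parameter of interest
    y_sim = simulate(theta)
    s_sim = np.array([y_sim.mean(), y_sim.std()])
    if np.linalg.norm(s_sim - s_obs) < 0.3:  # tolerance on the summary distance
        accepted.append(theta)

# The accepted thetas approximate the marginal posterior of theta,
# with sigma integrated out "for free".
print(len(accepted), np.mean(accepted))
```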

Elves to the ABC rescue!

Posted in Books, Kids, Statistics on November 7, 2018 by xi'an

Marko Järvenpää, Michael Gutmann, Arijus Pleska, Aki Vehtari, and Pekka Marttinen have written a paper on Efficient Acquisition Rules for Model-Based Approximate Bayesian Computation, soon to appear in Bayesian Analysis, that gives me the right nudge to mention the ELFI software they have been contributing to for a while, the acronym standing for Engine for Likelihood-Free Inference. Written in Python, DAG-based, and covering methods like the following (see the short sketch after the list):

  • ABC rejection sampler
  • Sequential Monte Carlo ABC sampler
  • Bayesian Optimization for Likelihood-Free Inference (BOLFI) framework
  • Bayesian Optimization (not likelihood-free)
  • No-U-Turn-Sampler (not likelihood-free)

[Warning: I did not experiment with the software! Feel free to share.]
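[For illustration only, and with the same caveat of not having experimented with the software: a minimal sketch following the graph-building pattern shown in the ELFI documentation, where the toy Gaussian-mean model and all node names are my own assumptions.]

```python
import numpy as np
import elfi

# Toy Gaussian-mean model (illustrative assumption); the graph-building
# pattern follows the ELFI documentation: Prior, Simulator, Summary and
# Distance nodes, with an inference method attached to the distance node.

def simulator(mu, batch_size=1, random_state=None):
    random_state = random_state or np.random
    mu = np.atleast_1d(mu).reshape(-1, 1)            # one mean per batch member
    return random_state.normal(mu, 1.0, size=(batch_size, 30))

def mean_summary(y):
    return np.mean(y, axis=1)

y_obs = np.random.normal(1.5, 1.0, size=(1, 30))     # pseudo-observed data

mu = elfi.Prior('uniform', -5, 10)                   # U(-5, 5) prior on the mean
sim = elfi.Simulator(simulator, mu, observed=y_obs)
S = elfi.Summary(mean_summary, sim)
d = elfi.Distance('euclidean', S)

rej = elfi.Rejection(d, batch_size=10_000)           # plain ABC rejection sampler
result = rej.sample(1000, quantile=0.01)             # keep the closest 1% of draws
print(result)
```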

“…little work has focused on trying to quantify the amount of uncertainty in the estimator of the ABC posterior density under the chosen modelling assumptions. This uncertainty is due to a finite computational budget to perform the inference and could be thus also called as computational uncertainty.”

The paper is about looking at the “real” ABC distribution, that is, the one resulting, realistically, from a finite number of simulations and acceptances. By acquisition, the authors mean an efficient way to propose the next value of the parameter θ, towards minimising the uncertainty in the ABC density estimate. Note that this involves a loss function that must be chosen by the analyst and then be available to the minimisation program. If this sounds complicated…

“…our interest is to design the evaluations to minimise the uncertainty in a quantity that itself describes the uncertainty of the parameters of a costly simulation model.”

it indeed is and it requires modelling choices. As in Gutmann and Corander (2016), which was also concerned with designing the location of the learning parameters, the modelling is based here on a Gaussian process for the discrepancy between the observed and the simulated data. Which provides an estimate of the likelihood, later used for selecting the next sampling value of θ. The final ABC sample is however produced by a GP estimation of the ABC distribution. As noted by the authors, the method may prove quite time-consuming: for instance, one involved model required one minute of computation time for selecting the next evaluation location. (I had a bit of difficulty when reading the paper as I kept hitting notions that are local to the paper but not immediately or precisely defined, like “adequation function” [p.11] or “discrepancy”. Maybe correlated with short nights while staying at CIRM for the Masterclass, always waking up around 4am for unknown reasons!)
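To fix ideas, here is a schematic sketch of such a loop, with a generic lower-confidence-bound acquisition rather than the acquisition rules of the paper, and an off-the-shelf GP: the discrepancy is modelled as a GP over θ, the next evaluation point is chosen where the GP surface is low or uncertain, and the estimated ABC likelihood is the GP probability that the discrepancy falls below the tolerance ε.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)

# Schematic BOLFI-style loop (toy Gaussian model of my own, not the
# acquisition rule of the paper): model the ABC discrepancy as a GP over
# theta and pick the next evaluation point by a lower-confidence-bound rule.

y_obs = rng.normal(1.0, 1.0, size=30)

def discrepancy(theta):
    y_sim = rng.normal(theta, 1.0, size=30)
    return abs(y_sim.mean() - y_obs.mean())

thetas = list(rng.uniform(-5, 5, size=10))           # initial design
deltas = [discrepancy(t) for t in thetas]
grid = np.linspace(-5, 5, 400).reshape(-1, 1)

gp = GaussianProcessRegressor(RBF(1.0) + WhiteKernel(0.1), normalize_y=True)
for _ in range(30):                                  # sequential acquisitions
    gp.fit(np.reshape(thetas, (-1, 1)), deltas)
    mean, sd = gp.predict(grid, return_std=True)
    lcb = mean - 2.0 * sd                            # explore where the GP is low or uncertain
    t_next = float(grid[np.argmin(lcb), 0])
    thetas.append(t_next)
    deltas.append(discrepancy(t_next))

# GP-based estimate of the ABC "likelihood": P(discrepancy < eps) under the
# GP predictive, to be combined with the prior for the ABC posterior.
gp.fit(np.reshape(thetas, (-1, 1)), deltas)
mean, sd = gp.predict(grid, return_std=True)
eps = 0.1
abc_lik = norm.cdf((eps - mean) / np.maximum(sd, 1e-9))
print(grid[np.argmax(abc_lik), 0])
```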