**T**his post is a very preliminary announcement that Jukka Corander, Judith Rousseau and myself are planning an ABC in Svalbard workshop in 2021, on 12-13 April, following the “ABC in…” franchise that started in 2009 in Paris… It would be great to hear expressions of interest from potential participants so that we can scale the booking accordingly. (While this is a sequel to the highly productive ABCruise of two years ago, between Helsinki and Stockholm, the meeting will take place in Longyearbyen, Svalbard, and participants will have to fly there from either Oslo or Tromsø, Norway, as boat cruises from Iceland or Greenland start later in the year. Note also that in mid-April, being 80° North, Svalbard enjoys more than 18 hours of sunlight and that the average temperature last April was -3.9°C with a high of 4°C.) The scientific committee should be constituted very soon, but we already welcome proposals for sessions (and sponsoring, quite obviously!).

## Archive for ABC in Helsinki

## ABC in Ed’burgh

Posted in Mountains, pictures, Running, Statistics, Travel, University life with tags ABC, ABC in Edinburgh, ABC in Helsinki, ABC'ory, Arthur's Seat, cruise, Edinburgh, Finland, Scotland, workshop on June 28, 2018 by xi'an

**A** glorious day for this new edition of the “ABC in…” workshops, in the capital city of Edinburgh! I very much enjoyed this ABC day, which demonstrated that ABC is still alive and kicking, i.e., enjoying plenty of new developments and reinterpretations, with more talks and posters on the way during the main ISBA 2018 meeting. (All nine talks are available on the webpage of the conference.)

After Michael Gutmann’s tutorial on ABC, Gael Martin (Monash) presented her recent work with David Frazier, Ole Maneesoonthorn, and Brendan McCabe on ABC for prediction. Maybe unsurprisingly, Bayesian consistency for the given summary statistics is a sufficient condition for concentration of the ABC predictor. But ABC seems to do better for the prediction problem than for parameter estimation, not losing to exact Bayesian inference, possibly because in essence the summary statistics there need not be of a large dimension to be consistent. The following talk by Guillaume Kon Kam King was also about prediction, for the specific problem of gas offer, with a latent Wright-Fisher point process in the model. He used a population ABC solution to handle this model.

Alexander Buchholz (CREST) introduced an ABC approach with quasi-Monte Carlo steps that helps in reducing the variability and hence improves the approximation in ABC. He also looked at a Negative Geometric variant of regular ABC, running a random number of proposals until reaching a given number of acceptances, which, while more costly, produces more stability.
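
To make the stopping rule concrete, here is a minimal sketch of rejection ABC that keeps proposing until a fixed number of acceptances is reached. It is not the code from the talk and involves no quasi-Monte Carlo step. The Gaussian toy model, the uniform prior, the tolerance `eps`, and all function names are assumptions made for illustration only.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy model (assumed for illustration): n draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_fixed_acceptances(y_obs, n_accept, eps, rng):
    """Propose from the prior until n_accept parameter values are accepted.

    The number of proposals is random here, whereas plain rejection ABC
    fixes the number of proposals and lets the number of acceptances vary.
    """
    s_obs = statistics.fmean(y_obs)  # summary statistic: the sample mean
    accepted = []
    n_proposals = 0
    while len(accepted) < n_accept:
        theta = rng.uniform(-5.0, 5.0)  # assumed prior: Uniform(-5, 5)
        n_proposals += 1
        s_sim = statistics.fmean(simulate(theta, len(y_obs), rng))
        if abs(s_sim - s_obs) <= eps:  # accept when summaries are close
            accepted.append(theta)
    return accepted, n_proposals


rng = random.Random(0)
y_obs = simulate(1.5, 50, rng)
sample, n_proposals = abc_fixed_acceptances(y_obs, n_accept=200, eps=0.1, rng=rng)
print(len(sample), n_proposals, statistics.fmean(sample))
```

The sample size is then known in advance, at the price of a random (and possibly large) simulation budget, which is the cost versus stability trade-off mentioned above.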

Other talks by Trevelyan McKinley, Marko Järvenpää, Matt Moores (Warwick), and Chris Drovandi (QUT) illustrated the need for substitute models as a first step, and not solely via Gaussian processes, with for instance the new notion of a loss function to evaluate this approximation. Chris made a case in favour of synthetic likelihood vs ABC approaches, due to the degradation of the performances of nonparametric density estimation with the dimension. But I remain a doubting Thomas [Bayes] on that point, as high dimensions in the data or the summary statistics are not necessarily the issue, as also addressed in the paper on ABC-CDE discussed in a recent post. Meanwhile, synthetic likelihood requires estimating a mean function and a covariance function of the parameter, both of the dimension of the summary statistic, even though they are estimated by simulation.
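
For readers unfamiliar with the idea, the following is a rough sketch of a Gaussian synthetic log-likelihood: at a given parameter value, the mean and covariance of the summary statistic are estimated from simulations and plugged into a normal density evaluated at the observed summary. The toy model, the choice of two summaries, and the function names are assumptions for illustration, not anything presented in the talks.

```python
import math
import random
import statistics


def simulate(theta, n, rng):
    """Toy model (assumed for illustration): n draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summaries(y):
    """Two illustrative summaries: sample mean and log sample variance."""
    return (statistics.fmean(y), math.log(statistics.variance(y)))


def synthetic_loglik(theta, s_obs, n_obs, n_sim, rng):
    """Bivariate normal log-density of s_obs, with the mean and covariance
    of the summaries at theta both estimated from n_sim simulations."""
    sims = [summaries(simulate(theta, n_obs, rng)) for _ in range(n_sim)]
    m1 = statistics.fmean([s[0] for s in sims])
    m2 = statistics.fmean([s[1] for s in sims])
    # Unbiased sample covariance matrix [[v11, v12], [v12, v22]]
    v11 = sum((s[0] - m1) ** 2 for s in sims) / (n_sim - 1)
    v22 = sum((s[1] - m2) ** 2 for s in sims) / (n_sim - 1)
    v12 = sum((s[0] - m1) * (s[1] - m2) for s in sims) / (n_sim - 1)
    det = v11 * v22 - v12 ** 2
    d1, d2 = s_obs[0] - m1, s_obs[1] - m2
    # Quadratic form d' V^{-1} d, using the closed-form 2x2 inverse
    quad = (v22 * d1 * d1 - 2.0 * v12 * d1 * d2 + v11 * d2 * d2) / det
    return -0.5 * (2.0 * math.log(2.0 * math.pi) + math.log(det) + quad)


rng = random.Random(1)
y_obs = simulate(1.5, 50, rng)
s_obs = summaries(y_obs)
ll_true = synthetic_loglik(1.5, s_obs, n_obs=50, n_sim=200, rng=rng)
ll_far = synthetic_loglik(0.0, s_obs, n_obs=50, n_sim=200, rng=rng)
print(ll_true, ll_far)
```

With a summary of dimension d, the covariance alone has d(d+1)/2 free entries to estimate at each parameter value, which is the cost alluded to above.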

Another neat feature of the day was a special session on cosmostatistics with talks by Emille Ishida and Jessica Cisewski, from explaining how ABC was starting to make an impact on cosmo- and astro-statistics, to the special example of the stellar initial mass distribution in clusters.

The call is now open for the next “ABC in”! Note that, while these workshops have often been formally sponsored by ISBA and its BayesComp section, they are not managed by a society or a board of administrators, and hence are not much constrained by a specific format. It would just be nice to keep the low fees as part of the tradition.

## ABC gas

Posted in pictures, Running, Travel with tags ABC, ABC in Helsinki, brands, Finland, gas station, Helsinki, Munkkiniemen, tramways on August 9, 2017 by xi'an

## art brut

Posted in pictures, Travel with tags ABC, ABC in Helsinki, Baltic Sea, boat, cruise, Finland on June 4, 2016 by xi'an

## ABC random forests for Bayesian parameter inference

Posted in Books, Kids, R, Statistics, Travel, University life, Wines with tags ABC approximation error, ABC in Helsinki, abcrf, ABCruise, arXiv, Baltic Sea, Bayesian inference, Gulf of Bothnia, Helsinki, Lapin Kulta, out-of-bag correction, R, random forests, reference table, sunrise on May 20, 2016 by xi'an

**B**efore leaving Helsinki, we arXived [from the Air France lounge!] the paper Jean-Michel presented on Monday at ABCruise in Helsinki. This paper summarises the experiments Louis conducted over the past months to assess the great performances of a random forest regression approach to ABC parameter inference, thus validating in this experimental sense the use of this new approach to conducting ABC for Bayesian inference by random forests. (And not ABC model choice as in the Bioinformatics paper with Pierre Pudlo and others.)

I think the major incentives in exploiting the (still mysterious) tool of random forests [against more traditional ABC approaches like Fearnhead and Prangle (2012) on summary selection] are that (a) forests do not require a preliminary selection of the summary statistics, since an arbitrary number of summaries can be used as input for the random forest, even when including a large number of useless white noise variables; (b) there is no longer a tolerance level involved in the process, since the many trees in the random forest define a natural if rudimentary distance that corresponds to being or not being in the same leaf as the observed vector of summary statistics η(y); (c) the size of the reference table simulated from the prior (predictive) distribution does not need to be as large as in usual ABC settings, and hence this approach leads to significant gains in computing time since the production of the reference table usually is the costly part! To the point that deriving a different forest for each univariate transform of interest is truly a minor drag in the overall computing cost of the approach.

An intriguing point we uncovered through Louis’ experiments is that an unusual version of the variance estimator is preferable to the standard estimator: we indeed exposed better estimation performances when using a weighted version of the out-of-bag residuals (which are computed as the differences between the simulated value of the parameter transforms and their expectation obtained by removing the random trees involving this simulated value). Another intriguing feature [to me] is that the regression weights as proposed by Meinshausen (2006) are obtained as an average of the inverse of the number of terms in the leaf of interest. When estimating the posterior expectation of a transform h(θ) given the observed η(y), this summary statistic η(y) ends up in a given leaf for each tree in the forest and all that matters for computing the weight is the number of points from the reference table ending up in this very leaf. I do find this difficult to explain when comparing the case when many simulated points are in the leaf with the case when a single simulated point makes up the leaf. This single point ends up being much more influential than all the points in the other situation… While being an outlier of sorts against the prior simulation. But now that I think more about it (after an expensive Lapin Kulta beer in the Helsinki airport while waiting for a change of tire on our airplane!), it somewhat makes sense that rare simulations that agree with the data should be weighted much more than values that stem from the prior simulations and hence do not convey much of the information brought by the observation. (If this sounds murky, blame the beer.) What I found great about this new approach is that it produces a non-parametric evaluation of the cdf of the quantity of interest h(θ) at no calibration cost or hardly any. (An R package is in the making, to be added to the existing R functions of abcrf we developed for the ABC model choice paper.)
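
As a toy illustration of these leaf-based weights, the sketch below takes the leaf memberships as given (no forest is actually grown) and computes, for each reference-table point, the average over trees of one over the size of the leaf containing η(y). It is a simplification and not the abcrf implementation: bootstrap resampling within trees is ignored, and the leaf labels, the values of h(θ), and the function names are all made up for the example.

```python
def forest_weights(ref_leaves, obs_leaves):
    """Leaf-based regression weights, averaged over trees.

    ref_leaves[b][i] is the leaf label of reference point i in tree b and
    obs_leaves[b] is the leaf of the observed summary in tree b. Each point
    sharing the observed leaf in tree b receives 1 / (leaf size), and the
    contributions are averaged over the trees.
    """
    n_trees = len(ref_leaves)
    weights = [0.0] * len(ref_leaves[0])
    for leaves, obs_leaf in zip(ref_leaves, obs_leaves):
        members = [i for i, leaf in enumerate(leaves) if leaf == obs_leaf]
        for i in members:
            weights[i] += 1.0 / (len(members) * n_trees)
    return weights


def weighted_cdf(values, weights, t):
    """Weighted empirical cdf of h(theta) evaluated at t."""
    return sum(w for v, w in zip(values, weights) if v <= t)


# Hypothetical reference table: five simulated values of h(theta)
h_values = [0.1, 0.4, 0.5, 0.9, 1.3]
# Hypothetical leaf labels of those five points in three trees
ref_leaves = [
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 2],  # the last point sits alone in leaf 2
    [0, 1, 1, 2, 2],
]
obs_leaves = [1, 2, 1]  # leaf reached by the observed summary in each tree

weights = forest_weights(ref_leaves, obs_leaves)
post_mean = sum(w * h for w, h in zip(weights, h_values))
print(weights, post_mean, weighted_cdf(h_values, weights, 0.5))
```

In the second tree the last point is alone in the observed leaf and collects that tree's whole share of 1/3, while the three points sharing the observed leaf in the first tree get 1/9 each, which is the asymmetry discussed above.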