Archive for likelihood-free inference

the new DIYABC-RF

Posted in Books, pictures, R, Statistics, Wines on April 15, 2021 by xi'an

My friends and co-authors from Montpellier released last month the third version of the DIYABC software, DIYABC-RF, which includes and promotes the use of random forests for parameter inference and model selection, in connection with Louis Raynal’s thesis. Like the earlier versions of DIYABC, it is intended for population genetic applications. Bienvenue!!!

The software DIYABC Random Forest (hereafter DIYABC-RF) v1.0 is composed of three parts: the dataset simulator, the Random Forest inference engine and the graphical user interface. The whole is packaged as a standalone and user-friendly graphical application named DIYABC-RF GUI and available at The different developer and user manuals for each component of the software are available on the same website. DIYABC-RF is a multithreaded software on three operating systems: GNU/Linux, Microsoft Windows and MacOS. The program can be used through a modern and user-friendly graphical interface designed as an R shiny application (Chang et al. 2019). For a fluid and simplified user experience, this interface is available through a standalone application, which does not require installing R or any dependencies and hence can be used independently. The application is also implemented in an R package providing a standard shiny web application (with the same graphical interface) that can be run locally as any shiny application, or hosted as a web service to provide a DIYABC-RF server for multiple users.

marginal likelihood as exhaustive X validation

Posted in Statistics on October 9, 2020 by xi'an

In the June issue of Biometrika (for which I am deputy editor) Edwin Fong and Chris Holmes have a short paper (that I did not process!) on the validation of the marginal likelihood as the unique coherent updating rule. Marginal in the general sense of Bissiri et al. (2016). Coherent in the sense of being invariant to the order of input of exchangeable data, if in a somewhat self-defining version (Definition 1). As a consequence, marginal likelihood arises as the unique prequential scoring rule under coherent belief updating in the Bayesian framework. (It is unique given the prior or its generalisation, obviously.)

“…we see that 10% of terms contributing to the marginal likelihood come from out-of-sample predictions, using on average less than 5% of the available training data.”

The paper also contains the interesting remark that the log marginal likelihood is the average leave-p-out X-validation score, across all values of p. Which shows that, provided the marginal can be approximated, the X validation assessment is feasible. Which leads to a highly relevant (imho) spotlight on how this expresses the (deadly) impact of the prior selection on the numerical value of the marginal likelihood. Leaving out some of the least informative terms in the X-validation leads to exactly the log geometric intrinsic Bayes factor of Berger & Pericchi (1996). Most interesting connection with the Bayes factor community but one that depends on the choice of the dismissed fraction of p’s.
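As a quick numerical illustration of this remark, here is a minimal sketch checking the identity on a toy example. It assumes a Beta-Bernoulli model (my choice, not taken from the paper) so that both the marginal likelihood and the posterior predictive are available in closed form, and it reads the identity as the sum over p of the leave-p-out log predictive scores, each one averaged over all held-out sets of size p and over the p held-out points. This form follows from the chain rule applied to every ordering of exchangeable data. All function names are hypothetical.

```python
from itertools import combinations
from math import comb, lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(y, a=1.0, b=1.0):
    """Log marginal likelihood of a 0/1 sequence under a Beta(a, b) prior."""
    s, n = sum(y), len(y)
    return log_beta(a + s, b + n - s) - log_beta(a, b)


def log_predictive(y_new, train, a=1.0, b=1.0):
    """Log posterior predictive of one 0/1 point given the training points."""
    p_one = (a + sum(train)) / (a + b + len(train))
    return log(p_one if y_new == 1 else 1.0 - p_one)


def leave_p_out_score(y, p, a=1.0, b=1.0):
    """Exhaustive leave-p-out score: average over all held-out sets of size p
    of the mean log predictive of the held-out points."""
    n = len(y)
    total = 0.0
    for held_out in combinations(range(n), p):
        held = set(held_out)
        train = [y[i] for i in range(n) if i not in held]
        total += sum(log_predictive(y[j], train, a, b) for j in held_out) / p
    return total / comb(n, p)


def cumulative_cv(y, a=1.0, b=1.0):
    """Sum of the leave-p-out scores over p = 1, ..., n."""
    return sum(leave_p_out_score(y, p, a, b) for p in range(1, len(y) + 1))


y = [1, 0, 1, 1, 0, 1]
print(log_marginal(y), cumulative_cv(y))  # the two values should agree
```

The p = n term uses an empty training set, that is, the prior predictive alone, which is one way to see how strongly the prior weighs on the final value.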

focused Bayesian prediction

Posted in Books, pictures, Statistics, Travel, University life on June 3, 2020 by xi'an

In this fourth session of our One World ABC Seminar, my friend and coauthor Gael Martin gave an after-dinner talk on focused Bayesian prediction, more in the spirit of Bissiri et al. than following a traditional ABC approach, because, along with Ruben Loaiza-Maya and [my friend and coauthor] David Frazier, they consider the possibility of a (mild?) misspecification of the model. Thus using scoring rules à la Gneiting and Raftery. Gael had in fact presented an earlier version at our workshop in Oaxaca, in November 2018. As in other solutions of that kind, there is a difficulty in weighting the score into a distribution. Although asymptotically irrelevant, the weighting has a direct impact on the current predictions, at least for the early dates in the time series… Further calibration of the set of interest A. Or the focus of the prediction. As a side note, the talk perfectly fits the One World likelihood-free seminar as it does not use the likelihood function!

“The very premise of this paper is that, in reality, any choice of predictive class is such that the truth is not contained therein, at which point there is no reason to presume that the expectation of any particular scoring rule will be maximized at the truth or, indeed, maximized by the same predictive distribution that maximizes a different (expected) score.”

This approach requires the proxy class to be close enough to the true data generating model. Or in the words of the authors to be plausible predictive models. And to produce the true distribution via the score as it is proper. Or the closest to the true model in the misspecified family. I thus wonder at a possible extension with a non-parametric version, the prior being thus on functionals rather than parameters, if I understand properly the meaning of Π(Pθ). (Could the score function be misspecified itself?!) Since the score is replaced with its empirical version, the implementation resorts to off-the-shelf MCMC. (I wondered for a few seconds if the approach could be seen as a pseudo-marginal MCMC but the estimation is always based on the same observed sample, hence does not directly fit the pseudo-marginal MCMC framework.)
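To make the "empirical score plus off-the-shelf MCMC" idea concrete, here is a rough sketch of a score-based update, and not the authors' implementation. Everything specific in it is an assumption of mine: a Gaussian predictive class N(θ, 1), the CRPS as proper scoring rule (in its closed form for a Gaussian predictive), a flat prior, a weight w = 1 on the score, and a plain random-walk Metropolis sampler. The target is proportional to exp{w × sum of scores}, with the score positively oriented (minus the CRPS).

```python
import math
import random


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) predictive at observation y
    (negatively oriented: smaller is better)."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * normal_cdf(z) - 1.0)
                    + 2.0 * normal_pdf(z) - 1.0 / math.sqrt(math.pi))


def log_score_posterior(theta, data, w=1.0):
    """Log of the score-based pseudo-posterior under a flat prior (assumed):
    w times the sum of positively oriented scores, i.e. minus the CRPS."""
    return -w * sum(crps_gaussian(theta, 1.0, y) for y in data)


def random_walk_metropolis(log_target, init, n_iter=20000, step=0.8, seed=0):
    """Plain random-walk Metropolis with Gaussian proposals."""
    rng = random.Random(seed)
    theta, logp = init, log_target(init)
    draws = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        logp_prop = log_target(proposal)
        # Accept with probability min(1, target ratio).
        if rng.random() < math.exp(min(0.0, logp_prop - logp)):
            theta, logp = proposal, logp_prop
        draws.append(theta)
    return draws


data = [1.0, 1.5, 2.0, 2.5, 3.0]  # toy observations, symmetric around 2
draws = random_walk_metropolis(lambda t: log_score_posterior(t, data), init=0.0)
kept = draws[2000:]  # drop an arbitrary burn-in
print(sum(kept) / len(kept))
```

Changing w rescales the spread of the resulting distribution without moving its mode, which is one way to see why the weighting matters for the current predictions.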

[Notice: Next talk in the series is tomorrow, 11:30am GMT+1.]

ABC in Svalbard [news #1]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on March 23, 2020 by xi'an

We [Julien and myself] are quite pleased to announce that

  • the scientific committee for the workshop has been gathered
  • the webpage for the workshop is now on-line (with a wonderful walrus picture whose author we alas cannot identify)
  • the workshop is now endorsed by both IMS and ISBA, which will handle registration (to open soon)
  • the reservation of hotel rooms will be handled by Hurtigruten Svalbard through the above webpage (this is important as we already paid a deposit for a certain number of rooms)
  • we are definitely seeking both sponsors and organisers of mirror workshops in more populated locations

As an item of trivia, let me recall that Svalbard stands for the archipelago, while Spitsbergen is the name of the main island, where Longyearbyen is located. (In Icelandic, Svalbarði means cold rim or cold coast.)


Posted in Statistics on January 24, 2020 by xi'an

On 26 and 27 March 2020, the maths department of the University of Rouen, Normandy, France, organizes a (free) workshop on mixture distributions, with the following speakers:

    • Christophe Biernacki  (Laboratoire Paul Painlevé, Univ. Lille 1 et INRIA)
    • Vincent Brault (Laboratoire Jean Kuntzmann, Univ. Grenoble Alpes)
    • Gilles Celeux  (Laboratoire de Mathématiques d’Orsay, Univ. Paris Sud et INRIA)
    • Elisabeth Gassiat  (Laboratoire de Mathématiques d’Orsay, Univ. Paris Sud)
    • Van Hà Hoang  (Laboratoire de Mathématique Raphaël Salem, Univ. Rouen Normandie)
    • Hajo Holzmann  (Philipps-University Marburg, Germany)
    • Dimitri Karlis  (Department of Statistics, Athens University of Economics and Business, Greece)
    • Trung Tin Nguyen (LMNO, Univ. Caen Normandie)
    • Andrea Rau  (Département de Génétique Animale, INRA, Jouy en Josas)
    • Pierre Vandekerkhove  (Laboratoire d’Analyse et de Mathématiques Appliquées, Univ. Paris-Est Marne-la-Vallée)
    • Cinzia Viroli  (Department of Statistical Sciences, Universita di Bologna, Italia)

Unfortunately, since this is my former department, I will not be able to attend as I am taking part in the SIAM Conference on Uncertainty Quantification (UQ20), on the very same days. In a session on likelihood-free inference.