**H**ere are my slides for the overview talk I am giving at CISEA 2019, in Abidjan, closely resembling earlier talks, except for the second slide!

## Archive for misspecified model

## postdoc position still open

Posted in pictures, Statistics, University life with tags ABC, Agence Nationale de la Recherche, ANR, approximate Bayesian inference, bois de Boulogne, La Défense, misspecified model, Paris, Paris-Saclay campus, PhD thesis, postdoctoral position, PSL Research University, Université de Montpellier, Université Paris Dauphine, University of Oxford on May 30, 2019 by xi'an

**T**he post-doctoral position supported by the ANR funding of our Paris-Saclay-Montpellier research conglomerate on approximate Bayesian inference and computation remains open for the time being. We are more particularly looking for candidates with a strong background in mathematical statistics, esp. Bayesian non-parametrics, towards the analysis of the limiting behaviour of approximate Bayesian inference. Candidates should email me (gmail address: bayesianstatistics) with a detailed vita (CV) and a motivation letter including a research plan. Letters of recommendation may also be emailed to the same address.

## robust Bayesian synthetic likelihood

Posted in Statistics with tags ABC, Australia, Bayesian synthetic likelihood, Brisbane, industrial ruins, MCMC, Melbourne, Metropolis-Hastings algorithm, misspecified model, Monash University, pseudo-likelihood, QUT, summary statistics, Sydney Harbour on May 16, 2019 by xi'an

**D**avid Frazier (Monash University) and Chris Drovandi (QUT) have recently come up with a robustness study of Bayesian synthetic likelihood that somehow mirrors our own work with David. In a sense, Bayesian synthetic likelihood is definitely misspecified from the start in assuming a Normal distribution on the summary statistics. When the data generating process is misspecified, even were the Normal distribution the “true” model or an appropriately converging pseudo-likelihood, the simulation based evaluation of the first two moments of the Normal is biased. Of course, for a choice of a summary statistic with limited information, the model can still be *weakly compatible* with the data in that there exists a pseudo-true value of the parameter θ⁰ for which the synthetic mean μ(θ⁰) is the mean of the statistics. (Sorry if this explanation of mine sounds unclear!) Or rather the Monte Carlo estimate of μ(θ⁰) coincides with that mean.

The same Normal toy example as in our paper leads to very poor performance in the MCMC exploration of the (unsympathetic) synthetic target. The robustification of the approach as proposed in the paper is to bring in an extra parameter to correct for the bias in the mean, using an additional Laplace prior on the bias to aim at sparsity. Or the same for the variance matrix towards inflating it. This over-parameterisation of the model obviously prevents the MCMC from getting stuck (when implementing a random walk Metropolis with the target as a scale).
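To make the Normal assumption and the mean correction more concrete, here is a minimal sketch of a synthetic log-likelihood with an optional mean adjustment. Everything in it is an assumption made for illustration rather than the authors' implementation: the toy simulator (a N(θ,1) model fitted to data with variance 4), the choice of summaries (sample mean and variance), the names `simulate`, `summaries`, `synthetic_loglik`, `laplace_logprior`, the scaling of the adjustment `gamma` by the simulated standard deviations, and the Laplace scale.

```python
import numpy as np

rng = np.random.default_rng(0)

def summaries(x):
    # Summary statistics: sample mean and sample variance.
    return np.array([x.mean(), x.var(ddof=1)])

def simulate(theta, n, rng):
    # Assumed model N(theta, 1): it cannot reproduce a variance far from 1.
    return rng.normal(theta, 1.0, size=n)

def synthetic_loglik(theta, s_obs, n, m, rng, gamma=None):
    # Estimate mean and covariance of the summaries from m simulated datasets.
    sims = np.array([summaries(simulate(theta, n, rng)) for _ in range(m)])
    mu = sims.mean(axis=0)
    cov = np.cov(sims, rowvar=False)
    if gamma is not None:
        # Hypothetical mean adjustment: shift component j by gamma_j
        # simulated standard deviations.
        mu = mu + np.sqrt(np.diag(cov)) * gamma
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(s_obs) * np.log(2 * np.pi) + logdet + quad)

def laplace_logprior(gamma, scale=0.5):
    # Sparsity-inducing Laplace log-prior on the adjustment (scale is assumed).
    return np.sum(-np.log(2 * scale) - np.abs(gamma) / scale)

n = 100
y = rng.normal(1.0, 2.0, size=n)  # data actually generated with variance 4
s_obs = summaries(y)

plain = synthetic_loglik(1.0, s_obs, n, m=200, rng=rng)
adjusted = synthetic_loglik(1.0, s_obs, n, m=200, rng=rng,
                            gamma=np.array([0.0, 20.0]))
```

In this toy setting the unadjusted synthetic likelihood is tiny at every θ because the simulated variances never come near the observed one, while a large adjustment on the variance component alone restores a reasonable fit. In a sampler, `gamma` would be updated jointly with θ, with `laplace_logprior` added to the log-target.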

## did variational Bayes work?

Posted in Books, Statistics with tags approximate Bayesian inference, asymptotic Bayesian methods, ICML 2018, importance sampling, misspecified model, Pareto distribution, Pareto smoothed importance sampling, posterior predictive, variational Bayes methods, what you get is what you see on May 2, 2019 by xi'an

**A**n interesting ICML 2018 paper by Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman I missed last summer on [the fairly important issue of] assessing the quality or lack thereof of a variational Bayes approximation. In the sense of being close enough to the true posterior. The criterion that they propose in this paper relates to the Pareto smoothed importance sampling technique discussed in an earlier post and which I remember discussing with Andrew when he visited CREST a few years ago. The truncation of the importance weights of prior × likelihood / VB approximation avoids infinite variance issues but induces an unknown amount of bias. The resulting diagnostic is based on the estimation of the Pareto order k. If the true value of k is less than ½, the variance of the associated Pareto distribution is finite. The paper suggests concluding that the variational approximation is worthwhile when the estimate of k is less than 0.7, based on the empirical assessment of the earlier paper. The paper also contains a remark on the poor performance of the generalisation of this method to marginal settings, that is, when the importance weight is the ratio of the true and variational marginals for a sub-vector of interest. I find this poor performance somewhat worrying in that Rao-Blackwellisation arguments make me prefer marginal ratios to joint ratios. It may however be due to a poor approximation of the marginal ratio that reflects on the approximation and not on the ratio itself. A second proposal in the paper focuses solely on the point estimate returned by the variational Bayes approximation. Testing that the posterior predictive is well-calibrated. This is less appealing, especially when the authors point out the “disadvantage is that this diagnostic does not cover the case where the observed data is not well represented by the model.” In other words, misspecified situations. This potential misspecification could presumably be tested by comparing the Pareto fit based on the actual data with a Pareto fit based on simulated data. Among other deficiencies, they point out that this is “a local diagnostic that will not detect unseen modes”. In other words, *what you get is what you see*.
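As a rough illustration of how the tail of the importance weights signals a poor approximation, the sketch below estimates a Pareto shape k from the largest weights. It deliberately swaps in the simple Hill estimator for the generalised Pareto fit used in Pareto smoothed importance sampling, so the 0.7 threshold does not transfer as is. The target N(0,1), the Normal approximations N(0, s²), the tail fraction, and the names `hill_k` and `log_ratio` are all assumptions made for the example.

```python
import numpy as np

def hill_k(log_w, tail_frac=0.01):
    # Hill estimate of the tail shape k from the largest log-weights:
    # average log-excess over the (m+1)-th largest value.
    log_w = np.sort(log_w)
    m = max(int(tail_frac * len(log_w)), 5)
    threshold = log_w[-m - 1]
    return np.mean(log_w[-m:] - threshold)

def log_ratio(x, s):
    # log of the N(0,1) density over the N(0, s^2) density at x.
    return np.log(s) - 0.5 * x**2 + 0.5 * (x / s) ** 2

rng = np.random.default_rng(1)
n = 20000

# Under-dispersed approximation (s = 0.5): heavy-tailed weights.
k_narrow = hill_k(log_ratio(rng.normal(0.0, 0.5, n), 0.5))
# Nearly correct approximation (s = 0.9): much lighter tail.
k_close = hill_k(log_ratio(rng.normal(0.0, 0.9, n), 0.9))
```

For this Gaussian pair the weight tail behaves roughly like a Pareto with shape 1 − s², so the under-dispersed approximation should return the larger estimate, which is the qualitative behaviour the diagnostic relies on.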

## absint[he] post-doc on approximate Bayesian inference in Paris, Montpellier and Oxford

Posted in Statistics with tags ABC, Agence Nationale de la Recherche, ANR, approximate Bayesian inference, bois de Boulogne, La Défense, misspecified model, Paris, Paris-Saclay campus, PhD thesis, postdoctoral position, Université de Montpellier, Université Paris Dauphine, University of Oxford on March 18, 2019 by xi'an

As a consequence of its funding by the Agence Nationale de la Recherche (ANR) in 2018, the ABSint research conglomerate is now actively recruiting a post-doctoral collaborator for up to 24 months. The acronym **ABSint** stands for Approximate Bayesian solutions for inference on large datasets and complex models. The ABSint conglomerate involves researchers located in Paris, Saclay, Montpellier, as well as Lyon, Marseille, Nice. This call seeks candidates with an excellent research record who are interested in collaborating with local researchers on approximate Bayesian techniques like ABC, variational Bayes, PAC-Bayes, Bayesian non-parametrics, scalable MCMC, and related topics. A potential direction of research would be the derivation of new Bayesian tools for model checking in such complex environments. The post-doctoral collaborator will be primarily located in Université Paris-Dauphine, with supported periods in Oxford and visits to Montpellier. No teaching duty is attached to this research position.

Applications can be submitted in either English or French. Sufficient working fluency in English is required. While mastering some French does help with daily life in France (!), it is not a prerequisite. The candidate must hold a PhD degree by the date of application (not the date of employment). Position opens on July 01, with possible accommodation for a later start in September or October.

Deadline for application is April 30 or until position filled. Estimated gross salary is around 2500 EUR, depending on experience (years) since PhD. Candidates should contact Christian Robert (gmail address: bayesianstatistics) with a detailed vita (CV) and a motivation letter including a research plan. Letters of recommendation may also be emailed to the same address.

## computational statistics and molecular simulation [18w5023]

Posted in Books, Kids, pictures, Statistics, Travel, University life with tags 18w5023, ABC, BIRS, Casa Matemática Oaxaca, CMO, computational statistics, crown of thorns, gerrymandering, HMC, killer robot, lead climbing, leapfrog integrator, Mexico, misspecified model, molecular dynamics, Monte Carlos Statistical Methods, Moreau-Yoshida, numerical integrator, overdamped Langevin algorithm, proximal optimisation, reversible jump MCMC, rock climbing, starfish, summary statistics, transferability, workshop on November 15, 2018 by xi'an

**I** truly missed the gist of the first talk of the Wednesday morning of our X fertilisation workshop by Jianfeng Lu, partly due to notations, although the topic very much correlated with my interests like path sampling, with an augmented version of HMC using an auxiliary indicator. And mentions made of BAOAB. Next, Marcello Pereyra spoke about Bayesian image analysis, with the difficulty of setting a prior on an image. In the case of astronomical images there are motivations for an L¹ penalisation sparse prior. Sampling is an issue. Moreau-Yoshida proximal optimisation is used instead, in connection with our MCMC survey published in Stats & Computing two years ago. *Transferability* was a new concept for me, as introduced by Kerrie Mengersen (QUT), to extrapolate an estimated model to another system without using the posterior as a prior. With a great interlude about the crown of thorns starfish killer robot! Rather a prior determination based on historical data, in connection with recent (2018) Technometrics and Bayesian Analysis papers towards rejecting non-plausible priors. Without reading the papers (!), and before discussing the matter with Kerrie, here or in Marseille, I wonder at which level of precision this can be conducted. The use of summary statistics for prior calibration gave the approach an ABC flavour.
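Going back to the Moreau-Yoshida step in the image analysis talk above: as I understand the idea, an L¹ prior is not differentiable at zero, which is a problem for gradient-based samplers, and its Moreau-Yoshida envelope provides a smooth surrogate whose gradient only requires the proximal operator. Here is a minimal sketch for the plain L¹ norm, where the names `prox_l1` and `grad_my_envelope` and the smoothing parameter `lam` are mine and not taken from the talk.

```python
import numpy as np

def prox_l1(x, lam):
    # Proximal operator of lam * ||x||_1, i.e. soft thresholding at lam.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def grad_my_envelope(x, lam):
    # Gradient of the Moreau-Yoshida envelope of ||.||_1 with parameter lam:
    # (x - prox(x)) / lam, defined everywhere, including at zero.
    return (x - prox_l1(x, lam)) / lam

x = np.array([-2.0, -0.05, 0.0, 0.3, 1.5])
g = grad_my_envelope(x, lam=0.1)
```

The resulting gradient equals the sign of each coordinate away from zero and interpolates linearly on [-lam, lam], so it can be plugged into a Langevin-type proposal where the raw L¹ term could not.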

The hands-on session was Jonathan Mattingly’s discussion of gerrymandering reflecting on his experience at court! Hard to beat for an engaging talk reaching between communities. As it happens I discussed the original paper last year. Of course it was much more exciting to listen to Jonathan explaining his vision of the problem! Too bad I “had” to leave before the end for a [most enjoyable] rock climbing afternoon… To be continued at the dinner table! (Plus we got the complete explanation of the term gerrymandering, including this salamander rendering of the first district identified as gerrymandered!)

## computational statistics and molecular simulation [18w5023]

Posted in Statistics with tags 18w5023, BIRS, Casa Matemática Oaxaca, CMO, computational statistics, HMC, leapfrog integrator, Mexico, misspecified model, molecular dynamics, Monte Carlos Statistical Methods, numerical integrator, overdamped Langevin algorithm, reversible jump MCMC, workshop on November 13, 2018 by xi'an

**T**his X fertilisation workshop Gabriel Stolz, Luke Bornn and myself organised towards reinforcing the interface between molecular dynamics and Monte Carlo statistical methods has now started! At the Casa Matemática Oaxaca, the Mexican campus of BIRS, which is currently housed by a very nice hotel on the heights of Oaxaca. And after a fairly long flight for a large proportion of the participants. On the first day, Arthur Voter gave a fantastic “hands-on” review of molecular dynamics for material sciences, which was aimed at the statistician side of the audience and most helpful in my own understanding of the concepts and techniques at the source of HMC and PDMP algorithms. (Although I could not avoid a few mini dozes induced by jetlag.) Including the BAOAB version of HMC, which sounded to me like an improvement to investigate. The part on metastability, completed by a talk by Florian Maire, remained a wee bit mysterious [to me].
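As a reminder of what the BAOAB scheme mentioned above looks like, here is a minimal sketch of one step. BAOAB is usually presented as a splitting integrator for (underdamped) Langevin dynamics: a half kick (B), a half drift (A), an exact Ornstein-Uhlenbeck refresh of the momentum (O), then A and B again. The unit mass, the quadratic potential, the step size `h`, the friction `gamma`, and the function name `baoab_step` are assumptions chosen for the example, not details from the talk.

```python
import numpy as np

def baoab_step(q, p, grad_u, h, gamma, rng, kT=1.0):
    # One BAOAB step for Langevin dynamics with unit mass.
    p = p - 0.5 * h * grad_u(q)          # B: half kick from the force
    q = q + 0.5 * h * p                  # A: half drift of the position
    c = np.exp(-gamma * h)               # O: exact OU update of the momentum
    p = c * p + np.sqrt((1.0 - c**2) * kT) * rng.standard_normal(np.shape(q))
    q = q + 0.5 * h * p                  # A: second half drift
    p = p - 0.5 * h * grad_u(q)          # B: second half kick
    return q, p

def grad_u(q):
    # Gradient of U(q) = q^2 / 2, so the target for q is a standard Normal.
    return q

rng = np.random.default_rng(2)
q, p = np.zeros(1), np.zeros(1)
draws = np.empty(50000)
for i in range(draws.size):
    q, p = baoab_step(q, p, grad_u, h=0.2, gamma=1.0, rng=rng)
    draws[i] = q[0]
```

No Metropolis correction is applied here, so the draws target the Normal only up to discretisation error, which for this quadratic potential appears to be very small in the position marginal.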

The shorter talks of the day all brought new perspectives and information to me (although they were definitely more oriented towards their “own” side of the audience than the hands-on lecture). For instance, Jesús María Sanz-Serna gave a wide-ranging overview of numerical integrators and Tony Lelièvre presented a recent work on simulating measures supported by manifolds via an HMC technique constantly projecting over the manifold, with proper validation. (I had struggled with the paper this summer and this talk helped a lot.) There was a talk by Josh Fash on simulating implicit solvent models that mixed high-level programming and reversible jump MCMC, with an earlier talk by Yong Chen on variable dimension hidden Markov models that could have also alluded to reversible jump. Angela Bito talked about using ASIS (Ancillarity-sufficiency interweaving strategy) for improving the dynamics of an MCMC sampler associated with a spike & slab prior, the recentering-decentering cycle being always a sort of mystery to me [as to why it works better despite introducing multimodality in this case], and Gael Martin presented some new results on her on-going work with David Frazier about approximate Bayes with misspecified models, with the summary statistic being a score function that relates the work to the likelihood free approach of Bissiri et al.