**I**n the most recent Bayesian Analysis, Marko Järvenpää et al. (including my coauthor Aki Vehtari) consider an ABC setting where the number of available simulations of pseudo-samples is limited. And where they want to quantify the amount of uncertainty resulting from the estimation of the ABC posterior density. Which is a version of the Monte Carlo error in practical ABC, in that this is the difference between the ABC posterior density for a given choice of summaries and a given choice of tolerance, and the actual approximation based on a finite number of simulations from the prior predictive. As in earlier works by Michael Gutmann and co-authors, the focus is on designing a sequential strategy to decide where to sample the next parameter value towards minimising a certain expected loss. And on adopting a Gaussian process modelling for the discrepancy between observed data and simulated data, hence generalising the synthetic likelihood approach. This allows them to compute the expectation and the variance of the unnormalised ABC posterior, based on plugged-in estimators. From which the authors derive a loss as the expected variance of the acceptance probability (although it is not parameterisation invariant). I am unsure I see the point of this choice in that there is no clear reason for the resulting sequence of parameter choices to explore the support of the posterior distribution in a relatively exhaustive manner. The paper also mentions alternatives where the next parameter is chosen at the location where “the uncertainty of the unnormalised ABC posterior is highest”. Which sounds more pertinent to me. And further avoids integrating out the parameter. I also wonder if ABC mis-specification analysis could apply in this framework since the Gaussian process is most certainly a “wrong” model. (When concluding this post, I realised I had written a similar entry two years ago about the earlier version of the paper!)
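To make the Monte Carlo error above concrete, here is a minimal sketch of my own, not the Gaussian process construction of the paper. It assumes a toy Normal model, a uniform acceptance kernel, and a plain binomial variance for the estimated acceptance probability at each point of a parameter grid. All names and numerical settings (`abc_acceptance`, the grid, the tolerance, the 50 simulations per point) are hypothetical choices for illustration.

```python
import numpy as np

def abc_acceptance(theta, y_obs, eps, n_sim, rng):
    """Estimate the ABC acceptance probability at theta from n_sim
    pseudo-samples, along with the binomial variance of that estimate."""
    x = rng.normal(theta, 1.0, size=n_sim)      # toy model: x ~ N(theta, 1)
    accepted = np.abs(x - y_obs) < eps          # uniform kernel, tolerance eps
    p_hat = accepted.mean()
    var_hat = p_hat * (1.0 - p_hat) / n_sim     # Monte Carlo variance of p_hat
    return p_hat, var_hat

rng = np.random.default_rng(0)
thetas = np.linspace(-3, 5, 17)                 # hypothetical parameter grid
prior = np.exp(-0.5 * (thetas / 3.0) ** 2)      # unnormalised N(0, 3^2) prior

est = np.array([abc_acceptance(t, 1.0, 0.5, 50, rng) for t in thetas])
post_mean = prior * est[:, 0]                   # unnormalised ABC posterior estimate
post_var = prior ** 2 * est[:, 1]               # its plug-in Monte Carlo variance

# Crude analogue of "sample where the uncertainty of the unnormalised
# ABC posterior is highest": pick the grid point with the largest variance.
next_theta = thetas[np.argmax(post_var)]
print(next_theta)
```

With only 50 simulations per grid point, the variance term is far from negligible where the acceptance probability is moderate, which is the regime where a sequential design has something to gain.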

## Archive for prior predictive

## uncertainty in the ABC posterior

Posted in Statistics with tags ABC, Bayesian Analysis, Gaussian processes, misspecified model, Monte Carlo error, prior predictive, synthetic likelihood on July 24, 2019 by xi'an

## leave Bayes factors where they once belonged

Posted in Statistics with tags Bayes factors, Bayesian Analysis, Bayesian decision theory, cross validated, prior comparison, prior predictive, prior selection, The Bayesian Choice, The Beatles, using the data twice, xkcd on February 19, 2019 by xi'an

**I**n the past weeks I have received and read several papers (and X validated entries) where the Bayes factor is used to compare priors. Which does not look right to me, not on the basis of my general dislike of Bayes factors!, but simply because this seems to clash with the (my?) concept of Bayesian model choice and also because data should not play a role in that situation, from being used to select a *prior*, hence at least twice to run the inference, to resorting to a *single* parameter value (namely the one behind the data) to decide between two distributions, to having no asymptotic justification, to eventually favouring the prior concentrated on the maximum likelihood estimator. And more. But I fear that this reticence to test for prior adequacy also extends to the prior predictive, or Box’s p-value, namely the probability under this prior predictive to observe something “more extreme” than the current observation, to quote from David Spiegelhalter.
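For readers unfamiliar with the prior predictive p-value, a minimal sketch under assumptions of my own: a Normal observation x ~ N(θ, 1) with a N(0, τ²) prior, so that the prior predictive is N(0, 1 + τ²) and "more extreme" reduces to |X| ≥ |x_obs|. The function name and the values of `x_obs` and `tau` are hypothetical, picked only for illustration.

```python
import math
import numpy as np

def prior_predictive_pvalue(x_obs, tau, n_sim=100_000, seed=0):
    """Monte Carlo estimate of P(|X| >= |x_obs|) under the prior predictive."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, tau, size=n_sim)    # draw parameters from the prior
    x = rng.normal(theta, 1.0)                  # draw data given each parameter
    return np.mean(np.abs(x) >= abs(x_obs))

x_obs, tau = 2.5, 1.0
p_mc = prior_predictive_pvalue(x_obs, tau)

# Closed form for this toy case: two-sided tail of a N(0, 1 + tau^2).
p_exact = math.erfc(abs(x_obs) / math.sqrt(2.0 * (1.0 + tau ** 2)))
print(p_mc, p_exact)
```

Note that the p-value depends on the data through a single observation and on the prior through τ alone, so a small value can be blamed on either.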

## back to the Bayesian Choice

Posted in Books, Kids, Statistics, University life with tags autoregressive model, Bayesian decision theory, Book, exercises, improper posteriors, improper prior, inverse Gamma distribution, prior predictive, The Bayesian Choice on October 17, 2018 by xi'an

**S**urprisingly (or not?!), I received two requests about some exercises from The Bayesian Choice, one from a group of students from McGill having difficulties solving the above, wondering about the properness of the posterior (but missing the integration of x), to whom I sent back this correction. And another one from the Czech Republic about a difficulty with the term “evaluation” by which I meant (pardon my French!) estimation.

## agent-based models

Posted in Books, pictures, Statistics with tags agent-based models, climate change, data, drug users, Nature, Philadelphia, prior predictive, Sims, simulation model on October 2, 2018 by xi'an

**A**n August issue of Nature I recently browsed [on my NUS trip] contained a news feature on agent-based models applied to understanding the opioid crisis in the US. (With a rather sordid picture of a drug injection in Philadelphia, hence my own picture.)

To create an agent-based model, researchers first ‘build’ a virtual town or region, sometimes based on a real place, including buildings such as schools and food shops. They then populate it with agents, using census data to give each one its own characteristics, such as age, race and income, and to distribute the agents throughout the virtual town. The agents are autonomous but operate within pre-programmed routines — going to work five times a week, for instance. Some behaviours may be more random, such as a 5% chance per day of skipping work, or a 50% chance of meeting a certain person in the agent’s network. Once the system is as realistic as possible, the researchers introduce a variable such as a flu virus, with a rate and pattern of spread based on its real-life characteristics. They then run the simulation to test how the agents’ behaviour shifts when a school is closed or a vaccination campaign is started, repeating it thousands of times to determine the likelihood of different outcomes.

While I am obviously supportive of simulation based solutions, I cannot but express some reservation at the outcome, given that it is the product of the assumptions in the model. In Bayesian terms, this is purely prior predictive rather than posterior predictive. There is no hard data to create “realism”, apart from the census data. (The article also mixes the outcome of the simulation with real data. Or epidemiological data, not yet available according to the authors.)
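A toy illustration of this prior predictive nature, as a sketch of my own rather than anything taken from the Nature feature: agents independently skip work with the quoted 5% daily chance, the simulation is repeated many times, and the resulting distribution of outcomes is entirely driven by the assumed rate. The population size, number of days, number of runs and the "more than 20 days" threshold are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_agents, n_days, n_runs = 1_000, 250, 200      # hypothetical sizes
p_skip = 0.05                                   # assumed daily chance of skipping work

# skipped[r, a] = number of work days skipped by agent a in run r;
# with independent days and agents this is a Binomial(n_days, p_skip) draw.
skipped = rng.binomial(n_days, p_skip, size=(n_runs, n_agents))

# Outcome of interest per run: share of agents skipping more than 20 days.
share_heavy = (skipped > 20).mean(axis=1)

print(share_heavy.mean(), share_heavy.std())
```

No observation enters this computation: changing `p_skip` changes the "likelihood of different outcomes" and nothing in the output can contradict that choice.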

In response to the opioid epidemic, Bobashev’s group has constructed Pain Town — a generic city complete with 10,000 people suffering from chronic pain, 70 drug dealers, 30 doctors, 10 emergency rooms and 10 pharmacies. The researchers run the model over five simulated years, recording how the situation changes each virtual day.

This is not to criticise the use of such tools to experiment with social, medical or political interventions, which practically and ethically cannot be tested in real life, and working with such targeted versions of the Sims game can paradoxically be more convincing when dealing with policy makers. If they do not object to the artificiality of the outcome, as they often do for climate change models. Just from reading this general public article, I thus wonder whether model selection and validation tools are implemented in conjunction with agent-based models…

## JSM 2018 [#3]

Posted in Mountains, Statistics, Travel, University life with tags ABC, Approximate Bayesian computation, Bayesian network, Bayesian p-values, British Columbia, Canada, curse of dimensionality, JSM 2018, prior predictive, pseudo-marginal MCMC, spectral analysis, spike-and-slab prior, stochastic gradient descent, Vancouver, variational Bayes methods on August 1, 2018 by xi'an

**A**s I skipped day #2 for climbing, here I am on day #3, attending JSM 2018, with a [fully Canadian!] session on (conditional) copula (where Bruno Rémillard talked of copulas for mixed data, with unknown atoms, which sounded like an impossible target!), and another on four highlights from Bayesian Analysis (the journal), with Maria Terres defending the (often ill-considered!) spectral approach within Bayesian analysis, modelling spectral densities (Fourier transforms of correlation functions, not probability densities), an advantage compared with MCAR modelling being the automated derivation of dependence graphs. While the spectral ghost did not completely dissipate for me, the use of DIC that she mentioned at the very end seems to call for investigation as I do not know of well-studied cases of complex dependent data with clearly specified DICs. Then Chris Drovandi was speaking of ABC being used for prior choice, an idea I vaguely remember seeing quite a while ago as a referee (or another paper!), in a BA paper that I missed (and obviously did not referee). Using the same reference table works (for simple ABC) with different datasets but also different priors. I did not get at first the notion that the reference table also produces an evaluation of the marginal distribution but indeed the entire simulation from prior x generative model gives a Monte Carlo representation of the marginal, hence the evidence at the observed data. Borrowing from Evans’ fringe Bayesian approach to model choice by prior predictive check for prior-model conflict. I remain sceptical or at least agnostic on the notion of using data to compare priors. And here on using ABC in tractable settings.
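To spell out how a single reference table can deliver the evidence under several priors, here is a minimal sketch resting on my own assumptions and not on the talk: a Normal toy model, a uniform kernel of width ε around the observation, and importance reweighting of the table by the ratio of prior densities. The reweighting step is my guess at one way of recycling the table, and every name and setting below is hypothetical.

```python
import numpy as np

def normal_pdf(x, sd):
    """Density of a centred Normal with standard deviation sd."""
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(1)
n, eps, y_obs = 200_000, 0.1, 1.0
sd_ref, sd_alt = 3.0, 1.0                       # reference and alternative prior scales

# Reference table: parameters from the reference prior, data from the model.
theta = rng.normal(0.0, sd_ref, size=n)
x = rng.normal(theta, 1.0)                      # toy model: x ~ N(theta, 1)
close = np.abs(x - y_obs) < eps                 # simulations landing near the data

# Kernel estimate of the marginal density (evidence) at y_obs, reference prior.
evid_ref = close.mean() / (2 * eps)

# Same table, alternative prior: reweight by the ratio of prior densities.
w = normal_pdf(theta, sd_alt) / normal_pdf(theta, sd_ref)
evid_alt = np.mean(w * close) / (2 * eps)

# Closed-form marginals N(0, 1 + sd^2) for comparison in this tractable case.
exact_ref = normal_pdf(y_obs, np.sqrt(1 + sd_ref ** 2))
exact_alt = normal_pdf(y_obs, np.sqrt(1 + sd_alt ** 2))
print(evid_ref, exact_ref, evid_alt, exact_alt)
```

The ratio `evid_alt / evid_ref` is then an ABC approximation of the Bayes factor between the two priors, which is exactly the use of data I remain agnostic about.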

The afternoon session was [a mostly Australian] Advanced Bayesian computational methods, with Robert Kohn on variational Bayes, with an interesting comparison of (exact) MCMC and (approximate) variational Bayes results for some species intensity and the remark that forecasting may be much more tolerant to the approximation than estimation. Making me wonder about the possibility of assessing VB on the marginals manageable by MCMC. Unless I miss a complexity such that the decomposition is impossible. And Antonietta Mira on time-evolving networks estimated by ABC (which Anto first showed me in Orly airport, waiting for her plane!). With a possibility of a zero distance. Next talk by Nadja Klein on implicit copulas, linked with shrinkage properties I was unaware of, including the case of spike & slab copulas. Michael Smith also spoke of copulas with discrete margins, mentioning a version with continuous latent variables (as I thought could be done during the first session of the day), then moving to variational Bayes which sounds quite popular at JSM 2018. And David Gunawan made a presentation of a paper mixing pseudo-marginal Metropolis with particle Gibbs sampling, written with Chris Carter and Robert Kohn, making me wonder at their feature of using the white noise as an auxiliary variable in the estimation of the likelihood, which is quite clever but seems to go against the validation of the pseudo-marginal principle. *(Warning: I have been known to be wrong!)*