**D**avid Frazier (Monash University) and Chris Drovandi (QUT) have recently come up with a robustness study of Bayesian synthetic likelihood that somehow mirrors our own work with David. In a sense, Bayesian synthetic likelihood is definitely misspecified from the start in assuming a Normal distribution on the summary statistics. When the data generating process is misspecified, even were the Normal distribution the “true” model or an appropriately converging pseudo-likelihood, the simulation based evaluation of the first two moments of the Normal is biased. Of course, for a choice of a summary statistic with limited information, the model can still be *weakly compatible* with the data in that there exists a pseudo-true value of the parameter θ⁰ for which the synthetic mean μ(θ⁰) is the mean of the statistics. (Sorry if this explanation of mine sounds unclear!) Or rather the Monte Carlo estimate of μ(θ⁰) coincidences with that mean.The same Normal toy example as in our paper leads to very poor performances in the MCMC exploration of the (unsympathetic) synthetic target. The robustification of the approach as proposed in the paper is to bring in an extra parameter to correct for the bias in the mean, using an additional Laplace prior on the bias to aim at sparsity. Or the same for the variance matrix towards inflating it. This over-parameterisation of the model obviously avoids the MCMC to get stuck (when implementing a random walk Metropolis with the target as a scale).

## Archive for Brisbane

## robust Bayesian synthetic likelihood

Posted in Statistics with tags ABC, Australia, Bayesian synthetic likelihood, Brisbane, industrial ruins, MCMC, Melbourne, Metropolis-Hastings algorithm, misspecified model, Monash University, pseudo-likelihood, QUT, summary statistics, Sydney Harbour on May 16, 2019 by xi'an## unbiased consistent nested sampling via sequential Monte Carlo [a reply]

Posted in pictures, Statistics, Travel with tags auxiliary variable, Brisbane, evidence, marginal likelihood, nested sampling, Og, particle filter, QUT, unbiasedness on June 13, 2018 by xi'an*Rob Salomone sent me the following reply on my comments of yesterday about their recently arXived paper.*

“Which never occurred as the number one difficulty there, as the simplest implementation runs a Markov chain from the last removed entry, independently from the remaining entries. Even stationarity is not an issue sinceI believe that the first occurrence within the level set is distributed from the constrained prior.”

“And then, in a twist that is not clearly explained in the paper, the focus moves to an improved nested sampler that moves one likelihood value at a time, with a particle step replacing a singleparticle. (Things get complicated when several particles may take the very same likelihood value, but randomisation helps.) At this stage the algorithm is quite similar to the original nested sampler. Except for the unbiased estimation of the constants, thefinal constant, and the replacement of exponential weights exp(-t/N) by powers of (N-1/N)”

**is**a special case of SMC (with the weights replaced with a suboptimal choice).

## unbiased consistent nested sampling via sequential Monte Carlo

Posted in pictures, Statistics, Travel with tags auxiliary variable, Brisbane, evidence, marginal likelihood, nested sampling, Og, particle filter, QUT, unbiasedness on June 12, 2018 by xi'an

“Moreover, estimates of the marginal likelihood are unbiased.” (p.2)

Rob Salomone, Leah South, Chris Drovandi and Dirk Kroese (from QUT and UQ, Brisbane) recently arXived a paper that frames the nested sampling in such a way that marginal likelihoods can be unbiasedly (and consistently) estimated.

“Why isn’t nested sampling more popular with statisticians?” (p.7)

A most interesting question, especially given its popularity in cosmology and other branches of physics. A first drawback pointed out in the c is the requirement of independence between the elements of the sample produced at each iteration. Which never occurred as the number one difficulty there, as the simplest implementation runs a Markov chain from the last removed entry, independently from the remaining entries. Even stationarity is not an issue since I believe that the first occurrence within the level set is distributed from the constrained prior.

A second difficulty is the use of quadrature which turns integrand into step functions at random slices. Indeed, mixing Monte Carlo with numerical integration makes life much harder, as shown by the early avatars of nested sampling that only accounted for the numerical errors. (And which caused Nicolas and I to write our critical paper in Biometrika.) There are few studies of that kind in the literature, the only one I can think of being [my former PhD student] Anne Philippe‘s thesis twenty years ago.

The third issue stands with the difficulty in parallelising the method. Except by jumping k points at once, rather than going one level at a time. While I agree this makes life more complicated, I am also unsure about the severity of that issue as k nested sampling algorithms can be run in parallel and aggregated in the end, from simple averaging to something more elaborate.

The final blemish is that the nested sampling estimator has a stopping mechanism that induces a truncation error, again maybe a lesser problem given the overall difficulty in assessing the total error.

The paper takes advantage of the ability of SMC to produce unbiased estimates of a sequence of normalising constants (or of the normalising constants of a sequence of targets). For nested sampling, the sequence is made of the prior distribution restricted to an embedded sequence of level sets. With another sequence restricted to bands (likelihood between two likelihood boundaries). If all restricted posteriors of the second kind and their normalising constant are known, the full posterior is known. Apparently up to the main normalising constant, i.e. the marginal likelihood., *ℨ*, except that it is also the sum of all normalising constants. Handling this sequence by SMC addresses the four concerns of the four authors, apart from the truncation issue, since the largest likelihood bound need be set for running the algorithm.

When the sequence of likelihood bounds is chosen based on the observed likelihoods so far, the method becomes adaptive. Requiring again the choice of a stopping rule that may induce bias if stopping occurs too early. And then, in a twist that is not clearly explained in the paper, the focus moves to an improved nested sampler that moves one likelihood value at a time, with a particle step replacing a single particle. (Things get complicated when several particles may take the very same likelihood value, but randomisation helps.) At this stage the algorithm is quite similar to the original nested sampler. Except for the unbiased estimation of the constants, the final constant, and the replacement of exponential weights exp(-t/N) by powers of (N-1/N).

The remainder of this long paper (61 pages!) is dedicated to practical implementation, calibration and running a series of comparisons. A nice final touch is the thanks to the ‘Og for its series of posts on nested sampling, which “helped influence this work, and played a large part in inspiring it.”

In conclusion, this paper is certainly a worthy exploration of the nested sampler, providing further arguments towards a consistent version, with first and foremost an (almost?) unbiased resolution. The comparison with a wide range of alternatives remains open, in particular time-wise, if evidence is the sole target of the simulation. For instance, the choice of this sequence of targets in an SMC may be improved by another sequence, since changing one particle at a time does not sound efficient. The complexity of the implementation and in particular of the simulation from the prior under more and more stringent constraints need to be addressed.

## positions at QUT stats

Posted in Statistics with tags academic position, Australia, Bayesian computation, Brisbane, data science, Gold Coast, postdoctoral position, Queensland University of Technology, QUT on September 4, 2017 by xi'anChris Drovandi sent me the information that the Statistics Group, QUT, Brisbane, is advertising for three positions:

- Professor in Statistical Data Science (Remuneration package from $AUD206,729 per year)
- Lecturer in Statistical Data Science (Remuneration package of $AUD108,796 to $AUD129,209 per year)
- PhD Scholarship in Computational Bayesian Statistics ($AUD35,000 per year tax-free for 3 years, with an additional $AUD5,000 per year for project costs)

This is a great opportunity, a very active group, and a great location, which I visited several times, so if interested apply before October 1.

## Jeff down-under

Posted in Books, Statistics, Travel, University life with tags AMSI Lecture, Australia, Brisbane, Jeff Rosenthal, MCMC, orange, QUT, SSI on September 9, 2016 by xi'an**J**eff Rosenthal is the AMSI-SSA (Australia Mathematical Sciences Institute – Statistical Society of Australia) lecturer this year and, as I did in 2012, will tour Australia giving seminars. Including this one at QUT. Enjoy, if you happen to be down-under!

## scalable Bayesian inference for the inverse temperature of a hidden Potts model

Posted in Books, R, Statistics, University life with tags ABC, Approximate Bayesian computation, Australia, Brisbane, exchange algorithm, Ising model, JCGS, path sampling, Potts model, pseudo-likelihood, QUT, Statistics and Computing on April 7, 2015 by xi'an**M**att Moores, Tony Pettitt, and Kerrie Mengersen arXived a paper yesterday comparing different computational approaches to the processing of hidden Potts models and of the intractable normalising constant in the Potts model. This is a very interesting paper, first because it provides a comprehensive survey of the main methods used in handling this annoying normalising constant Z(β), namely pseudo-likelihood, the exchange algorithm, path sampling (a.k.a., thermal integration), and ABC. A massive simulation experiment with individual simulation times up to 400 hours leads to select path sampling (what else?!) as the (XL) method of choice. Thanks to a pre-computation of the expectation of the sufficient statistic E[S(Z)|β]. I just wonder why the same was not done for ABC, as in the recent Statistics and Computing paper we wrote with Matt and Kerrie. As it happens, I was actually discussing yesterday in Columbia of potential if huge improvements in processing Ising and Potts models by approximating first the distribution of S(X) for some or all β before launching ABC or the exchange algorithm. (In fact, this is a more generic desiderata for all ABC methods that simulating directly if approximately the summary statistics would being huge gains in computing time, thus possible in final precision.) Simulating the distribution of the summary and sufficient Potts statistic S(X) reduces to simulating this distribution with a null correlation, as exploited in Cucala and Marin (2013, JCGS, Special ICMS issue). However, there does not seem to be an efficient way to do so, i.e. without reverting to simulating the entire grid X…

## Statistical modeling and computation [book review]

Posted in Books, R, Statistics, University life with tags ANU, Australia, Bayesian Essentials with R, Bayesian statistics, Brisbane, Dirk Kroese, introductory textbooks, Joshua Chan, Matlab, maximum likelihood estimation, Monte Carlo methods, Monte Carlo Statistical Methods, R, state space model on January 22, 2014 by xi'an**D**irk Kroese (from UQ, Brisbane) and Joshua Chan (from ANU, Canberra) just published a book entitled *Statistical Modeling and Computation*, distributed by Springer-Verlag (I cannot tell which series it is part of from the cover or frontpages…) The book is intended mostly for an undergrad audience (or for graduate students with no probability or statistics background). Given that prerequisite, *Statistical Modeling and Computation* is fairly standard in that it recalls probability basics, the principles of statistical inference, and classical parametric models. In a third part, the authors cover “advanced models” like generalised linear models, time series and state-space models. The specificity of the book lies in the inclusion of simulation methods, in particular MCMC methods, and illustrations by Matlab code boxes. (Codes that are available on the companion website, along with R translations.) It thus has a lot in common with our *Bayesian Essentials with R*, meaning that I am not the most appropriate or least ~~un~~biased reviewer for this book. Continue reading