Archive for Monash University

robust Bayesian synthetic likelihood

Posted in Statistics on May 16, 2019 by xi'an

David Frazier (Monash University) and Chris Drovandi (QUT) have recently come up with a robustness study of Bayesian synthetic likelihood that somehow mirrors our own work with David. In a sense, Bayesian synthetic likelihood is definitely misspecified from the start in assuming a Normal distribution on the summary statistics. When the data generating process is misspecified, even were the Normal distribution the “true” model or an appropriately converging pseudo-likelihood, the simulation-based evaluation of the first two moments of the Normal is biased. Of course, for a choice of a summary statistic with limited information, the model can still be weakly compatible with the data in that there exists a pseudo-true value of the parameter θ⁰ for which the synthetic mean μ(θ⁰) is the mean of the statistics. (Sorry if this explanation of mine sounds unclear!) Or rather the Monte Carlo estimate of μ(θ⁰) coincides with that mean.

The same Normal toy example as in our paper leads to very poor performances in the MCMC exploration of the (unsympathetic) synthetic target. The robustification of the approach as proposed in the paper is to bring in an extra parameter to correct for the bias in the mean, using an additional Laplace prior on the bias to aim at sparsity. Or the same for the variance matrix towards inflating it. This over-parameterisation of the model obviously keeps the MCMC from getting stuck (when implementing a random walk Metropolis with the target as a scale).
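To make the construction above a bit more concrete, here is a minimal Python sketch of a Gaussian synthetic log-likelihood, with an optional mean shift `gamma` penalised by a Laplace prior. This is only my own illustration, not the code of the paper: the function names, the toy simulator `toy_summaries` (a N(θ,1) sample summarised by its mean and variance), the scaling of the shift by the simulated standard deviations, and the Laplace scale of 0.5 are all assumptions made for the example.

```python
import numpy as np

def synthetic_loglik(theta, s_obs, simulate_summaries, gamma=None, n_sim=200, rng=None):
    """Gaussian synthetic log-likelihood of the observed summaries at theta.

    simulate_summaries(theta, rng) is a user-supplied (hypothetical) simulator
    returning one vector of summary statistics. If gamma is given, the synthetic
    mean is shifted by gamma times the simulated standard deviations (an assumed
    form of mean adjustment, chosen here for illustration).
    """
    rng = np.random.default_rng() if rng is None else rng
    s_obs = np.atleast_1d(np.asarray(s_obs, dtype=float))
    sims = np.array([simulate_summaries(theta, rng) for _ in range(n_sim)], dtype=float)
    sims = sims.reshape(n_sim, -1)
    mu = sims.mean(axis=0)                             # Monte Carlo estimate of mu(theta)
    sigma = np.atleast_2d(np.cov(sims, rowvar=False))  # Monte Carlo estimate of Sigma(theta)
    if gamma is not None:
        mu = mu + np.sqrt(np.diag(sigma)) * np.asarray(gamma, dtype=float)
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(sigma)
    quad = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (diff.size * np.log(2 * np.pi) + logdet + quad)

def laplace_logprior(gamma, scale=0.5):
    """Log-density of independent Laplace(0, scale) priors on the shift components."""
    gamma = np.asarray(gamma, dtype=float)
    return float(np.sum(-np.abs(gamma) / scale - np.log(2 * scale)))

def toy_summaries(theta, rng, n=50):
    """Assumed model for the toy example: n draws from N(theta, 1)."""
    x = rng.normal(theta, 1.0, size=n)
    return np.array([x.mean(), x.var(ddof=1)])

# Observed summaries with a variance the unit-variance model cannot reproduce
s_obs = np.array([1.0, 4.0])
plain = synthetic_loglik(1.0, s_obs, toy_summaries, rng=np.random.default_rng(1))
shifted = synthetic_loglik(1.0, s_obs, toy_summaries, gamma=[0.0, 15.0],
                           rng=np.random.default_rng(1))
print(plain, shifted + laplace_logprior([0.0, 15.0]))
```

In an MCMC run, `theta` and `gamma` would be updated jointly against the sum of the synthetic log-likelihood and the log-priors, the Laplace prior keeping most components of `gamma` near zero unless a summary is clearly incompatible with the model.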

auxiliary likelihood ABC in print

Posted in Statistics on March 1, 2019 by xi'an

Our paper with Gael Martin, Brendan McCabe, David Frazier and Worapree Maneesoonthorn, with full title Auxiliary Likelihood-Based Approximate Bayesian Computation in State Space Models, has now appeared in JCGS. To think that it started in Rimini in 2009, when I met Gael for the first time at the Rimini Bayesian Econometrics conference, although we really started working on the paper in 2012 when I visited Monash, makes me realise the enormous investment we made in this paper, especially by Gael whose stamina and enthusiasm never cease to amaze me!

risk-averse Bayes estimators

Posted in Books, pictures, Statistics on January 28, 2019 by xi'an

An interesting paper came out on arXiv in early December, written by Michael Brand from Monash. It is about risk-averse Bayes estimators, which are defined as avoiding the use of loss functions (although why loss functions should be avoided is not made very clear in the paper). Close to MAP estimates, they bypass the dependence of said MAPs on parameterisation by maximising instead π(θ|x)/√I(θ), which is invariant under reparameterisation if not under a change of dominating measure. This form of MAP estimate is called the Wallace-Freeman (1987) estimator [of which I never heard].
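As a quick sanity check of the invariance claim, here is a small numerical sketch of my own (not taken from the paper). It assumes an Exponential sample of size n=10 with rate λ and a Gamma(12, 6) posterior on λ, both picked purely for illustration, and compares the parameterisations λ and μ=1/λ, for which the Fisher information is n/λ² and n/μ², respectively. In closed form, the MAP is 11/6 in λ but corresponds to λ=13/6 when computed in μ, while maximising π(θ|x)/√I(θ) returns λ=2 both ways.

```python
import numpy as np

# Illustrative (assumed) setting: Gamma(shape, rate) posterior on an Exponential rate
shape, rate, n = 12.0, 6.0, 10

def log_post_lambda(lam):
    # log posterior density of lambda, up to an additive constant
    return (shape - 1.0) * np.log(lam) - rate * lam

def log_post_mu(mu):
    # same posterior expressed in mu = 1/lambda, with Jacobian 1/mu^2
    return log_post_lambda(1.0 / mu) - 2.0 * np.log(mu)

def log_sqrt_fisher_lambda(lam):
    # I(lambda) = n / lambda^2 for n Exponential observations
    return 0.5 * np.log(n) - np.log(lam)

def log_sqrt_fisher_mu(mu):
    # I(mu) = n / mu^2 in the mean parameterisation
    return 0.5 * np.log(n) - np.log(mu)

def argmax_on_grid(f, grid):
    # crude maximiser: evaluate f on a fine grid and return the best point
    return grid[np.argmax(f(grid))]

grid = np.linspace(0.05, 10.0, 400_001)

map_lambda = argmax_on_grid(log_post_lambda, grid)
map_mu = argmax_on_grid(log_post_mu, grid)
wf_lambda = argmax_on_grid(lambda l: log_post_lambda(l) - log_sqrt_fisher_lambda(l), grid)
wf_mu = argmax_on_grid(lambda m: log_post_mu(m) - log_sqrt_fisher_mu(m), grid)

# Report everything on the lambda scale for comparison
print("MAP:", map_lambda, "vs", 1.0 / map_mu)
print("Wallace-Freeman:", wf_lambda, "vs", 1.0 / wf_mu)
```

The grid search is only there to keep the sketch dependency-free; any one-dimensional optimiser would do.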

The formal definition of a risk-averse estimator is still based on a loss function in order to produce a proper version of the probability to be “wrong” in a continuous environment. The difference between estimator and true value θ, as expressed by the loss, is enlarged by a scale factor k pushed to infinity. Meaning that differences not in the immediate neighbourhood of zero are not relevant. In the case of a countable parameter space, this is essentially producing the MAP estimator. In the continuous case, for “well-defined” and “well-behaved” loss functions and estimators and density, including an invariance to parameterisation as in my own intrinsic losses of old (which the author calls likelihood-based loss functions, mentioning f-divergences), the resulting estimator(s) is a Wallace-Freeman estimator (of which there may be several). I did not get very deep into the study of the convergence proof, which seems to borrow more from real analysis à la Rudin than from functional analysis or measure theory, but keep returning to the apparent dependence of the notion on the dominating measure, which bothers me.

down-under ABC paper accepted in JCGS!

Posted in Books, pictures, Statistics, University life on October 25, 2018 by xi'an

Great news! The ABC paper we had originally started in 2012 in Melbourne with Gael Martin and Brendan McCabe, before joining forces with David Frazier and Worapree Maneesoonthorn in expanding its scope to using auxiliary likelihoods to run ABC in state-space models, just got accepted in the Journal of Computational and Graphical Statistics. A reason to celebrate with a Mornington Peninsula Pinot Gris wine next time I visit Monash!

ABC forecasts

Posted in Books, pictures, Statistics on January 9, 2018 by xi'an

My friends and co-authors David Frazier, Gael Martin, Brendan McCabe, and Worapree Maneesoonthorn arXived a paper on ABC forecasting at the turn of the year. ABC prediction is a natural extension of ABC inference in that, provided the full conditional of a future observation given past data and parameters is available but the posterior is not, ABC simulations of the parameters induce an approximation of the predictive. The paper thus considers the impact of this extension on the precision of the predictions. And argues that it is possible that this approximation is preferable to running MCMC in some settings. A first interesting result is that using ABC and hence conditioning on an insufficient summary statistic has no asymptotic impact on the resulting prediction, provided Bayesian concentration of the corresponding posterior takes place as in our convergence paper under revision.

“…conditioning inference about θ on η(y) rather than y makes no difference to the probabilistic statements made about [future observations]”

The above result holds both in terms of convergence in total variation and for proper scoring rules. Even though there is always a loss in accuracy in using ABC. Now, one may think this is a direct consequence of our (and others') earlier convergence results, but numerical experiments on standard time series show the distinct feature that, while the [MCMC] posterior and ABC posterior distributions on the parameters clearly differ, the predictives are more or less identical! With a potential speed gain in using ABC, although comparing parallel ABC versus non-parallel MCMC is rather delicate. For instance, a preliminary parallel ABC could be run as a burnin’ step for parallel MCMC, since all chains would then be roughly in the stationary regime. Another interesting outcome of these experiments is a case when the summary statistic produces a non-consistent ABC posterior, but still leads to a very similar predictive, as shown on this graph. This unexpected accuracy in prediction may further be exploited in state space models, towards producing particle algorithms that are greatly accelerated. Of course, an easy objection to this acceleration is that the impact of the approximation is unknown and un-assessed. However, such an acceleration leaves room for multiple implementations, possibly with different sets of summaries, to check for consistency over replicates.
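The mechanics of ABC prediction described in this post can be sketched in a few lines of Python. The setting below is my own toy choice rather than one taken from the paper: an AR(1) series with unit-variance Normal noise, a uniform prior on the autoregressive coefficient, the lag-one sample autocorrelation as the sole summary, and acceptance of a fixed fraction of the simulations. All function names are hypothetical.

```python
import numpy as np

def simulate_ar1(rho, T, rng):
    """Simulate y_t = rho * y_{t-1} + e_t with e_t ~ N(0, 1) (assumed toy model)."""
    e = rng.normal(size=T)
    y = np.empty(T)
    y[0] = e[0]
    for t in range(1, T):
        y[t] = rho * y[t - 1] + e[t]
    return y

def summary(y):
    """Lag-one sample autocorrelation, used as the (insufficient) summary statistic."""
    yc = y - y.mean()
    return np.dot(yc[1:], yc[:-1]) / np.dot(yc, yc)

def abc_forecast(y_obs, n_sim=2000, keep=0.05, rng=None):
    """ABC accept/reject on rho, followed by one-step-ahead predictive draws."""
    rng = np.random.default_rng() if rng is None else rng
    T = len(y_obs)
    s_obs = summary(y_obs)
    rhos = rng.uniform(-1.0, 1.0, size=n_sim)   # prior draws (assumed U(-1, 1) prior)
    dists = np.array([abs(summary(simulate_ar1(r, T, rng)) - s_obs) for r in rhos])
    n_keep = max(1, int(keep * n_sim))
    accepted = rhos[np.argsort(dists)[:n_keep]]  # closest simulations are kept
    # The full conditional of y_{T+1} given (rho, y_T) is N(rho * y_T, 1),
    # so each accepted rho yields one draw from the approximate predictive.
    y_next = accepted * y_obs[-1] + rng.normal(size=accepted.size)
    return accepted, y_next

rng = np.random.default_rng(0)
y_obs = simulate_ar1(0.7, 200, rng)
accepted, y_next = abc_forecast(y_obs, rng=rng)
print(accepted.mean(), y_next.mean(), y_next.std())
```

Rerunning `abc_forecast` with a different summary is cheap, which is one way of implementing the check of consistency over replicates mentioned above.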

Xi’an cuisine [Xi’an series]

Posted in Statistics on August 26, 2017 by xi'an

David Frazier sent me a picture of another Xi’an restaurant he found near the campus of Monash University. If this CNN webpage on the ten best dishes in Xi’an is to be believed, this will be a must-go restaurant for my next visit to Melbourne! Especially when reading there that Xi’an claims to have xiaolongbao (soup dumplings) that are superior to those in Shanghai!!! (And when considering that I once went on a xiaolongbao rampage in downtown Melbourne.)

model misspecification in ABC

Posted in Statistics on August 21, 2017 by xi'an

With David Frazier and Judith Rousseau, we just arXived a paper studying the impact of a misspecified model on the outcome of an ABC run. This is a question that naturally arises when using ABC, but that has not been directly covered in the literature apart from a recently arXived paper by James Ridgway [that was earlier this month commented on the ‘Og]. On the one hand, ABC can be seen as a robust method in that it focuses on the aspects of the assumed model that are translated by the [insufficient] summary statistics and their expectation. And nothing else. It is thus tolerant of departures from the hypothetical model that [almost] preserve those moments. On the other hand, ABC involves a degree of non-parametric estimation of the intractable likelihood, which may sound even more robust, except that the likelihood is estimated from pseudo-data simulated from the “wrong” model in case of misspecification.

In the paper, we examine how the pseudo-true value of the parameter [that is, the value of the parameter of the misspecified model that comes closest to the generating model in terms of Kullback-Leibler divergence] is asymptotically reached by some ABC algorithms like the ABC accept/reject approach and not by others like the popular linear regression [post-simulation] adjustment. Which surprisingly concentrates posterior mass on a completely different pseudo-true value. Exploiting our recent assessment of ABC convergence for well-specified models, we show the above convergence result for a tolerance sequence that decreases to the minimum possible distance [between the true expectation and the misspecified expectation] at a slow enough rate. Or that the sequence of acceptance probabilities goes to zero at the proper speed. In the case of the regression correction, the pseudo-true value is shifted by a quantity that does not converge to zero, because of the misspecification in the expectation of the summary statistics. This is not immensely surprising but we hence get a very different picture when compared with the well-specified case, when regression corrections bring improvement to the asymptotic behaviour of the ABC estimators. This discrepancy between two versions of ABC can be exploited to seek misspecification diagnoses, e.g. through the acceptance rate versus the tolerance level, or via a comparison of the ABC approximations to the posterior expectations of quantities of interest which should diverge at rate √n. In both cases, ABC reference tables/learning bases can be exploited to draw and calibrate a comparison with the well-specified case.
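For readers who want to play with the two versions of ABC compared above, here is a minimal sketch under assumptions of my own: an assumed model N(θ,1) while the data actually comes from N(1,4), the sample mean and variance as summaries, a N(0,25) prior, acceptance of the closest 1% of simulations, and an unweighted linear regression adjustment (a simplification of the usual kernel-weighted version). The code makes no claim about what the adjusted sample should look like; it merely puts both outputs side by side from the same reference table.

```python
import numpy as np

def summaries(x):
    """Sample mean and (unbiased) sample variance."""
    return np.array([x.mean(), x.var(ddof=1)])

def abc_reject_and_adjust(s_obs, n, n_sim=20000, keep=0.01, rng=None):
    """Accept/reject ABC followed by a linear regression adjustment.

    Returns the accepted parameters, their regression-adjusted versions,
    and the distances of the accepted simulations to the observed summaries.
    """
    rng = np.random.default_rng() if rng is None else rng
    thetas = rng.normal(0.0, 5.0, size=n_sim)              # assumed N(0, 25) prior
    z = rng.normal(thetas[:, None], 1.0, size=(n_sim, n))   # pseudo-data from N(theta, 1)
    sims = np.column_stack([z.mean(axis=1), z.var(axis=1, ddof=1)])
    dists = np.linalg.norm(sims - s_obs, axis=1)
    idx = np.argsort(dists)[: max(3, int(keep * n_sim))]    # keep the closest simulations
    th, s = thetas[idx], sims[idx]
    # Regress accepted parameters on centred summaries (with intercept),
    # then remove the fitted linear effect of the summary discrepancy.
    X = np.column_stack([np.ones(len(idx)), s - s_obs])
    coef, *_ = np.linalg.lstsq(X, th, rcond=None)
    th_adj = th - (s - s_obs) @ coef[1:]
    return th, th_adj, dists[idx]

rng = np.random.default_rng(0)
y = rng.normal(1.0, 2.0, size=100)   # actual data: variance 4, not 1
s_obs = summaries(y)
th, th_adj, d = abc_reject_and_adjust(s_obs, n=100, rng=rng)
print("accept/reject mean:", th.mean(), " adjusted mean:", th_adj.mean())
print("smallest distance reached:", d.min())
```

The smallest distance staying well away from zero, however many simulations are run, is the kind of signal behind the diagnosis based on acceptance rate versus tolerance level.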