Archive for Brisbane

ABC in Svalbard [update]

Posted in Mountains, Statistics, Travel, University life on December 16, 2020 by xi'an

Even though no one can tell at this stage who will be allowed to travel to Svalbard in mid-April 2021, we are maintaining the workshop, which will physically take place as planned in Longyearbyen, with at least a group of volunteers made of researchers from Oslo (since, at the current time, travel between mainland Norway and Svalbard is authorised). The conference room reservation was confirmed yesterday and there are a few hotel rooms pre-booked through Hurtigrutensvlabard.com. Anyone planning to attend just needs to (i) register on the workshop webpage, (ii) book a hotel room for the duration of the workshop (or more), and (iii) reserve a plane ticket, as there are not that many flights planned.

Obviously this option should only attract a few brave souls (from nearby countries). We are thus running at the same time three mirror workshops in Brisbane (QUT), Coventry (University of Warwick), and Grenoble (IMAG & INRIA). Except for Warwick, where the current pandemic restrictions do not allow for a workshop to take place, the mirror workshops will take place in university buildings and be face-to-face (with video connections as well). Julyan Arbel has set up a mirror webpage as well, with a (free) registration deadline of 31 March, the workshop being open to all who can attend. Hopefully enough of us will gather here or there to keep up with the spirit of the earlier ABC workshops. (To make the mirror places truly ABCesque, the third one should have been set in A as Autrans rather than Grenoble!)

improving synthetic likelihood

Posted in Books, Statistics, University life on July 9, 2020 by xi'an

Chris Drovandi gave an after-dinner [QUT time!] talk for the One World ABC webinar on a recent paper he wrote with Jacob Priddle, Scott Sisson and David Frazier. The paper uses a regular MCMC step on a synthetic likelihood approximation to the posterior, or on a (simulation-based) unbiased estimator of it.

By evaluating the variance of the log-likelihood estimator, the authors show that the number of simulations n needs to scale like n²d² to keep the variance under control. And they suggest PCA decorrelation of the summary statistic components as a means to reduce the variance, since it then scales as n²d. Rather idly, I wonder at the final relevance of precisely estimating the (synthetic) likelihood when considering it is not the true likelihood and when the n² part seems more damning. Moving from d² to d seems directly related to the estimation of a full correlation matrix for the Normal synthetic distribution of the summary statistic versus the estimation of a diagonal matrix. The usual complaint that performance highly depends on the choice of the summary statistic also applies here, in particular when its dimension is much larger than the dimension d of the parameter (as in the MA example). Although this does not seem to impact the scale of the variance.
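To make the full versus diagonal covariance point concrete, here is a toy check, which is my own sketch and not the experiment of the paper. It assumes the summary statistics are already uncorrelated standard Normals (so the diagonal version is a legitimate model), and the function names, the dimension d=20, the number of simulations n=30 and the number of replications are all arbitrary choices of mine. It simply compares the Monte Carlo variance of the estimated Gaussian log synthetic likelihood when plugging in a full empirical covariance matrix or only its diagonal.

```python
import numpy as np


def synthetic_loglik(s_obs, sims, diagonal=False):
    """Gaussian log-density of s_obs, with mean and covariance estimated
    from the simulated summaries `sims` (an n x d array)."""
    mu = sims.mean(axis=0)
    diff = s_obs - mu
    d = s_obs.size
    if diagonal:
        # only d variances are estimated
        var = sims.var(axis=0, ddof=1)
        logdet = np.sum(np.log(var))
        quad = np.sum(diff ** 2 / var)
    else:
        # d(d+1)/2 covariance terms are estimated
        cov = np.atleast_2d(np.cov(sims, rowvar=False))
        _, logdet = np.linalg.slogdet(cov)
        quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)


def compare_variances(d=20, n=30, reps=500, seed=0):
    """Variance over `reps` replications of the log synthetic likelihood
    estimate, for the full and the diagonal covariance versions, both
    computed from the same simulated summaries."""
    rng = np.random.default_rng(seed)
    s_obs = np.zeros(d)  # hypothetical observed summary
    full, diag = [], []
    for _ in range(reps):
        sims = rng.standard_normal((n, d))
        full.append(synthetic_loglik(s_obs, sims))
        diag.append(synthetic_loglik(s_obs, sims, diagonal=True))
    return float(np.var(full)), float(np.var(diag))


if __name__ == "__main__":
    v_full, v_diag = compare_variances()
    print(f"full covariance: {v_full:.3f}, diagonal: {v_diag:.3f}")
```

In this toy setting the diagonal estimator should come out noticeably less variable than the full one when n is close to d, which is the regime where estimating a whole covariance matrix hurts most. It says nothing about summaries that remain correlated after the decorrelation step.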

data science [down] under the hood [webinar]

Posted in Statistics on June 21, 2020 by xi'an

nested sampling via SMC

Posted in Books, pictures, Statistics on April 2, 2020 by xi'an

“We show that by implementing a special type of [sequential Monte Carlo] sampler that takes two importance sampling paths at each iteration, one obtains an analogous SMC method to [nested sampling] that resolves its main theoretical and practical issues.”

A paper by Queenslanders Robert Salomone, Leah South, Chris Drovandi and Dirk Kroese that I had missed (and that was recovered by Grégoire after we discussed this possibility with our Master students). On using SMC in nested sampling. What are the difficulties mentioned in the above quote?

  1. Dependence between the simulated samples, since only the offending particle is moved by one or several MCMC steps. (And MultiNest is not a foolproof solution.)
  2. The error due to quadrature is hard to evaluate, with parallelised versions aggravating the error.
  3. There is a truncation error due to the stopping rule when the exact maximum of the likelihood function is unknown.

Not mentioning the Monte Carlo error, of course, which should remain at the 1/√n level.
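For readers who have never coded it, here is a minimal sketch of plain nested sampling (not the SMC version of the paper) on a toy problem of my own choosing: a Uniform(-5,5) prior and a standard Normal density as likelihood, so that the evidence is very close to 0.1. Everything in it (function name, number of live points, tolerance) is an assumption made for illustration. The toy is chosen so that the constrained prior can be simulated exactly, which removes difficulty 1, while the deterministic volumes exp(-i/N) stand for the quadrature of difficulty 2 and the stopping rule plus remainder term for the truncation of difficulty 3.

```python
import numpy as np


def toy_nested_sampling(n_live=400, tol=1e-4, half_width=5.0, seed=0):
    """Plain nested sampling estimate of the evidence for a
    Uniform(-half_width, half_width) prior and a N(0,1) likelihood."""
    rng = np.random.default_rng(seed)

    def lik(t):
        return np.exp(-0.5 * t ** 2) / np.sqrt(2 * np.pi)

    theta = rng.uniform(-half_width, half_width, n_live)
    L = lik(theta)
    Z, X, it = 0.0, 1.0, 0
    # stopping rule: the live points can no longer change Z by more
    # than a fraction tol (source of the truncation error)
    while L.max() * X > tol * Z:
        worst = np.argmin(L)
        # deterministic approximation of the prior volume (quadrature)
        X_new = np.exp(-(it + 1) / n_live)
        Z += L[worst] * (X - X_new)
        # exact draw from the prior constrained to lik > L[worst],
        # possible here because the likelihood decreases with |theta|
        bound = abs(theta[worst])
        theta[worst] = rng.uniform(-bound, bound)
        L[worst] = lik(theta[worst])
        X, it = X_new, it + 1
    # remainder term from the live points at the stopping time
    Z += X * L.mean()
    return Z, it


if __name__ == "__main__":
    Z, it = toy_nested_sampling()
    print(f"evidence estimate {Z:.4f} after {it} iterations")
```

In any realistic problem the exact constrained draw is unavailable and gets replaced by a few MCMC moves started from a surviving particle, which is precisely where the dependence issue of item 1 comes from.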

“Nested Sampling is a special type of adaptive SMC algorithm, where weights are assigned in a suboptimal way.”

The above remark is somewhat obvious for a fixed sequence of likelihood levels and a set of particles at each (ring) level, moved by a Markov kernel with the right stationary target. Constrained to move within the ring, which may prove delicate in complex settings. Such a non-adaptive version is however not realistic and hence both the level sets and the stopping rule need be selected from the existing simulation, respectively as a quantile of the observed likelihood and as a failure to modify the evidence approximation, an adaptation that is a Catch 22, as we already found in the AMIS paper. (AMIS stands for adaptive multiple importance sampling.) To escape the quandary, the authors use both an auxiliary variable (to avoid atoms) and two importance sampling sequences (as in AMIS). And only a single particle with non-zero incremental weight for the (upper level) target. As the full details are a bit fuzzy to me, I hope I can experiment with my (quarantined) students on the full implementation of the method.

“Such cases asides, the question whether SMC is preferable using the TA or NS approach is really one of whether it is preferable to sample (relatively) easy distributions subject to a constraint or to sample potentially difficult distributions.”

A question (why not regular SMC?) I was indeed considering until coming to the conclusion section but did not find it treated in the paper. There is little discussion on the computing requirements either, as it seems the method is more time-consuming than regular nested sampling. (On the personal side, I appreciated very much their “special thanks to Christian Robert, whose many blog posts on NS helped influence this work, and played a large part in inspiring it.”)

robust Bayesian synthetic likelihood

Posted in Statistics on May 16, 2019 by xi'an

David Frazier (Monash University) and Chris Drovandi (QUT) have recently come up with a robustness study of Bayesian synthetic likelihood that somehow mirrors our own work with David. In a sense, Bayesian synthetic likelihood is definitely misspecified from the start in assuming a Normal distribution on the summary statistics. When the data generating process is misspecified, even were the Normal distribution the “true” model or an appropriately converging pseudo-likelihood, the simulation-based evaluation of the first two moments of the Normal is biased. Of course, for a choice of a summary statistic with limited information, the model can still be weakly compatible with the data in that there exists a pseudo-true value of the parameter θ⁰ for which the synthetic mean μ(θ⁰) is the mean of the statistics. (Sorry if this explanation of mine sounds unclear!) Or rather the Monte Carlo estimate of μ(θ⁰) coincides with that mean. The same Normal toy example as in our paper leads to very poor performance in the MCMC exploration of the (unsympathetic) synthetic target. The robustification of the approach as proposed in the paper is to bring in an extra parameter to correct for the bias in the mean, using an additional Laplace prior on the bias to aim at sparsity. Or the same for the variance matrix, towards inflating it. This over-parameterisation of the model obviously prevents the MCMC from getting stuck (when implementing a random walk Metropolis with the target as a scale).
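As a rough illustration of the mean correction, here is a sketch of what the augmented log target could look like for a fixed parameter value, given simulated summaries. This is my own reading and not the authors' code: the function name is hypothetical, the choice to express the bias in units of the estimated standard deviation of each summary is an assumption of mine, and so is the Laplace scale of 0.5.

```python
import numpy as np


def adjusted_log_target(s_obs, sims, gamma, laplace_scale=0.5):
    """Log synthetic likelihood with a shifted mean, plus the log density
    of independent Laplace priors on the shift parameters `gamma`.

    sims is an n x d array of summaries simulated at the current theta;
    the log prior on theta itself is left out of this sketch."""
    mu = sims.mean(axis=0)
    cov = np.atleast_2d(np.cov(sims, rowvar=False))
    sd = np.sqrt(np.diag(cov))
    # assumption: the bias is measured in standard deviation units
    diff = s_obs - (mu + sd * gamma)
    _, logdet = np.linalg.slogdet(cov)
    d = s_obs.size
    loglik = -0.5 * (d * np.log(2 * np.pi) + logdet
                     + diff @ np.linalg.solve(cov, diff))
    # Laplace prior centred at zero, favouring sparse corrections
    logprior = (-np.sum(np.abs(gamma)) / laplace_scale
                - d * np.log(2 * laplace_scale))
    return loglik + logprior


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sims = rng.standard_normal((500, 2))
    s_obs = np.array([0.0, 8.0])  # second summary cannot be matched
    print(adjusted_log_target(s_obs, sims, np.zeros(2)))
    print(adjusted_log_target(s_obs, sims, np.array([0.0, 8.0])))
```

With an observed summary that the model cannot reproduce, the target is much higher when the corresponding component of gamma absorbs the discrepancy than when it is kept at zero, despite the Laplace penalty, which gives the sampler somewhere to go instead of getting stuck.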