Archive for JCGS

ABC by QMC

Posted in Books, Kids, Statistics, University life on November 5, 2018 by xi’an

A paper by Alexander Buchholz (CREST) and Nicolas Chopin (CREST) on quasi-Monte Carlo methods for ABC is going to appear in the Journal of Computational and Graphical Statistics. I had missed it when it was first posted on arXiv and only became aware of its contents when reviewing Alexander’s thesis for the doctoral school. Since the parameters are simulated (in ABC) from a prior that is quite generally a standard distribution, while the pseudo-observations are simulated from a complex distribution (the very reason for the intractability of the likelihood), quasi-Monte Carlo sequences can in general only be used for the first part.
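To make this first part concrete, here is a minimal R sketch (mine, not the authors’ implementation) of ABC rejection where the prior draws come from a low-discrepancy van der Corput sequence pushed through the prior quantile function, while the pseudo-data remain pseudo-random; the toy Normal model and the function names are my own choices.

# van der Corput sequence in base b: low-discrepancy points in (0,1)
vdc <- function(n, base = 2) {
  sapply(1:n, function(i) {
    x <- 0; f <- 1 / base
    while (i > 0) { x <- x + f * (i %% base); i <- i %/% base; f <- f / base }
    x
  })
}

# ABC rejection with QMC on the prior side only: theta ~ N(0,1) prior,
# pseudo-data generated by ordinary pseudo-random draws, distance on the mean
abc_qmc <- function(y, N = 1e4, eps = 0.1) {
  theta  <- qnorm(vdc(N))                 # QMC uniforms mapped through the prior quantile
  pseudo <- sapply(theta, function(th) mean(rnorm(length(y), th, 1)))
  keep   <- abs(pseudo - mean(y)) <= eps
  list(sample = theta[keep], Z_hat = mean(keep))   # Z_hat estimates Z(eps)
}

out <- abc_qmc(rnorm(50, 1.5, 1))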

The ABC context studied there is close to the original ABC rejection scheme [as opposed to SMC and importance versions], the main difference being the use of M pseudo-observations (each of the same size as the initial data) instead of one. This repeated version has been discussed and abandoned in a strict Monte Carlo framework in favor of M=1, as it increases the overall variance, but the paper uses it to show that the multiplication of pseudo-observations in a quasi-Monte Carlo framework does not increase the variance of the estimator. (Since the variance apparently remains constant once the generation time of the pseudo-data is taken into account, one can however dispute the interest of this multiplication, except to produce a constant-variance estimator for some targets, or to be used for convergence assessment.) The article also covers the bias correction solution of Lee and Łatuszyński (2014).
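For the record, the repeated version amounts to estimating the acceptance probability at a given θ by an average over M pseudo-datasets, as in this toy sketch (my notation and toy Normal model, not the paper’s code):

# acceptance probability at theta estimated from M pseudo-datasets
abc_lik_hat <- function(theta, y, M = 10, eps = 0.1) {
  d <- replicate(M, abs(mean(rnorm(length(y), theta, 1)) - mean(y)))
  mean(d <= eps)   # unbiased for P(distance <= eps | theta) whatever the value of M
}
abc_lik_hat(1.5, rnorm(50, 1.5, 1), M = 25)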

Due to the simultaneous presence of pseudo-random and quasi-random sequences in the approximations, the authors use the notion of mixed sequences, for which they extend a one-dimensional central limit theorem. The focus of the paper on the estimation of Z(ε), the normalization constant of the ABC density, i.e., the predictive probability of accepting a simulation, which can be estimated at a rate of O(N⁻¹) where N is the number of QMC simulations, is a wee bit puzzling, as I cannot figure out the relevance of this constant (a function of ε), especially since the result does not seem to generalize directly to other ABC estimators.
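In my notation (which may differ from the paper’s), the constant in question is

Z(\varepsilon) = \int \pi(\theta)\,\mathbb{P}\big(d(S(x),S(y)) \le \varepsilon \mid \theta\big)\,\mathrm{d}\theta,

that is, the prior predictive probability of accepting a simulation, which the raw acceptance rate of the rejection scheme estimates for free.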

The second half of the paper considers a sequential version of ABC, as in ABC-SMC and ABC-PMC, where the proposal distribution is based on a Normal mixture with a small number of components, estimated from the (particle) sample of the previous iteration. Even though efficient techniques for estimating this mixture are available, this innovative step requires a computation time that should be taken into account in the comparisons. The construction of the decreasing sequence of tolerances ε also seems pushed beyond and below what a sequential approach like that of Del Moral, Doucet and Jasra (2012) would produce, apparently on the justification that lower tolerances are always preferable. This is not necessarily the case, as recent papers by Li and Fearnhead (2018a, 2018b) and our own (Frazier et al., 2018) have shown. Overall, since ABC methods are large consumers of simulation, it is interesting to see how the use of QMC sequences translates into variance reduction, and to hope that appropriate packages get added for standard distributions. However, since the most costly part of the algorithm is, in most cases, the simulation of the pseudo-data, it would seem that the most relevant focus should be on QMC add-ons for this part, which may be feasible for models with a huge number of standard auxiliary variables, as for instance in population evolution.
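For comparison, the adaptive choice I have in mind from Del Moral, Doucet and Jasra (2012) sets each new tolerance as a quantile of the distances attached to the current particle sample, along the lines of this simplified sketch (names and defaults are mine):

# adaptive tolerance schedule: the next epsilon is the alpha-quantile of the
# distances carried by the current particles (simplified version)
next_tolerance <- function(distances, alpha = 0.5, eps_min = 0) {
  max(quantile(distances, probs = alpha, names = FALSE), eps_min)
}
dists <- rexp(1e3)                     # placeholder distances from the previous iteration
next_tolerance(dists, alpha = 0.5)     # e.g., keeping half of the particles alive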

down-under ABC paper accepted in JCGS!

Posted in Books, pictures, Statistics, University life on October 25, 2018 by xi’an

Great news: the ABC paper we had originally started in 2012 in Melbourne with Gael Martin and Brendan McCabe, before joining forces with David Frazier and Worapree Maneesoonthorn and expanding its scope to using auxiliary likelihoods to run ABC in state-space models, just got accepted by the Journal of Computational and Graphical Statistics. A reason to celebrate with a Mornington Peninsula Pinot Gris next time I visit Monash!

Gibbs for incompatible kids

Posted in Books, Statistics, University life on September 27, 2018 by xi’an

In continuation of my earlier post on Bayesian GANs, which resort to strongly incompatible conditionals, I read a 2015 paper by Chen and Ip that I had missed. (It was published in the Journal of Statistical Computation and Simulation, which I first confused with JCGS and which I do not know at all; actually, when looking at its editorial board, I recognised only one name.) But the study therein is quite disappointing and of little help, as it considers Markov chains on finite state spaces, meaning that the transition distributions are matrices, and hence that convergence is ensured whenever these matrices have no null probability entry. And while the paper is motivated by realistic situations where incompatible conditionals can reasonably appear, it only produces illustrations on two- and three-state Markov chains. Not that helpful, in the end… The game is still afoot!
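As a reminder of how small these examples are, here is a quick R rendering (entirely mine, not from the paper) of a systematic-scan Gibbs kernel built from two arbitrary, possibly incompatible, binary conditionals, with the positivity check that guarantees convergence:

# conditionals P(X | Y) and P(Y | X) on {1,2}, rows indexed by the conditioning value
pXgivenY <- matrix(c(0.9, 0.1,
                     0.3, 0.7), nrow = 2, byrow = TRUE)
pYgivenX <- matrix(c(0.2, 0.8,
                     0.6, 0.4), nrow = 2, byrow = TRUE)
states <- expand.grid(x = 1:2, y = 1:2)
K <- matrix(0, 4, 4)
for (i in 1:4) for (j in 1:4) {
  # move from (x, y) to (x', y'): draw x' ~ P(. | y), then y' ~ P(. | x')
  K[i, j] <- pXgivenY[states$y[i], states$x[j]] * pYgivenX[states$x[j], states$y[j]]
}
all(K > 0)                       # no null entry: the Gibbs chain is ergodic
Re(eigen(t(K))$vectors[, 1])     # stationary distribution, up to normalisation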

weakly informative reparameterisations

Posted in Books, pictures, R, Statistics, University life on February 14, 2018 by xi’an

Our paper, weakly informative reparameterisations of location-scale mixtures, with Kaniav Kamary and Kate Lee, got accepted by JCGS! Great news, which comes in perfect timing for Kaniav as she is currently applying for positions. The paper proposes a Bayesian modelling of unidimensional location-scale mixtures based on constraints on the first and second moments of the mixture, since these constraints turn the remainder of the parameter space into a compact set. While we had already developed an associated R package, Ultimixt, the current editorial policy of JCGS requires the R code used to produce all results to be attached to the submission, and it took us a few more weeks than it should have to produce a directly executable code, due to internal library incompatibilities. (For this entry, I was looking for a link to our special JCGS issue with my picture of Edinburgh but realised I did not have this picture.)
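As a reminder (in my notation, which need not be the paper’s), for a K-component mixture with weights p_i, locations μ_i and scales σ_i, fixing the overall mean μ and variance σ² of the mixture amounts to imposing

\mu = \sum_{i=1}^K p_i \mu_i \qquad\text{and}\qquad \sigma^2 = \sum_{i=1}^K p_i\{\sigma_i^2 + (\mu_i - \mu)^2\},

so that, once (μ, σ) is given its own weakly informative prior, the remaining component parameters vary over a compact set.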

scalable Bayesian inference for the inverse temperature of a hidden Potts model

Posted in Books, R, Statistics, University life on April 7, 2015 by xi’an

Matt Moores, Tony Pettitt, and Kerrie Mengersen arXived a paper yesterday comparing different computational approaches to the processing of hidden Potts models and of the intractable normalising constant in the Potts model. This is a very interesting paper, first because it provides a comprehensive survey of the main methods used in handling this annoying normalising constant Z(β), namely pseudo-likelihood, the exchange algorithm, path sampling (a.k.a. thermodynamic integration), and ABC. A massive simulation experiment, with individual simulation times up to 400 hours, leads to selecting path sampling (what else?!) as the (XL) method of choice, thanks to a pre-computation of the expectation of the sufficient statistic, E[S(Z)|β]. I just wonder why the same was not done for ABC, as in the recent Statistics and Computing paper we wrote with Matt and Kerrie. As it happens, I was actually discussing yesterday at Columbia potential, if huge, improvements in processing Ising and Potts models by first approximating the distribution of S(X) for some or all β before launching ABC or the exchange algorithm. (In fact, this is a more generic desideratum for all ABC methods: simulating the summary statistics directly, if approximately, would bring huge gains in computing time, hence possibly in final precision.) Simulating the distribution of the summary and sufficient Potts statistic S(X) reduces to simulating this distribution with a null correlation, as exploited in Cucala and Marin (2013, JCGS, Special ICMS issue). However, there does not seem to be an efficient way to do so, i.e., without resorting to simulating the entire grid X…
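To recall why the pre-computed curve β ↦ E[S(X)|β] is all that path sampling needs, here is a bare-bones R sketch (the identity is standard, the grid and placeholder values are mine):

# path sampling / thermodynamic integration for the Potts normalising constant:
# log Z(beta1) - log Z(beta0) equals the integral of E[S(X) | beta] over [beta0, beta1]
log_z_ratio <- function(beta_grid, ES_grid) {
  # trapezoidal rule over a grid of pre-computed expectations E[S(X) | beta]
  h <- diff(beta_grid)
  sum(h * (head(ES_grid, -1) + tail(ES_grid, -1)) / 2)
}
beta_grid <- seq(0, 1, by = 0.01)
ES_grid   <- 2e3 * pnorm(beta_grid, mean = 0.8, sd = 0.1)  # placeholder for E[S(X)|beta]
log_z_ratio(beta_grid, ES_grid)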

MCMC convergence assessment

Posted in Books, pictures, R, Statistics, Travel, University life on November 28, 2012 by xi’an

Richard Everitt tweeted yesterday about a recent publication in JCGS by Rajib Paul, Steve MacEachern and Mark Berliner on convergence assessment via stratification. (The paper is free-access.) Since this is another clear interest of mine, I had a look at the paper on the train to Besançon. (And wrote this post as a result.)

The idea therein is to compare the common empirical average with a weighted average relying on a partition of the parameter space: restricted means are computed for each element of the partition and then weighted by the probability of that element. Of course, those probabilities are generally unknown and need to be estimated simultaneously. If applied as is, this idea reproduces the original empirical average! So the authors use instead batches of simulations and the corresponding estimates, weighted by the overall estimates of the probabilities, in which case the estimator differs from the original one. The convergence assessment then consists in checking that both estimates are comparable, using for instance Galin Jones’s batch method, since they have the same limiting variance. (I thought we mentioned this damning feature in Monte Carlo Statistical Methods, but I cannot find a trace of it except in my lecture slides…)

The difference between the two estimates is the addition of weights p_in/q_ijn, ratios of estimates of the probability of the i-th element of the partition. This addition thus introduces an extra element of randomness in the estimate, and this is the crux of the convergence assessment. I was slightly worried though by the fact that the weight is in essence a harmonic mean, i.e. 1/q_ijn/Σ q_imn… Could it be that this estimate has no finite variance for a finite sample size? (The proofs in the paper all consider the asymptotic variance, using the delta method.) However, the fact that the weights add up to K alleviates my concerns. Of course, as with other convergence assessments, the method is not fool-proof, in that tiny, isolated, and unsuspected spikes not (yet) visited by the Markov chain cannot be detected via this comparison of averages.
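Here is a quick-and-dirty R rendering of my reading of the comparison (definitely not the authors’ code, and with my own simplifications): split the chain into batches, compute per-batch restricted means over a partition, reweight them by the overall stratum probability estimates, and compare with the plain ergodic average.

strat_compare <- function(chain, breaks, n_batch = 20) {
  K      <- length(breaks) - 1
  strata <- cut(chain, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  p_hat  <- tabulate(strata, nbins = K) / length(chain)    # overall stratum probabilities
  batch  <- rep(1:n_batch, each = ceiling(length(chain) / n_batch))[seq_along(chain)]
  wavg <- mean(sapply(1:n_batch, function(b) {
    xb <- chain[batch == b]; sb <- strata[batch == b]
    mb <- tapply(xb, factor(sb, levels = 1:K), mean)        # restricted means per batch
    sum(p_hat * ifelse(is.na(mb), 0, mb))                   # reweight by overall p_hat
  }))
  c(plain = mean(chain), stratified = wavg)
}
set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.9), n = 1e4))
strat_compare(x, breaks = quantile(x, 0:5 / 5))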

Parallel computation [revised]

Posted in R, Statistics, University life on March 15, 2011 by xi’an

We have now completed our revision of the parallel computation paper and hope to send it to JCGS within a few days. As can be seen from the arXiv version, and given the very positive reviews we received, the changes are minor, mostly focusing on the explanation of the principle and on the argument that it comes essentially for free. Pierre also redrew the graphs in a more compact and nicer way, thanks to the abilities of the ggplot2 package. In addition, Pierre put the R and Python programs used in the paper in a public repository.