On December 10, I will alas not travel to London to attend the Read Paper on sequential quasi-Monte Carlo presented by Mathieu Gerber and Nicolas Chopin to The Society, as I fly instead to Montréal for the NIPS workshops… I am quite sorry to miss this event, as this is a major paper which brings quasi-Monte Carlo methods into mainstream statistics. I will most certainly write a discussion and remind Og’s readers that contributed (800 words) discussions are welcome from everyone, the deadline for submission being January 02.
Archive for MCQMC
Another paper addressing the estimation of the normalising constant and the wealth of available solutions just came out on arXiv, with the full title of “Target density normalization for Markov chain Monte Carlo algorithms”, written by Allen Caldwell and Chang Liu. (I became aware of it by courtesy of Ewan Cameron, as it appeared in the physics section of arXiv. It is actually a wee bit annoying that papers in the subcategory “Data Analysis, Statistics and Probability” of physics do not get an automated reposting on the statistics lists…)
In this paper, the authors compare three approaches to the problem of finding

$$Z=\int f(x)\,\text{d}x$$

when the density f is unnormalised, i.e., in more formal terms, when f is proportional to a probability density (and available):
- an “arithmetic mean”, which is an importance sampler based on reducing the integration volume to a neighbourhood ω of the global mode. This neighbourhood is chosen as a hypercube and the importance function turns out to be the uniform over this hypercube. The corresponding estimator is then a rescaled version of the average of f over uniform simulations in ω.
- a “harmonic mean”, of all choices!, with again an integration over the neighbourhood ω of the global mode in order to avoid the almost sure infinite variance of harmonic mean estimators.
- a Laplace approximation, using the target at the mode and the Hessian at the mode as well.
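To make the three estimators above concrete, here is a minimal sketch on a toy problem, not the authors' code. It writes Z for the normalising constant and relies on two identities: Z equals the integral of f over ω divided by the target probability of ω, and the target expectation of 1_ω(x)/f(x) equals Vol(ω)/Z. The Gaussian kernel target, the function names, the side Δ=2 and the use of exact normal draws as a stand-in for MCMC output are all assumptions made for illustration.

```python
import math
import random

def f(x):
    """Unnormalised toy target: Gaussian kernel, so Z = (2*pi)**(d/2)."""
    return math.exp(-0.5 * sum(xi * xi for xi in x))

def in_cube(x, centre, delta):
    """True when x lies in the hypercube of side delta around centre."""
    return all(abs(xi - ci) <= delta / 2 for xi, ci in zip(x, centre))

def arithmetic_mean(sample, centre, delta, n_unif, rng):
    """Z = (integral of f over omega) / P(omega), both parts estimated."""
    volume = delta ** len(centre)
    total = 0.0
    for _ in range(n_unif):
        # Uniform draw in the hypercube omega.
        u = [c + delta * (rng.random() - 0.5) for c in centre]
        total += f(u)
    integral_omega = volume * total / n_unif
    # Frequency of the (MCMC) sample falling in omega: the Binomial part.
    freq = sum(in_cube(x, centre, delta) for x in sample) / len(sample)
    return integral_omega / freq

def harmonic_mean(sample, centre, delta):
    """Uses E[1_omega(x) / f(x)] = Vol(omega) / Z under the target."""
    volume = delta ** len(centre)
    inv = sum(1.0 / f(x) for x in sample if in_cube(x, centre, delta))
    return volume * len(sample) / inv

def laplace(mode, det_neg_log_hessian):
    """Z ~ f(mode) * (2*pi)**(d/2) / sqrt(det of the Hessian of -log f)."""
    d = len(mode)
    return f(mode) * (2 * math.pi) ** (d / 2) / math.sqrt(det_neg_log_hessian)

rng = random.Random(0)
d, delta, centre = 2, 2.0, [0.0, 0.0]
# Stand-in for MCMC output (assumption): exact draws from the normalised target.
sample = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(20000)]

z_true = (2 * math.pi) ** (d / 2)
z_am = arithmetic_mean(sample, centre, delta, 20000, rng)
z_hm = harmonic_mean(sample, centre, delta)
z_la = laplace(centre, 1.0)  # identity Hessian for this toy target
print(z_true, z_am, z_hm, z_la)
```

On this Gaussian toy target the Laplace value is exact by construction, which is precisely why it says little about how the approximation behaves on non-Gaussian densities.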
The paper then goes on to compare those three solutions on a few examples, demonstrating how the diameter of the hypercube can be calibrated towards a minimum (estimated) uncertainty. The rather anticlimactic conclusion is that the arithmetic mean is the most reliable solution, as harmonic means may fail in larger dimensions and, more importantly, fail to signal their failure, while Laplace approximations only approximate well quasi-Gaussian densities…
What I find most interesting in this paper is the idea of using only one part of the integration space to compute the integral, even though it is not exactly new. Focussing on a specific region ω has pros and cons, the pros being that the reduction to a modal region reduces the need for absolute MCMC convergence, helps in selecting alternative proposals, and also protects against the worst consequences of using a dreaded harmonic mean, the cons being that the region needs to be well-identified, which means requirements on the MCMC kernel, and that the estimate is a product of two estimates, the frequency being driven by a Binomial noise. I also like very much the idea of calibrating the diameter Δ of the hypercube ex-post by estimating the uncertainty.
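The trade-off behind this calibration can be written as a back-of-the-envelope, first-order (delta method) sketch. The notation is not the paper's: Z is the normalising constant, p_ω the target mass of ω, d the dimension, M the number of uniform draws u_j in ω, N the MCMC sample size with effective sample size N_eff, and the two estimates are assumed independent.

$$
Z=\frac{I_\omega}{p_\omega},\qquad I_\omega=\int_\omega f(x)\,\text{d}x,\qquad
\hat Z=\frac{\hat I_\omega}{\hat p_\omega},\qquad
\hat I_\omega=\frac{\Delta^d}{M}\sum_{j=1}^M f(u_j),\qquad
\hat p_\omega=\frac{1}{N}\sum_{i=1}^N \mathbb{I}_\omega(x_i)
$$

$$
\frac{\text{var}(\hat Z)}{Z^2}\approx
\frac{\text{var}_U(f)}{M\,\bar f_\omega^{\,2}}+\frac{1-p_\omega}{N_{\text{eff}}\,p_\omega},
\qquad \bar f_\omega=\frac{I_\omega}{\Delta^d}
$$

Here var_U(f) denotes the variance of f(U) for U uniform over ω. Shrinking Δ flattens f over ω and reduces the first term, but it also shrinks p_ω and inflates the Binomial term, which is why an intermediate diameter can minimise the estimated uncertainty.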
As an aside, the paper mentions most of the alternative solutions I just presented in my Monte Carlo graduate course two days ago (like nested or bridge or Rao-Blackwellised sampling, including our proposal with Darren Wraith), but dismisses them as not “directly applicable in an MCMC setting”, i.e., without modifying this setting. I unsurprisingly dispute this labelling, both because something like the Laplace approximation requires extra work on the MCMC output (and once done this work can lead to advanced Laplace methods like INLA) and because other methods could be considered as well (for instance, bridge sampling over several hypercubes). As shown in the recent paper by Mathieu Gerber and Nicolas Chopin (soon to be discussed at the RSS!), MCqMC has also become a feasible alternative that would compete well with the methods studied in this paper.
Overall, this is a paper that comes in a long list of papers on constant approximations. I do not find the Markov chain of MCMC aspect particularly compelling or specific, once the effective sample size is accounted for. It would be nice to find generic ways of optimising the visit to the hypercube ω and to estimate efficiently the weight of ω. The comparison is solely run over examples, but they all rely on a proper characterisation of the hypercube and the ability to simulate efficiently f over that hypercube.
Richard Wilkinson arXived a paper on accelerated ABC during MCMSki 4, a paper that I almost missed when quickly perusing the daily list. This is another illustration of the “invasion of Gaussian processes” in ABC settings. Maybe under the influence of machine learning.
The paper starts with a link to the synthetic likelihood approximation of Wood (2010, Nature), as in Richard Everitt’s talk last week. Richard (W.) presents the generalised ABC as a kernel-based acceptance probability, using a kernel π(y|x), where y is the observed data and x=x(θ) the simulated one. He proposes a Gaussian process modelling for the log-likelihood (at the observed data y), with a quadratic (in θ) mean and Matérn covariance matrix. Hence the connection with Wood’s synthetic likelihood. Another connection is with Nicolas’ talk on QMC(MC): the θ’s are chosen following a Sobol sequence “in order to minimize the number of design points”. Which requires a reparameterisation to [0,1]^p… I find this “uniform” exploration of the whole parameter space delicate to envision in complex parameter spaces and realistic problems, since the likelihood is highly concentrated on a tiny subregion of the original [0,1]^p. Not to mention the issue of the spurious mass on the boundaries of the hypercube possibly induced by the change of variable. The sequential algorithm of Richard also attempts to eliminate implausible zones of the parameter space, i.e., zones where the likelihood is essentially zero. My worries with this interesting notion are that (a) the early Gaussian process approximations may be poor and hence exclude zones they should not; (b) all Gaussian process approximations at all iterations must be saved; (c) the Sobol sequences apply to the whole [0,1]^p at each iteration but the non-implausible region shrinks at each iteration, which induces a growing inefficiency in the algorithm. The Sobol sequence should be restricted to the previous non-implausible zone.
Overall, an interesting proposal that would need more prodding to understand whether or not it is robust to poor initialisation and complex structures. And a proposal belonging to the estimated likelihood branch of ABC, which makes use of the final Gaussian process approximation to run an MCMC algorithm. Without returning to pseudo-data simulation, replacing it with log-likelihood simulation.
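Worry (c) can be illustrated with a small sketch. To keep it dependency-free it uses a Halton sequence, not the Sobol sequence of the paper, as the space-filling design on [0,1]^p. The centred boxes standing in for successive non-implausible zones, their side lengths and the function names are all hypothetical.

```python
def radical_inverse(n, base):
    """Van der Corput radical inverse of the integer n in the given base."""
    inv, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        inv += digit / denom
    return inv

def halton(n_points, bases):
    """First n_points of the Halton sequence, one prime base per coordinate."""
    return [[radical_inverse(i, b) for b in bases]
            for i in range(1, n_points + 1)]

def fraction_in_box(points, side):
    """Share of design points in the box of given side centred at 0.5."""
    lo, hi = 0.5 - side / 2, 0.5 + side / 2
    inside = sum(all(lo <= x <= hi for x in pt) for pt in points)
    return inside / len(points)

design = halton(2000, bases=(2, 3, 5))  # p = 3
# Hypothetical shrinking non-implausible zones, from the full cube inward.
sides = [1.0, 0.5, 0.25, 0.1]
fractions = [fraction_in_box(design, s) for s in sides]
for s, frac in zip(sides, fractions):
    print(f"side {s}: share of design points kept = {frac:.4f}")
```

Since the design fills the whole cube evenly, the share of useful points tracks the volume of the retained zone, side^p, so most likelihood evaluations are wasted once the zone is small, unless the sequence is mapped onto that zone.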
“These algorithms sample space randomly and naively and do not learn from previous simulations”
The above criticism is moderated in a footnote about ABC-SMC using the “current parameter value to determine which move to make next [but] parameters visited in previous iterations are not taken into account”. I still find it excessive in that SMC algorithms and in particular ABC-SMC algorithms are completely free to use the whole past to build the new proposal. This was clearly enunciated in our earlier population Monte Carlo papers. For instance, the complete collection of past particles can be recycled by recomputing their weights through our AMIS algorithm, as illustrated by Jukka Corander in one genetics application.
As astute ‘Og’s readers may have gathered (!), I am now in Annecy, Savoie, for the 9th IMACS seminar on Monte Carlo Methods. Where I was kindly invited to give a talk on ABC. IMACS stands for “International Association for Mathematics and Computers in Simulation” and the conference gathers themes and sensibilities I am not familiar with. And very few statisticians. For instance, I attended a stochastic particle session that had nothing to do with my understanding of particle systems (except for Pierre Del Moral’s mean field talk). The overall focus seems to stand much more around SDEs and quasi-Monte Carlo methods. Both items for which I have a genuine interest but little background, so I cannot report much on the talks I have attended beyond reporting their title. I for instance discovered the multilevel Monte Carlo techniques for SDEs, which sounds like a control variate methodology to reduce the variance w/o reducing the discretisation step. (Another instance is that the proceedings will be published in Mathematics and Computers in Simulation or Monte Carlo Methods and Applications. Two journals I have never published in.) Although I have yet a few months before attending my first MCQMC conference, I presume this is somehow a similar spirit and mix of communities.
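For readers equally new to multilevel Monte Carlo, the control variate reading can be sketched on a toy case. The example below is an assumption-laden illustration, not taken from any of the talks: a geometric Brownian motion discretised by the Euler scheme, with hypothetical parameter values and sample sizes. Each level couples a fine path with a coarse path driven by the same Brownian increments, so the correction between levels has a small variance and needs few simulations.

```python
import math
import random

def euler_pair(level, rng, s0=1.0, r=0.05, sigma=0.2, t=1.0):
    """Return (fine, coarse) Euler endpoints driven by the same Brownian path.

    The fine path uses 2**level steps; the coarse path uses half as many and
    sums pairs of fine increments. At level 0 the coarse value is set to 0.
    """
    n_fine = 2 ** level
    h = t / n_fine
    fine, coarse = s0, s0
    pending = 0.0  # sum of fine increments not yet used by the coarse path
    for step in range(n_fine):
        dw = rng.gauss(0.0, math.sqrt(h))
        fine += r * fine * h + sigma * fine * dw
        pending += dw
        if level > 0 and step % 2 == 1:
            coarse += r * coarse * 2 * h + sigma * coarse * pending
            pending = 0.0
    return fine, (coarse if level > 0 else 0.0)

def mlmc_mean(max_level, n_samples, rng):
    """Telescoping estimate: E[P_L] = E[P_0] + sum over l of E[P_l - P_(l-1)]."""
    estimate = 0.0
    for level in range(max_level + 1):
        total = 0.0
        for _ in range(n_samples[level]):
            fine, coarse = euler_pair(level, rng)
            total += fine - coarse
        estimate += total / n_samples[level]
    return estimate

rng = random.Random(1)
# Many cheap coarse simulations, few expensive fine ones (hypothetical sizes).
n_samples = [20000, 5000, 2500, 1200, 600]
mlmc_estimate = mlmc_mean(4, n_samples, rng)
exact_mean = math.exp(0.05)  # E[S_T] = s0 * exp(r * t) for this model
print(mlmc_estimate, exact_mean)
```

The coarse path plays the role of the control variate for the fine one: most of the variance is paid at level 0, where one simulation costs a single Euler step.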
At another level, attending a conference in Annecy is a blessing: the city is beautiful, the lake pristine and tantalising in the hot weather, and the surrounding mountains (we are actually quite close to Chamonix!) induce me to go running on both mornings and evenings.
Just received today the announcement for the next MCQMC meeting, which will be the 10th International Conference on Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing (MCQMC 2012), taking place in Sydney from February 13 to 17, 2012.
MCQMC is a biennial conference devoted to Monte Carlo and quasi-Monte Carlo methods and their interactions and applications. (In brief, quasi-Monte Carlo methods replace the random choices that characterize the Monte Carlo method by well chosen deterministic choices.) For more information, click on the “Background” tab on the web site. This will be the first MCQMC conference to be held in the southern hemisphere. (Northerners may like to be reminded that February is summertime in Sydney!)
The invited plenary speakers for MCQMC are
* Pierre Del Moral (INRIA & University of Bordeaux 1, France)
* Mike Giles (University of Oxford, UK)
* Fred J. Hickernell (Illinois Institute of Technology, USA)
* Aicke Hinrichs (University of Jena, Germany)
* Michael Lacey (Georgia Institute of Technology, USA)
* Kerrie Mengersen (Queensland University of Technology, Australia)
* Andreas Neuenkirch (University of Kaiserslautern, Germany)
* Art B. Owen (Stanford University, USA)
* Leszek Plaskota (University of Warsaw, Poland)
* Eckhard Platen (University of Technology Sydney, Australia)
Proposals for special sessions should contain
* a short description of the theme and scope of the session
* the name(s) of the organizer(s)
* the names of four speakers (preferably from different institutions) who have agreed to participate
The deadline for submitting proposals is July 1, 2011. There will be a later call for contributed talks.
Please join us for MCQMC