## A survey of [the 60’s] Monte Carlo methods

Posted in Books, R, Statistics, University life on May 17, 2011 by xi'an

“The only good Monte Carlos are the dead Monte Carlos” (Trotter and Tukey, quoted by Halton)

When I presented my [partial] history of Monte Carlo methods in Bristol two months ago, at the Julian Besag memorial, Christophe Andrieu mentioned a 1970 SIAM survey by John Halton, A retrospective and prospective survey of the Monte Carlo method. This is a huge paper (62 pages, 251 references) and it sheds a useful light on the advances of the 60’s (the paper was written in 1968). From the reference list, it seems John Halton was planning two books on the Monte Carlo method, but a search on Google did not show anything. I also discovered in this list that there was a 1954 RSS symposium (Read Paper?) on Monte Carlo methods. Note that there were at least two books on Monte Carlo published at the time, Hammersley and Handscomb’s 1964 Monte Carlo Methods and Shreider’s 1966 The Monte Carlo Method. (Hammersley appears as a major figure in this survey.) There is a lot of material in this long review and most of the standard methods are listed: control variates, importance sampling, self-normalised importance sampling, stratified sampling, antithetic variables, and simulation by inversion, rejection, or demarginalisation. Variance reduction is presented as the motivation for the alternative methods. Very little is said about the use of Monte Carlo methods in statistics (“many of [the applications] are primitive and artless“). I was first very surprised to find sequential Monte Carlo mentioned as well, but it later appeared this was Monte Carlo methods for sequential problems, in the spirit of Abraham Wald. While the now forgotten EZH method is mentioned as a promising new method (p.11), the survey also contains an introduction to the conditional Monte Carlo method of Trotter and Tukey (1956) [from whom the above and rather puzzling quote is taken] that could relate to the averaging techniques of Kong, McCullagh, Meng, Nicolae and Tan as found in their 2003 Read Paper…
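To give a flavour of the variance reduction theme, here is a minimal Python sketch (my own illustration, not taken from the survey) of antithetic variables, one of the techniques Halton lists:

```python
import math
import random
import statistics

def mc_plain(f, n, rng):
    # plain Monte Carlo estimate of E[f(U)], U ~ Uniform(0,1)
    return statistics.fmean(f(rng.random()) for _ in range(n))

def mc_antithetic(f, n, rng):
    # antithetic variables: average f over the pair (U, 1-U);
    # for monotone f the two evaluations are negatively correlated,
    # so the paired estimator has a smaller variance at the same cost
    return statistics.fmean(
        0.5 * (f(u) + f(1.0 - u))
        for u in (rng.random() for _ in range(n // 2))
    )
```

For instance, estimating E[e^U] = e − 1 with both estimators and the same number of function evaluations shows the antithetic version concentrating much more tightly around the true value.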

“The search for randomness is evidently futile” (Halton)

A large part of the review is taken by the study of uniform random generators and by the distinction between random, pseudo-random and quasi-random versions. Halton insists very much on the lack of justification in using non-random generators, even though they work well. He even goes so far as to warn about bias, because even the truly random generators are discrete. The paper covers the pseudo-random generators, starting with the original version of von Neumann, Metropolis, and Ulam, continuing with Lehmer’s well-known congruential generator, and the Fibonacci generalisation. For testing those generators by statistical tests (again with little theoretical ground), Marsaglia is mentioned. The paper also covers in great detail the quasi-random sequences, including low-discrepancy requirements, and van der Corput’s, Halton’s and Hammersley’s sequences. Halton considers quasi-Monte Carlo as “a branch of numerical analysis”.
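As an illustration (mine, not Halton's), here are minimal Python versions of two of the generators discussed: Lehmer's multiplicative congruential generator and van der Corput's low-discrepancy sequence:

```python
def lehmer(seed, m=2**31 - 1, a=48271, n=5):
    # Lehmer's multiplicative congruential generator: x_{k+1} = a*x_k mod m
    # (the modulus and multiplier here are the classic Park-Miller choices)
    x, out = seed, []
    for _ in range(n):
        x = (a * x) % m
        out.append(x / m)  # rescale to (0,1)
    return out

def van_der_corput(k, base=2):
    # k-th term of the van der Corput sequence: reflect the base-b
    # digits of k about the radix point, e.g. 3 = 11_2 -> 0.11_2 = 0.75
    q, bk = 0.0, 1.0 / base
    while k > 0:
        k, r = divmod(k, base)
        q += r * bk
        bk /= base
    return q
```

The first four van der Corput terms in base 2 are 0.5, 0.25, 0.75, 0.125, filling the unit interval far more evenly than a random draw of the same size, which is exactly the low-discrepancy property.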

The paper concludes with a list of 24 future developments I will cover in another post tomorrow…

## MCMC with errors

Posted in R, Statistics, University life on March 25, 2011 by xi'an

I received this email last week from Ian Langmore, a postdoc in Columbia:

I’m looking for literature on a subject and can’t find it: I have a Metropolis sampler where the acceptance probability is evaluated with some error. This error is not simply error in evaluation of the target density. It occurs due to the method by which we approximate the acceptance probability.

This is a sensible question, albeit a wee vague… The closest item of work I can think of is the recent paper by Christophe Andrieu and Gareth Roberts, in the Annals of Statistics (2009), following an original proposal by Marc Beaumont. I think there is an early 1990s paper by Gareth and Jeff Rosenthal where they consider the impact of some approximation effect, like real-number representation, on the convergence, but I cannot find it. Of course, the recent particle MCMC JRSS B discussion paper by Christophe, Arnaud Doucet and Roman Holenstein is a way to bypass the problem. (In a sense, ABC is a rudimentary answer as well.) And there must be many other papers on this topic I am not aware of…
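For reference, here is a minimal Python sketch of the pseudo-marginal construction of Beaumont and of Andrieu and Roberts: as long as the acceptance ratio is built from a nonnegative unbiased estimate of the target density, with the estimate at the current state recycled rather than refreshed, the chain still targets the exact distribution. (The helper `noisy_log_gauss` and all tuning values below are mine, for illustration only.)

```python
import math
import random

def pseudo_marginal_mh(log_target_hat, x0, n_iter, step, rng):
    # Pseudo-marginal Metropolis-Hastings: log_target_hat(x, rng) returns
    # the log of a *nonnegative unbiased* estimate of the target density.
    # The estimate at the current state is kept, not recomputed.
    x, l_x = x0, log_target_hat(x0, rng)
    out = []
    for _ in range(n_iter):
        y = x + step * rng.gauss(0.0, 1.0)      # symmetric random-walk proposal
        l_y = log_target_hat(y, rng)            # fresh estimate at the proposal
        if math.log(rng.random()) < l_y - l_x:  # MH ratio of the two estimates
            x, l_x = y, l_y                     # accept, keeping y's estimate
        out.append(x)
    return out

def noisy_log_gauss(x, rng, sigma=0.5):
    # unnormalised N(0,1) log-density corrupted by lognormal noise:
    # exp(eps - sigma^2/2) has mean one, so the density estimate is unbiased
    return -0.5 * x * x + rng.gauss(0.0, sigma) - 0.5 * sigma**2
```

Despite the noise in every evaluation, the empirical mean and variance of the chain recover those of the exact N(0,1) target, which is the whole point of the pseudo-marginal argument.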

## Vanilla on-line

Posted in Statistics, University life on February 18, 2011 by xi'an

The Vanilla Rao–Blackwellization of Metropolis–Hastings algorithms paper with Randal Douc is now published in the Annals of Statistics (Volume 39, Number 1 (2011), pages 261-277) and available on-line via Project Euclid. We are currently working with Pierre Jacob on an extension of this idea towards parallelisation.

## Poster at MCMSki III

Posted in R, Statistics, Travel on December 28, 2010 by xi'an

Here is the poster presented at MCMSki III next week by Pierre Jacob about our joint paper on parallelisation.

## Robust adaptive Metropolis algorithm [arXiv:1011.4381]

Posted in R, Statistics on November 23, 2010 by xi'an

Matti Vihola has posted a new paper on arXiv about adaptive (random walk) Metropolis-Hastings algorithms. The update of the (lower triangular) scale matrix is

$S_nS_n^\text{T}=S_{n-1}\left(\mathbf{I}_d+\eta_n[\alpha_n-\alpha^\star]\dfrac{U_nU_n^\text{T}}{||U_n||^2}\right)S^\text{T}_{n-1}$

where

• $\alpha_n$ is the current acceptance probability and $\alpha^\star$ the target acceptance rate;
• $U_n$ is the current random noise for the proposal, $Y_n=X_{n-1}+S_{n-1}U_n$;
• $\eta_n$ is a step size sequence decaying to zero.

The spirit of the adaptation is therefore a Robbins-Monro type adaptation of the covariance matrix in the random walk, with a target acceptance rate. It follows the lines Christophe Andrieu and I had drafted in our [most famous!] unpublished paper, Controlled MCMC for optimal sampling. The current paper shows that the fixed point for $S_n$ is proportional to the scale of the target if the latter is elliptically symmetric (but does not establish a sufficient condition for convergence). It concludes with a law of large numbers for the empirical average of the $f(X_n)$’s, under rather strong assumptions on f, the target, and the matrices $S_n$. The simulations, run on standard test examples, show a clear improvement over existing adaptive algorithms, and the method is implemented within Matti Vihola’s Grapham software. I presume Matti will present this latest work during his invited talk at Adap’skiii.
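For illustration, here is a naive Python sketch of mine of one adaptation step: it refactorises the updated matrix from scratch, whereas in practice one would use a rank-one Cholesky update, and the sign convention $\eta_n(\alpha_n-\alpha^\star)$ inflates the scale when the acceptance rate exceeds the target.

```python
import math

def cholesky(M):
    # naive Cholesky factorisation of a symmetric positive definite matrix
    d = len(M)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(M[i][i] - s) if i == j else (M[i][j] - s) / L[j][j]
    return L

def ram_update(S, U, alpha, alpha_star, eta):
    # one step of the robust adaptive Metropolis scale update:
    # refactorise  S (I + eta*(alpha - alpha*) U U^T / ||U||^2) S^T
    # into a new lower triangular factor (naive full refactorisation)
    d = len(S)
    c = eta * (alpha - alpha_star) / sum(u * u for u in U)
    A = [[(1.0 if i == j else 0.0) + c * U[i] * U[j] for j in range(d)]
         for i in range(d)]
    SA = [[sum(S[i][k] * A[k][j] for k in range(d)) for j in range(d)]
          for i in range(d)]
    M = [[sum(SA[i][k] * S[j][k] for k in range(d)) for j in range(d)]
         for i in range(d)]
    return cholesky(M)
```

In one dimension the update reduces to multiplying the scale by $\sqrt{1+\eta_n(\alpha_n-\alpha^\star)}$, which makes the Robbins-Monro flavour of the scheme transparent.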

P.S. It took me at least 15 minutes to spot the error in the above LaTeX formula, which ended up with S^\text{T}_{n−1}: copy-pasting from the pdf file had produced an unconventional minus sign in n−1 that was almost impossible to spot!