Archive for John Halton

Questions on ABC

Posted in Statistics, University life on May 31, 2011 by xi'an

Our ABC survey for Statistics and Computing (and the ABC special issue!) has been quickly revised, resubmitted, and rearXived. Here is our conclusion about some issues that remain unsolved (much more limited in scope than the program drafted by Halton!):

  1. the convergence results obtained so far are impractical in that they require either the tolerance to go to zero or the sample size to go to infinity. Obtaining exact error bounds for positive tolerances and finite sample sizes would greatly improve both the implementation of the method and the assessment of its worth.
  2. in particular, the choice of the tolerance is so far handled from a very empirical perspective. Recent theoretical assessments show that a balance between Monte Carlo variability and target approximation is necessary, but the right balance must still be found for a practical implementation (see the toy sketch after this list).
  3. even though ABC is often presented as a converging method that approximates Bayesian inference, it can also be perceived as an inference technique per se and hence analysed in its own right. Connections with indirect inference have already been drawn; however, the fine asymptotics of ABC would be most useful to derive, and could moreover indirectly provide indications about the optimal calibration of the algorithm.
  4. in connection with the above, the links between ABC-based inference and other approximate methods like variational Bayes are so far unexplored. Comparing and interbreeding those different methods should become a research focus as well.
  5. the construction and selection of the summary statistics is so far highly empirical. An automated approach based on the principles of data analysis and approximate sufficiency would be much more attractive and convincing, especially in non-standard and complex settings.
  6. the debate about ABC-based model choice is so far inconclusive, in that we cannot guarantee the validity of the approximation while assuming that a “large enough” collection of summary statistics provides an acceptable level of approximation. Evaluating the discrepancy by exploratory methods like the bootstrap would shed a much more satisfactory light on this issue.
  7. the method necessarily faces limitations imposed by large datasets or complex models, in that simulating pseudo-data may itself become an impossible task. Dimension-reducing techniques that simulate the summary statistics directly will soon become necessary.
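
To make the interplay between tolerance and summary statistics concrete, here is a minimal ABC rejection sampler in R for a toy normal-mean model. The prior, the summary statistic, and the tolerance are all illustrative choices of mine, not recommendations from the survey; note that the summary (the sample mean) is here simulated directly from its known sampling distribution, the kind of shortcut evoked in item 7.

```r
# Toy ABC rejection sampler: data y ~ N(theta, 1), prior theta ~ N(0, 10)
# (variance 10), summary statistic s = sample mean. All choices illustrative.
set.seed(42)
n <- 50
y <- rnorm(n, mean = 2, sd = 1)      # observed data, true theta = 2
s_obs <- mean(y)                     # observed summary statistic

abc_rejection <- function(N, eps) {
  theta <- rnorm(N, mean = 0, sd = sqrt(10))        # draws from the prior
  s_sim <- rnorm(N, mean = theta, sd = 1 / sqrt(n)) # sample mean ~ N(theta, 1/n)
  theta[abs(s_sim - s_obs) < eps]                   # keep draws within tolerance
}

post <- abc_rejection(N = 1e5, eps = 0.1)
c(mean = mean(post), sd = sd(post), acceptance = length(post) / 1e5)
```

Shrinking eps tightens the approximation to the true posterior but makes the acceptance rate collapse, which is precisely the balance between Monte Carlo variability and target approximation raised in item 2.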

A survey of [the 60’s] Monte Carlo methods

Posted in Books, R, Statistics, University life on May 17, 2011 by xi'an

“The only good Monte Carlos are the dead Monte Carlos” (Trotter and Tukey, quoted by Halton)

When I presented my [partial] history of MCMC methods in Bristol two months ago, at the Julian Besag memorial, Christophe Andrieu mentioned a 1970 SIAM survey by John Halton on A retrospective and prospective survey of the Monte Carlo method. This is a huge paper (62 pages, 251 references) and it sheds a useful light on the advances made in the 60’s (the paper was written in 1968). From the reference list, it seems John Halton was planning two books on the Monte Carlo method, but a Google search did not turn up anything. I also discovered in this list that there was a 1954 RSS symposium (Read Paper?) on Monte Carlo methods. Note that there were at least two books on Monte Carlo published at the time, Hammersley and Handscomb’s 1964 Monte Carlo Methods and Shreider’s 1966 The Monte Carlo Method. (Hammersley appears as a major figure in this survey.) There is a lot of material in this long review and most of the standard methods are listed: control variates, importance sampling, self-normalised importance sampling, stratified sampling, antithetic variables, and simulation by inversion, rejection, or demarginalisation. Variance reduction is presented as the motivation for these alternative methods. Very little is said about the use of Monte Carlo methods in statistics (“many of [the applications] are primitive and artless“). I was at first very surprised to find sequential Monte Carlo mentioned as well, but it later appeared this meant Monte Carlo methods for sequential problems, in the spirit of Abraham Wald. While the now forgotten EZH method is mentioned as a promising new method (p.11), the survey also contains an introduction to the conditional Monte Carlo method of Trotter and Tukey (1956) [from whom the above and rather puzzling quote is taken] that could relate to the averaging techniques of Kong, McCullagh, Meng, Nicolae and Tan as found in their 2003 Read Paper…
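
Since variance reduction is the unifying theme of the survey, a small illustration may help. The following R snippet (mine, not Halton's) applies one of the listed techniques, antithetic variables, to the toy integral of exp(x) over (0,1), whose exact value is e - 1.

```r
# Antithetic variables on the integral of exp(x) over (0, 1), exact value e - 1.
# Pairing U with 1 - U induces negative correlation between exp(U) and
# exp(1 - U), which reduces the variance of the averaged estimator.
set.seed(1)
N <- 1e5
u <- runif(N)

plain <- exp(runif(2 * N))          # plain Monte Carlo, 2N evaluations
anti  <- (exp(u) + exp(1 - u)) / 2  # N antithetic pairs, same 2N budget

c(exact = exp(1) - 1, plain = mean(plain), anti = mean(anti))
c(var_plain = var(plain) / (2 * N), # variance of the plain estimator
  var_anti  = var(anti) / N)        # variance of the antithetic estimator
```

At an identical budget of 2N function evaluations, the antithetic estimator ends up with a variance well over an order of magnitude below the plain one, thanks to the negative correlation within each pair.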

“The search for randomness is evidently futile” (Halton)

A large part of the review is devoted to the study of uniform random generators and to the distinction between random, pseudo-random, and quasi-random versions. Halton insists very much on the lack of justification for using non-random generators, even though they work well. He even goes as far as to warn about biases, because even the truly random generators are discrete. The paper covers pseudo-random generators, starting with the original version of von Neumann, Metropolis, and Ulam, continuing with Lehmer’s well-known congruential generator, and the Fibonacci generalisation. As to testing those generators by statistical means (again with little theoretical grounding), Marsaglia is mentioned. The paper also treats quasi-random sequences in great detail, covering low-discrepancy requirements and van der Corput’s, Halton’s, and Hammersley’s sequences. Halton considers quasi-Monte Carlo as “a branch of numerical analysis”.
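
To see the pseudo-random versus quasi-random distinction in action, here is a short R sketch of both: a Lehmer-style multiplicative congruential generator, with the classic 7^5 and 2^31 - 1 constants (which slightly postdate the survey and are used here purely for illustration), and the van der Corput sequence, the one-dimensional building block of Halton's own sequence.

```r
# Lehmer multiplicative congruential generator: x[i+1] = a * x[i] mod m.
# a = 7^5 and m = 2^31 - 1 are illustrative "minimal standard" constants.
lehmer <- function(n, seed = 1, a = 16807, m = 2^31 - 1) {
  x <- numeric(n)
  state <- seed
  for (i in 1:n) {
    state <- (a * state) %% m  # exact in double precision, a * m < 2^53
    x[i] <- state / m          # map the integer state to (0, 1)
  }
  x
}

# van der Corput sequence in base b: reverse the base-b digits of k
# about the radix point. Halton's multidimensional sequence uses one
# van der Corput sequence per coordinate, with pairwise coprime bases.
van_der_corput <- function(n, b = 2) {
  sapply(1:n, function(k) {
    v <- 0
    f <- 1 / b
    while (k > 0) {
      v <- v + f * (k %% b)    # place the next base-b digit
      k <- k %/% b
      f <- f / b
    }
    v
  })
}

halton2d <- cbind(van_der_corput(100, 2), van_der_corput(100, 3))
```

Plotting halton2d against 100 pairs of lehmer draws shows the quasi-random points covering the unit square much more evenly, which is exactly the low-discrepancy property at the core of the survey's treatment.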

The paper concludes with a list of 24 future developments I will cover in another post tomorrow…