Anoop Korattikara, Yutian Chen and Max Welling recently arXived a paper on the appeal of using only part of the data to speed up MCMC. This is different from the growing literature on unbiased estimators of the likelihood exemplified by Andrieu & Roberts (2009). Here, the approximation to the true target is akin to the approximation in ABC algorithms in that a value of the parameter is accepted if the difference in the likelihoods is larger than a given bound. Expressing this perspective as a test on the mean of the log likelihood leads the authors to use instead a subsample from the whole sample. (The approximation level ε is then a bound on the p-value.) While this idea only applies to iid settings, it is quite interesting and sounds a wee bit like a bootstrapped version of MCMC. Especially since it sounds as if it could provide an auto-evaluation of its error.
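To make the test-on-the-mean idea concrete, here is a minimal sketch of a subsampled accept/reject decision, written under my own assumptions and not taken from the authors' code. The function name `approx_mh_accept`, the mini-batch size, and the use of a normal approximation in place of a Student t p-value are all illustrative choices. The threshold `mu0` collects the uniform draw, prior and proposal terms; for a flat prior and a symmetric proposal it reduces to log(u)/N.

```python
import math
import random


def approx_mh_accept(loglik_diff, n, mu0, eps=0.05, batch=50, rng=random):
    """Decide whether the mean of loglik_diff(i) over i = 0..n-1 exceeds mu0,
    looking at as few data points as possible.

    loglik_diff(i) is assumed to return
        log p(x_i | theta_proposed) - log p(x_i | theta_current).
    Returns (accept, number_of_points_used).
    """
    batch = max(2, batch)            # need at least 2 points for a variance
    order = list(range(n))
    rng.shuffle(order)               # subsample without replacement
    total = 0.0
    total_sq = 0.0
    used = 0
    while True:
        for i in order[used:used + batch]:
            d = loglik_diff(i)
            total += d
            total_sq += d * d
        used = min(used + batch, n)
        mean = total / used
        if used == n:                # whole sample seen: exact decision
            return mean > mu0, used
        # unbiased sample variance of the differences seen so far
        var = max(total_sq / used - mean * mean, 0.0) * used / (used - 1)
        # standard error of the mean, with finite-population correction
        se = math.sqrt(var / used) * math.sqrt(1.0 - (used - 1) / (n - 1))
        if se > 0.0:
            z = abs(mean - mu0) / se
            # normal approximation to the one-sided p-value (assumption:
            # stands in for a Student t tail probability)
            delta = 0.5 * math.erfc(z / math.sqrt(2.0))
        else:
            delta = 0.0 if mean != mu0 else 1.0
        if delta < eps:              # confident enough: stop early
            return mean > mu0, used
```

The point of the sketch is only the stopping rule: the decision is taken as soon as the p-value drops below ε, so clear-cut proposals cost a single mini-batch while borderline ones may consume the whole sample.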
More exciting news about MCMSki IV!
First things first, the 16 contributed sessions are now all set, having gotten the stamp of approval from the scientific committee! Thanks to everyone who submitted a session proposal. (There were so many proposals that we alas had to reject some, as well as every single talk proposal… Sorry people: we hope to hear about your research advances via your posters!) See the MCMSki IV website for the whole list. Apart from the plenary lectures, and the round table on software held on the second evening, there will be three parallel sessions on the remaining three slots for each day of the conference, which means 25 sessions total!
Second, the “call for posters” is open, simply meaning that anyone wishing to present a poster at MCMSki IV on Monday evening (or Tuesday night if we cannot accommodate all posters within a single evening!) is welcome to do so! This will take place in the conference centre as well (with an open bar to keep up with traditions). To this effect, if you intend to present a poster, (a) tick the box in the registration form and (b) …wait for further instructions on the MCMSki IV website about sending your abstract, as we are trying to find an easy way to store and publish posters there. Simple as AB(C)!
Last, the registration page is now open! So feel free to register at your earliest convenience. The deadline for early bird registration is October 15, 2013; however, hotel rooms are likely to vanish much earlier than that, leaving you on your own to find accommodation in Chamonix (not such a terrible task, actually!)
I had been privileged to have a look at a preliminary version of the now-published retrospective written by Mike Titterington on the first 100 issues of Biometrika (more exactly, “from volume 28 onwards”, as the title states). Mike was the dedicated editor of Biometrika for many years and edited a nice book for the 100th anniversary of the journal. He started from the 100 most highly cited papers within the journal to build a coherent chronological coverage. From a Bayesian perspective, this retrospective starts with Maurice Kendall trying to reconcile frequentists and non-frequentists in 1949, while having a hard time with fiducial statistics. Then Dennis Lindley makes it to the top 100 in 1957 with the Lindley-Jeffreys paradox. From 1958 till 1961, Darroch is quoted several times for his (fine) formalisation of the capture-recapture experiments we were to study much later (Biometrika, 1992) with Ed George… In the 1960s, Bayesian papers became more visible, including Don Fraser (1961) and Arthur Dempster’s Dempster-Shafer theory of evidence, as well as George Box and co-authors (1965, 1968) and Arnold Zellner (1964). Keith Hastings’ 1970 paper stands as the fifth most highly cited paper, even though it was ignored for almost two decades. The number of Bayesian papers kept increasing, including Binder’s (1978) cluster estimation, Efron and Morris’ (1972) James-Stein estimators, and Efron and Thisted’s (1978) terrific evaluation of Shakespeare’s vocabulary. From then, the number of Bayesian papers gets too large to cover in its entirety. The 1980s saw papers by Julian Besag (1977, 1989, 1989 with Peter Clifford, which was yet another precursor of MCMC) and Luke Tierney’s work (1989) on Laplace approximation. Carter and Kohn’s (1994) MCMC algorithm on state space models made it to the top 40, while Peter Green’s (1995) reversible jump algorithm came close to Hastings’ (1970) record, being the 8th most highly cited paper.
Since the more recent papers do not make it to the top 100 list, Mike Titterington’s coverage gets more exhaustive as the years draw near, with an almost complete coverage for the final years. Overall, a fascinating journey through the years and the reasons why Biometrika is such a great journal and constantly so.
While a random walk Metropolis-Hastings algorithm cannot be uniformly ergodic in a general setting (Mengersen and Tweedie, AoS, 1996), because it needs more energy to leave far away starting points, it can be geometrically ergodic depending on the target (and the proposal). In a recent Annals of Statistics paper, Leif Johnson and Charlie Geyer designed a trick to turn a random walk Metropolis-Hastings algorithm into a geometrically ergodic random walk Metropolis-Hastings algorithm by virtue of an isotropic transform (under the provision that the original target density has a moment generating function). This theoretical result is complemented by an R package called mcmc. (I have not tested it so far, having read the paper in the métro.) The examples included in the paper are however fairly academic and I wonder how the method performs in practice, on truly complex models, in particular because the change of variables relies on (a) an origin and (b) changing the curvature of space uniformly in all dimensions. Nonetheless, the idea is attractive and reminds me of a project of ours with Randal Douc, started thanks to the ‘Og and still under completion.
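For readers who want to see the mechanics of the change of variables, here is a one-dimensional toy sketch of mine, not the implementation found in the R package. It assumes a Laplace target (which does have a moment generating function) and the odd map h(γ) = sign(γ)(|γ| + |γ|^p) with p = 3, centred at the origin; both choices, as well as the function names, are illustrative assumptions. The random walk runs on γ, targets the induced density π(h(γ))|h′(γ)|, and the draws are mapped back through h.

```python
import math
import random


def h(gamma, p=3):
    """Odd, increasing map from the working variable gamma to the original beta."""
    x = abs(gamma)
    return math.copysign(x + x ** p, gamma)


def log_jacobian(gamma, p=3):
    """log |h'(gamma)| = log(1 + p |gamma|^(p-1))."""
    return math.log1p(p * abs(gamma) ** (p - 1))


def log_target(beta):
    """Unnormalised log density of the original target (Laplace, an assumption)."""
    return -abs(beta)


def transformed_rwm(n_iter, scale=1.0, seed=0):
    """Random walk Metropolis on gamma; returns the draws mapped back to beta."""
    rng = random.Random(seed)
    gamma = 0.0
    logp = log_target(h(gamma)) + log_jacobian(gamma)
    draws = []
    for _ in range(n_iter):
        prop = gamma + rng.gauss(0.0, scale)
        logp_prop = log_target(h(prop)) + log_jacobian(prop)
        # accept with probability min(1, ratio); 1 - random() lies in (0, 1]
        if math.log(1.0 - rng.random()) < logp_prop - logp:
            gamma, logp = prop, logp_prop
        draws.append(h(gamma))
    return draws
```

Even in this toy version both of my reservations show up: the map is centred at a chosen origin, and it bends space identically in every direction away from that origin.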
So, after one week of travelling around India and posting only pictures, a post about the initial reason I came to India! The meeting started on Monday, yesterday, at the Banaras Hindu University, in Varanasi. I was first amazed by the large number of participants, around 350, until I realised there were more than 50 students in the BHU Stat department alone. The opening ceremony was more formal than usual, with many welcoming talks, and even had a religious component with songs and flower necklaces around the bust of the University founder. After this ceremony, Jim Berger gave a general public talk on the dangers of p-values and multiple testing, worth repeating on a regular basis. Then Nozer Singpurwalla presented a foundational lecture aiming at replacing probability by prevalence in reliability, a lecture that would certainly appeal to Krzysztof Burdzy as it mostly dealt with the early works on the formalisation of probability. I had to skip John Geweke’s talk on fast Monte Carlo methods, alas, as I needed to go and buy a down jacket to fight the so-unusual cold wave over Northern India in general and Varanasi in particular, where heating is unheard of… Today, I mostly attended MCMC-related talks, including a presentation by Vivek Roy of the technique he had discussed with me two months ago in Ames. The idea is quite interesting if maybe impractical: the ergodic theorem does not require the stationary measure to be proper for averages to converge (provided the function is integrable). Thus one can run a Markov chain to approximate integrals against an improper measure that is the stationary measure of this chain.
I alas missed most of Adam Johansen’s talk on the Rao-Blackwellisation of Monte Carlo, as I could not but doze, thanks to a sleepless night fighting both the cold and internal disruptions… The day also saw interesting plenary sessions by Tony O’Hagan on computer experiments (with the obligatory barb on Objective Bayes), Jayanta Ghosh on clustering as a non-parametric method (which made me ponder whether a Dirichlet process version of the empirical likelihood approximation was available), and Robert Kohn on upper bounds on the inefficiency of an unbiased estimator of the target distribution.
(No picture of Varanasi today, as my [new] hotel wireless does not like transfers!) Here are the slides of my talk tomorrow, rewriting the Bristol talk with emphasis on empirical likelihood: