## proper likelihoods for Bayesian analysis

Posted in Books, Statistics, University life on April 11, 2013 by xi'an

While in Montpellier yesterday (where I also had the opportunity of tasting an excellent local wine!), I had a look at the 1992 Biometrika paper by Monahan and Boos on “Proper likelihoods for Bayesian analysis”. This is a paper I missed and that was pointed out to me during the discussions in Padova. The main point of this short paper is to decide when a method based on an approximate likelihood function is truly (or properly) Bayes. Just the very question a bystander would ask of ABC methods, wouldn’t it?! The validation proposed by Monahan and Boos is one of calibration of credible sets, just as in the recent arXiv paper of Dennis Prangle, Michael Blum, G. Popovic and Scott Sisson I reviewed three months ago. The idea is indeed to check by simulation that the true posterior coverage of an α-level set equals the nominal coverage α. In other words, the predictive based on the likelihood approximation should be uniformly distributed and this leads to a goodness-of-fit test based on simulations. As in our ABC model choice paper, Proper likelihoods for Bayesian analysis notes that Bayesian inference based on an insufficient statistic is proper and valid, simply less accurate than the Bayesian inference based on the whole dataset. The paper also states a conjecture:

A [approximate] likelihood L is a coverage proper Bayesian likelihood if and only if L has the form L(y|θ) = c(s) g(s|θ) where s=S(y) is a statistic with density g(s|θ) and c(s) some function depending on s alone.

A conjecture that sounds incorrect, in that noisy ABC is also well-calibrated. (I am not 100% sure of this argument, though.) An interesting section covers the case of pivotal densities as substitute likelihoods and the confusion created by the double meaning of the parameter θ. The last section is also connected with ABC in that Monahan and Boos reflect on the use of large sample approximations, like normal distributions for estimates of θ, which are a special kind of statistic, but do not report formal results on the asymptotic validation of such approximations. All in all, a fairly interesting paper!

Reading this highly interesting paper also made me realise that the criticism I had made in my review of Prangle et al. about the difficulty for this calibration method to address the issue of summary statistics was incorrect: when using the true likelihood function, the use of an arbitrary summary statistic is validated by this method and is thus proper.
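A minimal sketch of this calibration check, in Python, may help fix ideas. The toy model is an assumption of this illustration and is not taken from Monahan and Boos: θ ~ N(0,1) a priori, and the inference only uses an insufficient statistic s, the mean of the first m observations, so that s|θ ~ N(θ, 1/m). The function names (`coverage_pvalues`, `ks_distance`) and the overconfident variant (pretending the variance of s is 1/(4m)) are hypothetical choices made for the example. The posterior cdf evaluated at the simulated θ should look uniform when the likelihood is the true density of s, and should not when the variance is misstated.

```python
import random
from statistics import NormalDist


def coverage_pvalues(n_rep, m, assumed_m, seed=0):
    """Posterior cdf at the true theta, over n_rep prior-predictive draws.

    theta ~ N(0, 1) and s | theta ~ N(theta, 1/m) are simulated, while the
    (possibly approximate) likelihood pretends s | theta ~ N(theta, 1/assumed_m).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_rep):
        theta = rng.gauss(0.0, 1.0)
        s = rng.gauss(theta, (1.0 / m) ** 0.5)
        # Conjugate normal posterior under the assumed likelihood.
        post_mean = assumed_m * s / (assumed_m + 1.0)
        post_sd = (1.0 / (assumed_m + 1.0)) ** 0.5
        values.append(NormalDist(post_mean, post_sd).cdf(theta))
    return values


def ks_distance(u):
    """Kolmogorov-Smirnov distance between the sample u and Uniform(0, 1)."""
    u = sorted(u)
    n = len(u)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(u))


# True density of the insufficient statistic: calibrated.
proper = coverage_pvalues(5000, m=5, assumed_m=5)
# Overconfident approximate likelihood: miscalibrated.
improper = coverage_pvalues(5000, m=5, assumed_m=20)
print("KS distance, proper likelihood:  ", ks_distance(proper))
print("KS distance, improper likelihood:", ks_distance(improper))
```

The first distance should be of the order of the usual Kolmogorov-Smirnov fluctuations, while the second one should be markedly larger, since the credible sets are then too narrow.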

Posted in Statistics, University life on March 25, 2013 by xi'an

Here are the slides of my talk in Padova for the workshop Recent Advances in statistical inference: theory and case studies (very similar to the slides for the Varanasi and Gainesville meetings, obviously!, with Peter Müller commenting [at last!] that I had picked the wrong photos from Khajuraho!)

and Francesco Pauli, from Trieste, whose slides are:

These were kind and rich discussions with many interesting openings: Stefano’s idea of estimating the pivotal function h is opening new directions, obviously, as it indicates an additional degree of freedom in calibrating the method. Especially when considering the high variability of the empirical likelihood fit depending on the function h. For instance, one could start with a large collection of candidate functions and build a regression or a principal component reparameterisation from this collection… (Actually I did not get point #1 about ignoring f: the empirical likelihood is by essence ignoring anything outside the identifying equation, so as long as the equation is valid…) Point #2: Opposing sample free and simulation free techniques is another interesting avenue, although I would not say ABC is “sample free”. As to point #3, I will certainly have a look at Monahan and Boos (1992) to see if this can drive the choice of a specific type of pseudo-likelihood. I like the idea of checking the “coverage of posterior sets” and even more “the likelihood must be the density of a statistic, not necessarily sufficient” as it obviously relates to our current ABC model comparison work… Especially when the very same paper is mentioned by Francesco as well. Grazie, Stefano! I also appreciate the survey made by Francesco on the consistency conditions, because I think this is an important issue that should be taken into consideration when designing ABC algorithms. (Just pointing out again that, in the theorem of Fearnhead and Prangle (2012) quoting Bernardo and Smith (1992), some conditions are missing for the mathematical consistency to apply.) I also like the agreement we seem to reach about ABC being evaluated per se rather than as a poor man’s Bayesian method. Francesco’s analysis of Monahan and Boos (1992) as validating or not empirical likelihood points out a possible link with the recent coverage analysis of Prangle et al., discussed on the ‘Og a few weeks ago.
And an unsuspected link with Larry Wasserman! Grazie, Francesco!

## Biometrika, volume 100

Posted in Books, Statistics, University life on March 5, 2013 by xi'an

I had been privileged to have a look at a preliminary version of the now-published retrospective written by Mike Titterington on the first 100 issues of Biometrika (more exactly, “from volume 28 onwards”, as the title states). Mike was the dedicated editor of Biometrika for many years and edited a nice book for the 100th anniversary of the journal. He started from the 100 most highly cited papers within the journal to build a coherent chronological coverage. From a Bayesian perspective, this retrospective starts with Maurice Kendall trying to reconcile frequentists and non-frequentists in 1949, while having a hard time with fiducial statistics. Then Dennis Lindley makes it to the top 100 in 1957 with the Lindley-Jeffreys paradox. From 1958 till 1961, Darroch is quoted several times for his (fine) formalisation of the capture-recapture experiments we were to study much later (Biometrika, 1992) with Ed George… In the 1960’s, Bayesian papers became more visible, including Don Fraser (1961) and Arthur Dempster’s Dempster-Shafer theory of evidence, as well as George Box and co-authors (1965, 1968) and Arnold Zellner (1964). Keith Hastings’ 1970 paper stands as the fifth most highly cited paper, even though it was ignored for almost two decades. The number of Bayesian papers kept increasing, including Binder’s (1978) cluster estimation, Efron and Morris’ (1972) James-Stein estimators, and Efron and Thisted’s (1978) terrific evaluation of Shakespeare’s vocabulary. From then, the number of Bayesian papers gets too large to cover in its entirety. The 1980’s saw papers by Julian Besag (1977, 1989, 1989 with Peter Clifford, which was yet another precursor of MCMC) and Luke Tierney’s work (1989) on Laplace approximation. Carter and Kohn’s (1994) MCMC algorithm on state space models made it to the top 40, while Peter Green’s (1995) reversible jump algorithm came close to Hastings’ (1970) record, being the 8th most highly cited paper.
Since the more recent papers do not make it to the top 100 list, Mike Titterington’s coverage gets more exhaustive as the years draw near, with an almost complete coverage for the final years. Overall, a fascinating journey through the years and through the reasons why Biometrika is such a great journal, and consistently so.

Posted in Books, Statistics, University life on December 14, 2012 by xi'an

This week, my student Dona Skanji gave a presentation of the paper of Hastings “Monte Carlo sampling methods using Markov chains and their applications”, which set the rules for running MCMC algorithms, much more so than the original paper by Metropolis et al. which presented an optimisation device, even though the latter clearly stated the Markovian principle of those algorithms and their use for integration. (This is definitely a classic, selected in the book Biometrika: One hundred years, by Mike Titterington and David Cox.) Here are her slides (the best Beamer slides so far!):

Given that I had already taught my lectures on Markov chains and on MCMC algorithms, the preliminary part of Dona’s talk was easier to compose and understanding the principles of the method was certainly more straightforward than for the other papers in the series. I think she nonetheless did a rather good job in summing up the paper, running this extra simulation for the Poisson distribution (with the interesting “mistake” of including the burn-in time in the representation of the output and concluding about a poor convergence) and mentioning the Gibbs extension. I led the discussion of the seminar towards irreducibility conditions and Peskun’s ordering of Markov chains, which maybe could have been mentioned by Dona since she was aware Peskun was Hastings’ student.
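To illustrate the burn-in issue, here is a minimal sketch of a Metropolis algorithm targeting a Poisson distribution. It is not a reproduction of Dona’s simulation: the ±1 random walk proposal, the function name `poisson_metropolis`, the starting value far in the tail and the burn-in length are all assumptions made for this illustration.

```python
import math
import random


def poisson_metropolis(lam, n_iter, x0=0, seed=0):
    """Random walk Metropolis chain targeting a Poisson(lam) distribution."""
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_iter):
        # Symmetric proposal: one step down or up with probability 1/2 each.
        y = x + rng.choice((-1, 1))
        if y >= 0:  # proposals outside the support are always rejected
            # log pi(y) - log pi(x) for pi(k) proportional to lam**k / k!
            log_ratio = ((y - x) * math.log(lam)
                         - math.lgamma(y + 1) + math.lgamma(x + 1))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = y
        chain.append(x)
    return chain


lam = 4.0
chain = poisson_metropolis(lam, n_iter=50_000, x0=50)  # start far in the tail
burn_in = 1_000                                        # assumed burn-in length
kept = chain[burn_in:]
print("mean of the first 200 iterations:", sum(chain[:200]) / 200)
print("mean after burn-in:", sum(kept) / len(kept))
```

Plotting the whole chain, including the initial descent from the starting value, is what can give the wrong impression of a poor convergence, whereas the part kept after burn-in should average close to λ.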

## Handbook of Markov chain Monte Carlo

Posted in Books, R, Statistics, University life on September 22, 2011 by xi'an

At JSM, John Kimmel gave me a copy of the Handbook of Markov chain Monte Carlo, as I had not (yet?!) received it. This handbook is edited by Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng, all first-class jedis of the MCMC galaxy. I had not had a chance to get a look at the book until now as Jean-Michel Marin took it home for me from Miami, but, as he remarked in giving it back to me last week, the outcome truly is excellent! Of course, authors and editors being friends of mine, the reader may worry about the objectivity of this assessment; however the quality of the contents is clearly there and the book appears as a worthy successor to the tremendous Markov chain Monte Carlo in Practice by Wally Gilks, Sylvia Richardson and David Spiegelhalter. (I can attest to the involvement of the editors from the many rounds of reviews we exchanged about our MCMC history chapter!) The style of the chapters is rather homogeneous and there are a few pieces of R code here and there. So, while I will still stick to our Monte Carlo Statistical Methods book for teaching MCMC to my graduate students next month, I think the book can well be used at a teaching level as well as a reference on the state-of-the-art MCMC technology.

## Don Fraser’s rejoinder

Posted in Books, Statistics, University life on August 24, 2011 by xi'an

“How can a discipline, central to science and to critical thinking, have two methodologies, two logics, two approaches that frequently give substantially different answers to the same problems. Any astute person from outside would say “Why don’t they put their house in order?”” Don Fraser

Following the discussions of his Statistical Science paper Is Bayes posterior just quick and dirty confidence?, by Kesar Singh and Minge Xie, Larry Wasserman (who coined the neologism Frasian for the occasion), Tong Zhang, and myself, Don Fraser has written his rejoinder to the discussion (although in Biometrika style it is for Statistical Science!). His conclusion that “no one argued that the use of the conditional probability lemma with an imaginary input had powers beyond confidence, supernatural powers” is difficult to escape, as I would not dream of promoting a super-Bayes jumping to the rescue of bystanders misled by evil frequentists!!! More seriously, this rejoinder makes me reflect on lectures from the past years, from those on the diverse notions of probability (Jeffreys, Keynes, von Mises, and Burdzy) to those on scientific discovery (mostly Seber’s, and the promising Error and Inference by Mayo and Spanos I just received).

## Colloquium for Mike Titterington

Posted in Statistics, Travel, University life on June 3, 2011 by xi'an

The colloquium held today at Glasgow University in honour of Mike Titterington for his retiral was highly enjoyable! First, it was a pleasure to celebrate Mike’s achievements at this (early) stage of his career, along with people from Glasgow but also from all over the UK and even from Australia, among whom a lot of friends. Second, the (other) talks were highly interesting, with Peter Hall talking about the asymptotics of records, Byron Morgan about identifiability in capture-recapture models, Peter Green presenting a graphical diagnostic for spotting divergence between prior and likelihood in multivariate models, and Adrian Bowman illustrating advanced face analysis using principal curves on lips and faces. Third, I got a fair amount of questions and comments about ABC in general and ABC model choice in particular, including David Cox commenting that ABC was an important new topic and suggesting using goodness-of-fit tools for model comparison. The symposium per se ended up with a specially designed cake covering (in sugar!) some of Mike’s academic endeavours during the past years. While a formal affair for which I had to run to get a shirt, the dinner was equally enjoyable, including a simultaneously witty and deep after-dinner talk paying tribute to Mike’s contributions by David Cox (who was Mike’s predecessor as editor of Biometrika) and a funny conclusion by John McColl who dug out a 1976 probability assignment he had from Mike that was the Monty Hall problem.

The next celebration of that kind I am taking part in is Hans Künsch’s 60th birthday in Zürich next October. Looking forward to it!