Measuring statistical evidence using relative belief [book review]

“It is necessary to be vigilant to ensure that attempts to be mathematically general do not lead us to introduce absurdities into discussions of inference.” (p.8)

This new book by Michael Evans (Toronto) summarises his views on statistical evidence (expanded in a large number of papers), which are a quite unique mix of Bayesian principles and less-Bayesian methodologies. I am quite glad I could receive a version of the book before it was published by CRC Press, thanks to Rob Carver (and Keith O’Rourke for warning me about it). [Warning: this is a rather long review and post, so readers may choose to opt out now!]

“The Bayes factor does not behave appropriately as a measure of belief, but it does behave appropriately as a measure of evidence.” (p.87)

full Bayesian significance test

Among the many comments (thanks!) I received when posting our Testing via mixture estimation paper came the suggestion to relate this approach to the notion of full Bayesian significance test (FBST) developed by (Julio, not Hal) Stern and Pereira, from São Paulo, Brazil. I thus had a look at this alternative and read the Bayesian Analysis paper they published in 2008, as well as a paper recently published in Logic Journal of IGPL. (I could not find what the IGPL stands for.) The central notion in these papers is the e-value, which provides the posterior probability that the posterior density is larger than the largest posterior density over the null set. This definition bothers me, first because the null set has a measure equal to zero under an absolutely continuous prior (BA, p.82). Hence the posterior density is defined in an arbitrary manner over the null set and the maximum is itself arbitrary. (An issue that invalidates my 1993 version of the Lindley-Jeffreys paradox!) And second because it considers the posterior probability of an event that does not exist a priori, being conditional on the data. This sounds in fact quite similar to Statistical Inference, Murray Aitkin’s (2009) book using a posterior distribution of the likelihood function. With the same drawback of using the data twice. And the other issues discussed in our commentary on the book. (As a side-much-on-the-side remark, the authors incidentally forgot me when citing our 1992 Annals of Statistics paper about decision theory on accuracy estimators..!)
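To make the definition concrete, here is a minimal sketch of the probability described above, assuming a normal posterior on a scalar θ and a point null θ = θ0. The function names and the numerical values are purely illustrative, not taken from the FBST papers. In this toy case the event "posterior density larger than the density at θ0" reduces to |θ − m| < |θ0 − m|, so a Monte Carlo estimate can be checked against a closed form.

```python
import math
import random


def normal_pdf(t, mean, sd):
    """Density of a N(mean, sd^2) distribution at t."""
    return math.exp(-0.5 * ((t - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def tangential_probability(post_mean, post_sd, theta0, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(posterior density > density at theta0 | x).

    Assumption: the posterior is N(post_mean, post_sd^2) and the null set is
    the single point theta0, so the "largest posterior density over the null
    set" is just the density at theta0.
    """
    rng = random.Random(seed)
    threshold = normal_pdf(theta0, post_mean, post_sd)
    hits = 0
    for _ in range(n_draws):
        theta = rng.gauss(post_mean, post_sd)
        if normal_pdf(theta, post_mean, post_sd) > threshold:
            hits += 1
    return hits / n_draws


def tangential_probability_exact(post_mean, post_sd, theta0):
    """Closed form for the normal case: P(|theta - m| < |theta0 - m|)."""
    z = abs(theta0 - post_mean) / post_sd
    return math.erf(z / math.sqrt(2))


if __name__ == "__main__":
    print(tangential_probability(0.2, 0.5, 0.0))
    print(tangential_probability_exact(0.2, 0.5, 0.0))
```

The threshold is the value of one particular version of the posterior density at a single point, which is exactly where the arbitrariness mentioned above enters.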

posterior likelihood ratio is back

“The PLR turns out to be a natural Bayesian measure of evidence of the studied hypotheses.”

Isabelle Smith and André Ferrari just arXived a paper on the posterior distribution of the likelihood ratio. This is in line with Murray Aitkin’s notion of considering the likelihood ratio

f(x|\theta_0) / f(x|\theta)

as a prior quantity, when contemplating the null hypothesis that θ is equal to θ0. (Also advanced by Alan Birnbaum and Arthur Dempster.) A concept we criticised (rather strongly) in our Statistics and Risk Modeling paper with Andrew Gelman and Judith Rousseau. The arguments found in the current paper in defence of the posterior likelihood ratio are quite similar to Aitkin’s:

  • defined for (some) improper priors;
  • invariant under observation or parameter transforms;
  • more informative than the posterior mean of the posterior likelihood ratio, not-so-incidentally equal to the Bayes factor;
  • avoiding using the posterior mean for an asymmetric posterior distribution;
  • achieving some degree of reconciliation between Bayesian and frequentist perspectives, e.g. by being equal to some p-values;
  • easily computed by MCMC means (if need be).
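The third item can be checked numerically. Below is a minimal sketch, assuming a toy model that is not in the paper: a single observation x ~ N(θ, 1) with a conjugate prior θ ~ N(0, τ²) and a point null θ0. It simulates the posterior distribution of f(x|θ0)/f(x|θ) and compares its mean with the Bayes factor f(x|θ0)/m(x). All names and values are hypothetical.

```python
import math
import random


def normal_pdf(t, mean, sd):
    """Density of a N(mean, sd^2) distribution at t."""
    return math.exp(-0.5 * ((t - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def likelihood_ratio(x, theta0, theta):
    """f(x|theta0) / f(x|theta) for the N(theta, 1) sampling model."""
    return normal_pdf(x, theta0, 1.0) / normal_pdf(x, theta, 1.0)


def posterior_lr_draws(x, theta0, tau, n_draws=200_000, seed=0):
    """Draws from the posterior distribution of the likelihood ratio.

    Assumption: prior theta ~ N(0, tau^2), hence the posterior is
    N(shrink * x, shrink) with shrink = tau^2 / (1 + tau^2).
    Taking tau < 1 keeps the Monte Carlo variance of the ratio finite.
    """
    rng = random.Random(seed)
    shrink = tau ** 2 / (1 + tau ** 2)
    post_mean, post_sd = shrink * x, math.sqrt(shrink)
    return [
        likelihood_ratio(x, theta0, rng.gauss(post_mean, post_sd))
        for _ in range(n_draws)
    ]


def bayes_factor(x, theta0, tau):
    """f(x|theta0) / m(x), with marginal m(x) = N(x; 0, 1 + tau^2)."""
    return normal_pdf(x, theta0, 1.0) / normal_pdf(x, 0.0, math.sqrt(1 + tau ** 2))


if __name__ == "__main__":
    draws = posterior_lr_draws(x=1.0, theta0=0.0, tau=0.5)
    print(sum(draws) / len(draws), bayes_factor(1.0, 0.0, 0.5))
    # Posterior probability that the ratio falls below 1, one possible summary
    print(sum(d < 1.0 for d in draws) / len(draws))
```

The agreement of the two printed values follows from integrating f(x|θ0)/f(x|θ) against the posterior f(x|θ)π(θ)/m(x), where the likelihood cancels.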

One generalisation found in the paper handles the case of composite versus composite hypotheses, of the form

\int\mathbb{I}\left( p(x|\theta_1)<p(x|\theta_0)\right)\pi(\text{d}\theta_1|x)\pi(\text{d}\theta_0|x)

which brings back an earlier criticism I raised (in Edinburgh, at ICMS, where as one-of-those-coincidences, I read this paper!), namely that using the product of the marginals rather than the joint posterior is no more a standard Bayesian practice than using the data in a prior quantity. And leads to multiple uses of the data. Hence, having already delivered my perspective on this approach in the past, I do not feel the urge to “raise the flag” once again about a paper that is otherwise well-documented and mathematically rich.
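For readers who want to see what the composite-versus-composite integral above computes, here is a minimal Monte Carlo sketch under assumptions of my own choosing: x ~ N(θ, 1), a flat prior, H0: θ ≤ 0 against H1: θ > 0, so that each posterior is a N(x, 1) truncated to its half-line. The independent draws of θ0 and θ1 reproduce the product of the two posteriors, which is the very feature criticised here. Function names are hypothetical.

```python
import random


def log_lik(x, theta):
    """Log-likelihood of the N(theta, 1) model, up to an additive constant."""
    return -0.5 * (x - theta) ** 2


def draw_truncated(rng, x, positive):
    """Rejection sampler for N(x, 1) restricted to theta > 0 or theta <= 0.

    Assumption: flat prior on each hypothesis. Slow when x sits far inside
    the opposite half-line, since few proposals are then accepted.
    """
    while True:
        theta = rng.gauss(x, 1.0)
        if (theta > 0) == positive:
            return theta


def composite_plr_probability(x, n_pairs=20_000, seed=0):
    """Estimate of the integral of I(p(x|theta1) < p(x|theta0)) under the
    product of the two posteriors, using independent draws of the pair."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_pairs):
        theta0 = draw_truncated(rng, x, positive=False)
        theta1 = draw_truncated(rng, x, positive=True)
        if log_lik(x, theta1) < log_lik(x, theta0):
            count += 1
    return count / n_pairs


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0):
        print(x, composite_plr_probability(x))
```

At x = 0 the two truncated posteriors mirror each other and the estimate should sit near one half, which is a convenient sanity check on the sampler.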


re-read paper

Today, I attended the RSS Annual Conference in Newcastle-upon-Tyne. For one thing, I ran a Memorial session in memory of George Casella, with my (and his) friends Jim Hobert and Elias Moreno as speakers. (The session was well-attended if not overwhelmingly so.) For another thing, the RSS decided to have the DIC Read Paper by David Spiegelhalter, Nicky Best, Brad Carlin and Angelika van der Linde Bayesian measures of model complexity and fit re-Read, and I was asked to re-discuss the 2002 paper. Here are the slides of my discussion, borrowing from the 2006 Bayesian Analysis paper with Gilles Celeux, Florence Forbes, and Mike Titterington where we examined eight different versions of DIC for mixture models. (I refrained from using the title “snow white and the seven DICs” for a slide…) I also borrowed from our recent discussion of Murray Aitkin’s (2009) book. The other discussant was Elias Moreno, who focussed on consistency issues. (More on this and David Spiegelhalter’s defence in a few posts!) This was the first time I was giving a talk on a basketball court (I once gave an exam there!)

Do we need…yes we do (with some delay)!

Sometimes, if not that often, I forget about submitted papers to the point of thinking they are already accepted. This happened with the critical analysis of Murray Aitkin’s book Statistical Inference, already debated on the ‘Og, written with Andrew Gelman and Judith Rousseau, and resubmitted to Statistics and Risk Modeling in November…2011. As I had received a few months ago a response to our analysis from Murray, I was under the impression it was published or about to be published. Earlier this week I started looking for the reference in connection with the paper I was completing on the Jeffreys-Lindley paradox and could not find it. Checking emails on that topic I then discovered the latest one was from November 2011 and the editor, when contacted, confirmed the paper was still under review! As it got accepted only a few hours later, my impression is that it had been misfiled and forgotten at some point, an impression reinforced by an earlier experience with the previous avatar of the journal, Statistics & Decisions. In the 1990’s George Casella and I had had a paper submitted to this journal for a while, which eventually got accepted. Then nothing happened for a year and more, until we contacted the editor who acknowledged the paper had been misfiled and forgotten! (This was before the electronic processing of papers, so it is quite plausible that the file corresponding to our accepted paper went under a drawer or into the wrong pile and that the editor was not keeping track of those accepted papers. After all, until Series B turned submission into an all-electronic experience, I was using a text file to keep track of daily submissions…) If you knew George, you can easily imagine his reaction when reading this reply… Anyway, all is well that ends well in that our review and Murray’s reply will appear in Statistics and Risk Modeling, hopefully within a reasonable delay.

inherent difficulties of non-Bayesian likelihood-based inference

Following a series of rejections of our discussion of Murray Aitkin’s book, Statistical Inference, discussion written with Andrew Gelman and Judith Rousseau, by the journals Bayesian Analysis, JASA (Book Reviews), and Electronic Journal of Statistics, we have received an encouraging review from the journal Statistics and Risk Modeling (with Applications in Finance and Insurance), formerly Statistics and Decisions. Since the main request was to broaden our perspective, we revised the paper towards a more global analysis of the issues raised by Murray’s book. For a start, the title got changed from the maybe provocative “Do we need an integrated Bayesian/likelihood inference?” into the slightly archaic “Inherent Difficulties of Non-Bayesian Likelihood-based Inference, as Revealed by an Examination of a Recent Book by Aitkin”. If only to explain why it is broader than a mere book review… For another, the paper also addresses similar criticisms to the deviance information criterion (DIC). Hopefully, this revision will be considered more positively and turn into a discussion paper about this unBayesian use of Bayesian tools…

Do we need…not yet!

Following rejections of our discussion paper of Murray Aitkin’s book, Statistical Inference, written with Andrew Gelman and Judith Rousseau, by the journals Bayesian Analysis [where I think it truly belonged, being more than a book review, an assessment of the relevance of the approach from a Bayesian viewpoint!], JASA Book Reviews, and Electronic Journal of Statistics, we have decided to try yet another outlet for our discussion, Statistics and Decisions, to which I had not submitted a paper in about twenty years (since the loss of an accepted paper with George Casella by the S&D editor at the time!). More fundamentally, I completely understand and acknowledge the individual decision by each editorial board not to publish our piece in their respective journals, but I bemoan (once again) the lack of outlet for this type of opinion tribune that should appeal to the community as a whole (again, because this is a book that aims at a complete shift in or out of the Bayesian theory!) and that should be possible given the current electronic communication tools. In other and more precise words, journals should start blogs or forums where readers could comment on published papers and, why not?!, rejected authors could respond to reviews… This is why I liked the format of the review process in the journal Hydrology and Earth System Sciences, which allows for the publication of referees’ reports and comments from the readership. In any case, I hope Statistics and Decisions will be interested in our piece as we are about to run out of options and stamina! (I usually give up much earlier than that!)