## JSM [4]

Posted in Books, pictures, Running, Statistics, Travel, University life on August 3, 2011 by xi'an

A new day at JSM 2011, admittedly not as tense as Monday, but still full. After a long run in the early hours when I took this picture, I started the day with the Controversies in the philosophy of Bayesian statistics session with Jim Berger and Andrew Gelman, Rob Kass and Cosma Shalizi being unable to make it. From my point of view it was a fun session, even though I wish I had been more incisive! But I agreed with most of what Jim said, so… It is too bad we could not cover his last point about Bayesian procedures that are not Bayesianly justified (like posterior predictives), as I was quite interested in the potential discussion on this matter (incl. the position of the room on ABC!). Anyway, I am quite thankful to Andrew for setting up this session. As Jim said, we should have those more often, especially when the attendance is large enough to fill a double room at 8:30am.

Incidentally, I managed to have a glaring typo in my slides, pointed out by Susie Bayarri: Bayes theorem was written as

$\pi(\theta) \propto \pi(\theta) f(x|\theta)$

Aie, aie, aie! Short of better scapegoats, I will blame the AF plane for this… (This was a good way to start a controversy, however no one rose to the bait!) A more serious question reminded me of the debate surrounding A Search for Certainty: it was whether the frequentist and subjective Bayes approaches have more justification than the objective Bayes approach, in the light of von Mises' and personalistic (read, de Finetti's) interpretations of probability.
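For the record, the corrected statement, with the posterior rather than the prior on the left-hand side, reads

$\pi(\theta|x) \propto \pi(\theta) f(x|\theta)$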

While there were many possible alternatives for the next session, I chose to attend Sylvia Richardson's Medallion Lecture. This made sense on many levels, the primary one being that Sylvia and I have worked and are working on rather close topics, from mixtures of distributions, to variable selection, to ABC. So I was looking forward to the global picture she would provide on those topics. I particularly enjoyed the way she linked mixtures with more general modelling structures, through extensions in the distribution of the latent variables. (This is also why I am attending Chris Holmes' Memorial Lecture tomorrow, with the exciting title of Loss, Actions, Decisions: Bayesian Analysis in High-Throughput Genomics.)

In the afternoon, I only attended one talk, David Nott's Efficient MCMC Schemes for Computationally Expensive Posterior Distribution, which involved hybrid Monte Carlo on complex likelihoods. This was quite interesting, as hybrid Monte Carlo is indeed a natural way to reduce the number of likelihood evaluations per effective sample, since it moves (approximately) along iso-energy contours of the augmented position-momentum space… After this, we went to work on ABC model choice with Jean-Michel Marin and Natesh Pillai, before joining the fun at the Bayesian statistics section mixer, where the Savage, Mitchell, and student awards were presented. This was the opportunity to see friends, meet new Bayesians, and congratulate the winners, including Julien Cornebise and Robin Ryder of course.
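Since the talk's actual models are not described here, a minimal sketch of hybrid (Hamiltonian) Monte Carlo on a toy Gaussian target may help fix ideas; all choices (target, step size, path length) are illustrative and not Nott's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: bivariate standard normal, so the gradient of the log-density
# is available in closed form.
def log_post(theta):
    return -0.5 * theta @ theta

def grad_log_post(theta):
    return -theta

def leapfrog(theta, p, eps, L):
    """Leapfrog integration of the Hamiltonian dynamics."""
    p = p + 0.5 * eps * grad_log_post(theta)   # initial half-step (momentum)
    for i in range(L):
        theta = theta + eps * p                # full step (position)
        if i < L - 1:
            p = p + eps * grad_log_post(theta) # full step (momentum)
    p = p + 0.5 * eps * grad_log_post(theta)   # final half-step (momentum)
    return theta, p

def hmc(n_iter=2000, eps=0.3, L=10, dim=2):
    theta = np.zeros(dim)
    out = np.empty((n_iter, dim))
    for t in range(n_iter):
        p = rng.standard_normal(dim)           # refresh auxiliary momentum
        theta_new, p_new = leapfrog(theta, p, eps, L)
        # Metropolis correction on the Hamiltonian (potential + kinetic):
        dH = (log_post(theta_new) - 0.5 * p_new @ p_new) \
           - (log_post(theta) - 0.5 * p @ p)
        if np.log(rng.uniform()) < dH:
            theta = theta_new
        out[t] = theta
    return out

samples = hmc()
```

Each iteration costs L gradient evaluations but produces a distant, weakly correlated proposal, which is the trade-off that pays off when the likelihood is expensive.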

## About Fig. 4 of Fagundes et al. (2007)

Posted in R, Statistics, University life on July 13, 2011 by xi'an

Yesterday, we had a meeting of our EMILE network on statistics for population genetics (in Montpellier) and we were discussing our respective recent advances in ABC model choice. One of our colleagues mentioned the constant request (from referees) to include the post-ABC processing devised by Fagundes et al. in their 2007 ABC paper. (This paper contains a wealth of statistical innovations, but I only focus here on this post-checking device.)

The method centres on the figure above (Fig. 4 of the paper), with the attached caption

Fig. 4. Empirical distributions of the estimated relative probabilities of the AFREG model when the AFREG (solid line), MREBIG (dashed line), and ASEG (dotted line) models are the true models. Here, we simulated 1,000 data sets under the AFREG, MREBIG, and ASEG models by drawing random parameter values from the priors. The density estimates of the three models at the AFREG posterior probability = 0.781 (vertical line) were used to compute the probability that AFREG is the correct model given our observation that PAFREG = 0.781. This probability is equal to 0.817.

which aims at computing a p-value based on the ABC estimate of the posterior probability of a model.

I am somewhat uncertain about the added value of this computation and about the paradox of the sentence "the probability that AFREG is the correct model [given] the AFREG posterior probability (…) is equal to 0.817"… If I understand correctly the approach followed by Fagundes et al., they simulate samples from the joint distribution over parameter and (pseudo-)data conditional on each model, then approximate the density of the [ABC estimated] posterior probabilities of the AFREG model by a non-parametric density estimate, presumably R's density(), which means in Bayesian terms approximating the marginal likelihoods (or evidences) of the posterior probability of the AFREG model under each of the models under comparison. The "probability that AFREG is the correct model given our observation that PAFREG = 0.781" is then completely correct, in the sense that it is truly a posterior probability for this model based on the sole observation of the transform (or statistic) of the data x equal to PAFREG(x).

However, if we only look at the Bayesian perspective and do not consider the computational aspects, there is no rationale in moving from the data (or from the summary statistics) to a single statistic equal to PAFREG(x), as this induces a loss of information. (Furthermore, it seems to me that the answer is not invariant against the choice of the model whose posterior probability is computed, if more than two models are compared. In other words, the posterior probability of the AFREG model given the sole observation of PAFREG(x) is not necessarily the same as the posterior probability of the AFREG model given the sole observation of PASEG(x)…) Although this is not at all advised by the paper, it seems to me that some users of this processing opt instead for simulations of the parameter taken from the ABC posterior, which amounts to using the "data twice", i.e. the squared likelihood instead of the likelihood… So, while the procedure is formally correct (despite Templeton's arguments against it), it has no added value. Obviously, one could alternatively argue that the computational precision in approximating the marginal likelihoods is higher with the (non-parametric) solution based on PAFREG(x) than with the (ABC) solution based on x, but this is yet to be demonstrated (and weighted against the information loss).
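As I read it, the post-check can be sketched as follows on a toy two-model problem; the Gaussian models, the observed value, and all names are hypothetical stand-ins, the posterior probability is computed exactly where an ABC estimate would be used in practice, and a kernel density estimate plays the role of R's density():

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(1)

# Hypothetical competing models (not Fagundes et al.'s actual ones):
# M1: theta ~ N(0,1), x | theta ~ N(theta,1)  =>  x ~ N(0, 2) marginally
# M2: theta ~ N(2,1), x | theta ~ N(theta,1)  =>  x ~ N(2, 2) marginally
def post_prob_m1(x):
    """Posterior probability of M1 given x (exact here; ABC estimate in practice)."""
    m1 = norm.pdf(x, 0, np.sqrt(2))
    m2 = norm.pdf(x, 2, np.sqrt(2))
    return m1 / (m1 + m2)

def simulate_p1(model, n=1000):
    """Draw n pseudo-datasets from a model's prior predictive, return P(M1|x)."""
    mu = 0.0 if model == 1 else 2.0
    theta = rng.normal(mu, 1, n)
    x = rng.normal(theta, 1)
    return post_prob_m1(x)

p1_under_m1 = simulate_p1(1)
p1_under_m2 = simulate_p1(2)

x_obs = 0.4                      # hypothetical observed summary
p_obs = post_prob_m1(x_obs)      # the statistic the check conditions on

# Density of the statistic under each model, evaluated at the observed value,
# combined (with equal prior model weights) into the probability that M1 is
# the correct model given the sole observation of P(M1|x):
d1 = gaussian_kde(p1_under_m1)(p_obs)[0]
d2 = gaussian_kde(p1_under_m2)(p_obs)[0]
prob_m1_correct = d1 / (d1 + d2)
```

The final line makes the information loss concrete: the whole dataset has been reduced to the single number p_obs before the model comparison takes place.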

Just as a side remark on the polychotomous logistic regression approximation to the posterior probabilities introduced in Fagundes et al.: the idea is quite enticing, as a statistical regularisation of ABC simulations. It could be exploited further by using a standard model selection strategy to pick the summary statistics that truly contribute to explaining the model index.
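The regression-plus-selection idea might be sketched as below on a simulated reference table; the table, the statistics, and the use of an L1 penalty as the selection device are all my own illustrative choices, not the paper's:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Hypothetical ABC reference table: simulated model index plus five summary
# statistics, of which only the first two actually depend on the model.
n, k = 3000, 5
model = rng.integers(0, 3, n)          # three competing models
stats = rng.standard_normal((n, k))
stats[:, 0] += model                   # informative statistic
stats[:, 1] -= 0.5 * model             # informative statistic

# Polychotomous (multinomial) logistic regression of the model index on the
# summaries; the L1 penalty acts as a crude selection of the statistics
# that matter for discriminating between models.
fit = LogisticRegression(penalty="l1", solver="saga", C=0.5, max_iter=5000)
fit.fit(stats, model)

# Statistics whose coefficients survive the penalty in at least one class:
kept = np.where(np.abs(fit.coef_).max(axis=0) > 1e-6)[0]

# Regularised estimate of the posterior model probabilities at a new point:
x_obs = np.array([[1.0, -0.5, 0.0, 0.0, 0.0]])
probs = fit.predict_proba(x_obs)
```

The fitted probabilities smooth over the raw ABC acceptance frequencies, and the surviving coefficients point at the statistics worth retaining.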

## Bayesian model selection

Posted in Books, R, Statistics on December 8, 2010 by xi'an

Last week, I received a box of books from the International Statistical Review for reviewing. I thus grabbed the one whose title was most appealing to me, namely Bayesian Model Selection and Statistical Modeling by Tomohiro Ando. I am indeed interested both in the nature of testing hypotheses, or more accurately of assessing models, as discussed both in my talk at the seminar of philosophy of mathematics at Université Paris Diderot a few days ago and in the post on Murray Aitkin's alternative, and in the computational aspects of the resulting Bayesian procedures, including evidence, the Savage-Dickey paradox, nested sampling, harmonic mean estimators, and more…

After reading through the book, I am alas rather disappointed. What I consider to be the innovative or at least "novel" parts in comparison with existing books (like Chen, Shao and Ibrahim, 2000, which remains a reference on this topic) is based on papers written by the author over the past five years, and it mostly amounts to a sort of asymptotic Bayes analysis that I do not see as particularly Bayesian, since it involves the "true" distribution of the data. The coverage of the existing literature on Bayesian model choice is often incomplete and sometimes misses the point, as discussed below. This is especially true for the computational aspects, which are generally mistreated, or at least not treated in a way from which a newcomer to the field would benefit.

The author often takes complex econometric examples for illustration, which is nice; however, he does not pursue the details far enough for the reader to be able to replicate the study without further reading. (An example is given by the coverage of stochastic volatility in Section 4.5.1, pages 83-84.) The few exercises at the end of each chapter are rather unhelpful, often sounding more like notes than true problems. An extreme case is Exercise 6, pages 196-197, which introduces the Metropolis-Hastings algorithm within the exercise (although it has already been defined on pages 66-67) and then asks the reader to derive the marginal likelihood estimator. Another such exercise, on pages 164-165, introduces the theory of DNA microarrays and gene expression in ten lines (later repeated verbatim on page 227), then asks the reader to identify marker genes responsible for a certain trait.

The overall feeling after reading this book is thus that its contribution to the field of Bayesian model selection and statistical modeling is too limited and disorganised for the book to be recommended as "helping you choose the right Bayesian model" (backcover).