Archive for JRSSB

I have just posted on arXiv the fourth (and hopefully final) version of our paper, Relevant statistics for Bayesian model choice, written jointly with Jean-Michel Marin, Natesh Pillai, and Judith Rousseau over the past two years. As we received very positive feedback from the editorial team at JRSS Series B, I flew to Montpellier today to write & resubmit a revised version of the paper. The changes are only stylistic, since we could not answer in depth a query about the apparently different speeds of convergence of the posterior probabilities under the Gaussian and Laplace distributions in Figures 3 & 4 (see paper). This was a most interesting question, in that the marginal likelihoods do indeed seem to converge at different speeds. However, the only precise information we can derive from our result (Theorem 1) is when the Bayes factor is not consistent. Otherwise, we only have a lower bound on its speed of convergence (under the correct model). Getting precise speeds in this case sounds beyond our reach… (Unless I am confused about time zones, this post should go live just after the fourth version is announced on arXiv…)
This week, freshly back from Roma, I got the reviews on our paper “Relevant statistics for Bayesian model choice” from Series B. The comments are detailed and mostly to the point, with the relevance of the paper for statistical methodology raised as the major concern. We are thus asked for a revision making a much better connection with ABC methodology.
This is not an unexpected outcome, from my point of view, because the paper is indeed quite theoretical and the mathematical assumptions required to obtain the convergence theorems are rather overwhelming, meaning that in practical cases they cannot truly be checked. However, I think we can eventually address those concerns, for two distinct reasons. First, the paper comes as the third step in a series of papers where we first identified a sufficiency property, then realised that this property was actually quite a rare occurrence, and finally made a theoretical advance as to when a summary statistic is enough (i.e. “sufficient” in the standard sense of the term!) to conduct model choice, with a clear answer that the mean ranges of the summary statistic under each model must not intersect. Second, my own personal view is that the assumptions needed for convergence are not of the highest importance for statistical practice (even though they are needed in the paper!) and thus that, from a methodological point of view, only the conclusion should be taken into account. It is then rather straightforward to come up with (quick-and-dirty) simulation devices to check whether a summary statistic behaves differently under both models, taking advantage of the reference table already available (instead of having to run full Monte Carlo experiments on an ABC basis)…
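To give an idea of what such a quick-and-dirty check could look like, here is a minimal sketch on a toy Gaussian-versus-Laplace pair (echoing the example in the paper), using the median absolute deviation as summary statistic; the sample sizes and quantile cutoffs are arbitrary choices of mine, not the paper's setting, and a real check would reuse the simulations already stored in the reference table:

```python
import numpy as np

rng = np.random.default_rng(1)
n_data, n_rep = 100, 1000

def mad(sample):
    # median absolute deviation: a statistic whose mean differs
    # under the two models (unlike, e.g., the empirical median)
    return np.median(np.abs(sample - np.median(sample)))

# simulate the summary statistic under each model (both with unit variance)
mad_gauss = np.array([mad(rng.normal(0.0, 1.0, n_data)) for _ in range(n_rep)])
mad_laplace = np.array(
    [mad(rng.laplace(0.0, 1.0 / np.sqrt(2.0), n_data)) for _ in range(n_rep)]
)

# the statistic is relevant for model choice when the bulk of its
# distribution under one model barely overlaps the other
lo_g, hi_g = np.quantile(mad_gauss, [0.01, 0.99])
lo_l, hi_l = np.quantile(mad_laplace, [0.01, 0.99])
print("Gaussian MAD range:", (lo_g, hi_g))
print("Laplace  MAD range:", (lo_l, hi_l))
```

If the two simulated ranges are well separated, the statistic can be trusted for discriminating between the models; heavy overlap signals the inconsistent regime of Theorem 1.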
One of the comments was that maybe Bayes factors were not appropriate for conducting model choice, thus making the whole derivation irrelevant. This is a possible perspective, but it can be objected that Bayes factors and posterior probabilities are used in conjunction with ABC in dozens of genetics papers. Further arguments are provided in the various replies to both of Templeton’s radical criticisms. It is quite correct that more empirical and model-based assessments are also available, as demonstrated by the multicriterion approach of Olli Ratmann and co-authors. This is simply another approach, not followed by most geneticists so far…
In the latest issue of JRSS Series B (74(1), Jan. 2012), I just noticed that no paper is “from my time” as co-editor, i.e. that all of them were submitted after I completed my term in Jan. 2010. Given the two-year delay, this is not that surprising, but it also means I can comment on some papers w/o reservation! A paper I had seen earlier (as a reader, not as an editor nor as a referee!) is Petros Dellaportas’ and Ioannis Kontoyiannis’ Control variates for estimation based on reversible Markov chain Monte Carlo samplers. The idea is one of post-processing MCMC output, stabilising the empirical average via control variates. There are two difficulties: one in finding control variates, i.e. functions $G(\cdot)$ with zero expectation under the target distribution, and another in estimating the optimal coefficient in a consistent way. The paper solves the first difficulty by using the Poisson equation, namely that $G(x)-KG(x)$ has zero expectation under the stationary distribution associated with the Markov kernel $K$. Therefore, if $KG$ can be computed in closed form, this provides a generic control variate taking advantage of the MCMC algorithm. Of course, the above if is a big if: it seems difficult to find closed-form solutions when using a Metropolis–Hastings algorithm, for instance, and the paper only contains illustrations within the conjugate prior/Gibbs sampling framework. The second difficulty is also addressed by Dellaportas and Kontoyiannis, who show that the asymptotic variance in the resulting central limit theorem can be equal to zero in some cases.
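As a minimal numerical sketch of the Poisson-equation trick (on a toy reversible chain of my own choosing where $KG$ is available in closed form, not an example from the paper), take a Gaussian AR(1) kernel with stationary mean $m$: with $G(x)=x$, $KG(x)=\rho x+(1-\rho)m$, so $U(x)=G(x)-KG(x)$ is a free control variate, and the optimal coefficient can be fitted by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, m, n = 0.8, 2.0, 20_000

# reversible toy chain: X' = rho*X + (1-rho)*m + eps, eps ~ N(0,1),
# with stationary distribution N(m, 1/(1-rho^2))
x = np.empty(n)
x[0] = m
for t in range(1, n):
    x[t] = rho * x[t - 1] + (1.0 - rho) * m + rng.standard_normal()

# Poisson-equation control variate with G(x) = x:
# KG(x) = E[X' | X = x] = rho*x + (1-rho)*m, so U = G - KG has
# zero expectation under the stationary distribution
u = x - (rho * x + (1.0 - rho) * m)

# estimate the optimal coefficient by least squares, then correct the average
theta = np.cov(x, u)[0, 1] / np.var(u, ddof=1)
plain_est = x.mean()
cv_est = x.mean() - theta * u.mean()

print("plain average:", plain_est, " control-variate corrected:", cv_est)
```

Since $U$ is here an exact linear function of $x$, the fitted coefficient reproduces $1/(1-\rho)$ and the corrected estimator hits the stationary mean $m$ essentially exactly: a (degenerate) toy instance of the zero-asymptotic-variance phenomenon mentioned above.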
Following my reading of the discussions of the Read Paper by Fearnhead and Prangle, I included some of their points in my course this morning, and ended up spending the whole two hours on this topic (and finally getting a grasp on calibration!). Here is [hopefully] the final version of the slides.
This afternoon, I will attend the Read Paper session in London, presented by Paul Fearnhead and Dennis Prangle on semi-automatic ABC. I have already commented on the paper (as a referee, external examiner, and blogger!) and provided my slides for our local pre-ordinary meeting at CREST, so here is my written discussion (maybe to be split into several discussions, given its length!). (I just hope my flight from the US won’t be cancelled or overly delayed…)