## dimension reduction in ABC [a review's review]

Posted in Statistics, University life on February 27, 2012 by xi'an

What is very apparent from this study is that there is no single ‘best’ method of dimension reduction for ABC.

Michael Blum, Matt Nunes, Dennis Prangle and Scott Sisson just posted on arXiv a rather long review of dimension reduction methods in ABC, along with a comparison on three specific models. Given that the choice of the vector of summary statistics is presumably the single most important step in an ABC algorithm, and as selecting too large a vector is bound to fall victim to the curse of dimensionality, this is a fairly relevant review! Therein, the authors compare regression adjustments à la Beaumont et al. (2002), subset selection methods, as in Joyce and Marjoram (2008), and projection techniques, as in Fearnhead and Prangle (2012). They add to this impressive battery of methods the potential use of AIC and BIC. (Last year, after ABC in London, I reported here on the use of the alternative DIC by François and Laval, but the paper is not in the bibliography, I wonder why.) An argument (page 22) for using AIC/BIC is that either provides indirect information about the approximation of p(θ|y) by p(θ|s); this does not seem obvious to me.
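To make the regression adjustment concrete, here is a minimal sketch in the spirit of Beaumont et al. (2002): keep the simulations whose summaries fall closest to the observed one, weight them with an Epanechnikov kernel, fit a weighted linear regression of θ on s, and shift each retained θ to what it would have been at the observed summary. The function names, the toy normal-mean model, the crude rescaling of the summaries and the 10% acceptance quantile are all assumptions made for illustration, not choices taken from the review.

```python
import numpy as np


def abc_regression_adjust(theta, s, s_obs, quantile=0.1):
    """Rejection ABC followed by a local-linear regression adjustment.

    theta : (N,) parameter draws from the prior
    s     : (N,) or (N, d) simulated summary statistics
    s_obs : observed summary statistic(s)
    Returns the accepted draws, the adjusted draws and the kernel weights.
    """
    theta = np.asarray(theta, dtype=float)
    s = np.asarray(s, dtype=float).reshape(len(theta), -1)
    diff = s - np.atleast_1d(np.asarray(s_obs, dtype=float))
    # Scaled Euclidean distance (the scaling is an arbitrary, hypothetical choice).
    dist = np.sqrt(((diff / s.std(axis=0)) ** 2).sum(axis=1))
    delta = np.quantile(dist, quantile)      # tolerance set as a distance quantile
    keep = dist <= delta
    w = 1.0 - (dist[keep] / delta) ** 2      # Epanechnikov kernel weights
    # Weighted least squares of theta on (1, s - s_obs).
    X = np.column_stack([np.ones(keep.sum()), diff[keep]])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], theta[keep] * sw, rcond=None)
    # Move each accepted theta along the fitted slope to s = s_obs.
    theta_adj = theta[keep] - diff[keep] @ coef[1:]
    return theta[keep], theta_adj, w


def toy_normal_mean(seed=0, n_sim=20000, n_obs=20):
    """Toy model: theta ~ N(0, 3^2), summary = mean of n_obs N(theta, 1) draws."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, 3.0, size=n_sim)
    s = rng.normal(theta, 1.0 / np.sqrt(n_obs))  # sampling law of the sample mean
    return theta, s


if __name__ == "__main__":
    theta, s = toy_normal_mean()
    raw, adj, w = abc_regression_adjust(theta, s, s_obs=1.0)
    print("rejection only :", raw.mean(), raw.std())
    print("with adjustment:", np.average(adj, weights=w), adj.std())
```

In this conjugate toy case the exact posterior is available, which makes it easy to see how the adjustment tightens the plain rejection sample. With several summaries, the same code runs on an (N, d) array, and that is exactly where the dimension issue discussed above starts to bite.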

The paper also suggests a further regularisation of Beaumont et al. (2002) by ridge regression, although an L1 penalty à la Lasso would be more appropriate in my opinion for removing extraneous summary statistics. (I must acknowledge never being a big fan of ridge regression, esp. in the ad hoc version à la Hoerl and Kennard, i.e. in a non-decision theoretic approach where the hyperparameter λ is derived from the data by cross-validation, since it then sounds like a poor man’s Bayes/Stein estimate, just like BIC is a first order approximation to regular Bayes factors… Why pay for the copy when you can afford the original?!) Unsurprisingly, ridge regression does better than plain regression in the comparison experiment when there are many almost collinear summary statistics, but an alternative conclusion could be that regression analysis is not that appropriate with many summary statistics. Indeed, summary statistics are not quantities of interest but data summarising tools towards a better approximation of the posterior at a given computational cost… (I do not get the final comment, page 36, about the relevance of summary statistics for MCMC or SMC algorithms: the criterion should be the best approximation of p(θ|y), which does not depend on the type of algorithm.)
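The contrast between the two penalties can be illustrated on a toy regression of θ on four summary statistics, two of them almost collinear and two of them pure noise. This is only a sketch under stated assumptions: the simulated data, the penalty values and the helper names (`toy_summaries`, `ridge`, `lasso_cd`) are hypothetical, and the Lasso is solved by a plain cyclic coordinate descent rather than by any particular library routine.

```python
import numpy as np


def toy_summaries(seed=0, n=500):
    """theta depends on s1 only; s2 nearly duplicates s1; s3 and s4 are noise."""
    rng = np.random.default_rng(seed)
    s1 = rng.normal(size=n)
    s2 = s1 + 0.01 * rng.normal(size=n)          # almost collinear with s1
    s3, s4 = rng.normal(size=n), rng.normal(size=n)
    theta = s1 + 0.5 * rng.normal(size=n)
    X = np.column_stack([s1, s2, s3, s4])
    X = (X - X.mean(axis=0)) / X.std(axis=0)     # standardised summaries
    return X, theta - theta.mean()


def ridge(X, y, lam):
    """Closed-form ridge estimate (X'X + lam I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b                                # current residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                  # take coordinate j out of the fit
            rho = X[:, j] @ r / n
            # Soft-thresholding: small correlations are set exactly to zero.
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]                  # put the updated coordinate back
    return b


if __name__ == "__main__":
    X, theta = toy_summaries()
    print("OLS  :", np.linalg.lstsq(X, theta, rcond=None)[0])
    print("ridge:", ridge(X, theta, 10.0))
    print("lasso:", lasso_cd(X, theta, 0.1))
```

Ridge shrinks all coefficients and stabilises the collinear pair but keeps every statistic in the regression, whereas the soft-thresholding step of the Lasso can set the coefficients of uninformative statistics exactly to zero, which is the selection behaviour argued for above.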

I find it quite exciting to see the development of a new range of ABC papers like this review dedicated to a better derivation of summary statistics in ABC, each with different perspectives and desiderata, as it will help us to understand where ABC works and where it fails, and how we could get beyond ABC…

## Bayesian variable selection [off again]

Posted in Statistics, University life on November 16, 2011 by xi'an

As indicated a few weeks ago, we [Gilles Celeux, Mohammed El Anbari, Jean-Michel Marin and myself] have received very encouraging reviews from Bayesian Analysis about our comparative study of Bayesian and non-Bayesian variable selection procedures (“Regularization in regression: comparing Bayesian and frequentist methods in a poorly informative situation”). We have just rearXived and resubmitted it with additional material and hope this is the last round. (I must acknowledge a limited involvement at this final stage of the paper. Had I had more time available, I would have liked to remove the numerous tables and turn them into graphs…)

## Bayesian modeling using WinBUGS

Posted in Books, R, Statistics, University life on November 7, 2011 by xi'an

Yes, yet another Bayesian textbook: Ioannis Ntzoufras’ Bayesian modeling using WinBUGS was published in 2009 and it got an honourable mention at the 2009 PROSE Award. (Nice acronym for a book award! All the mathematics books awarded that year were actually statistics books.) Bayesian modeling using WinBUGS is rather similar to the more recent Bayesian ideas and data analysis that I reviewed last week and hence I am afraid the review will draw a comparison between the two books. (Which is a bit unfair to Bayesian modeling using WinBUGS since I reviewed Bayesian ideas and data analysis on its own! However, I will presumably write my CHANCE column as a joint review.)

“As history has proved, the main reason why Bayesian theory was unable to establish a foothold as a well accepted quantitative approach for data analysis was the intractability involved in the calculation of the posterior distribution.” Chap. 1, p.1

The book launches into a very quick introduction to Bayesian analysis, since, by page 15, we are “done” with linear regression and conjugate priors. This is somewhat softened by the inclusion at the end of the chapter of a few examples, including one on the Greek football team in Euro 2004, but nothing comparable with Christensen et al.’s initial chapter of motivating examples. Chapter 2 on MCMC methods follows the same pattern: a quick and dense introduction in about ten pages, followed by 40 pages of illuminating examples, worked out in full detail. CODA is described in an Appendix. Compared with Bayesian ideas and data analysis, Bayesian modeling using WinBUGS spends time introducing WinBUGS and Chapter 3 acts like a 20 page user manual, while Chapter 4 corresponds to the WinBUGS example manual. Chapter 5 gets back to a more statistical aspect, the processing of regression models (including Zellner’s g-prior), up to ANOVA. Chapter 6 extends the previous chapter to categorical variables and the ANCOVA model, as well as the 2006-2007 English premier league. Chapter 7 moves to the standard generalised linear models, with an extension in Chapter 8 to count data, zero inflated models, and survival data. Chapter 9 covers hierarchical models, with mixed models, longitudinal data, and the water polo World Cup 2000.

## Another ABC rebuttal

Posted in Statistics, University life on October 31, 2010 by xi'an

“Given that some logical overlap is common when dealing with complex models, this means that much of the literature using ABC is invalid.” Alan Templeton, July 2010.

I had not noticed another reply to Templeton’s PNAS diatribe against ABC that was published by Csilléry, Blum, Gaggiotti and François in Trends in Ecology and Evolution. This reply follows a letter written by Templeton to this journal and published last July. Alan Templeton takes issue with the inclusion of a box in the nice survey of Csilléry et al. entitled Controversy surrounding ABC. The letter reproduces earlier arguments I already discussed, in particular the “logical impossibility” to have larger models enjoying smaller posterior probabilities than smaller models [that are special cases]. The conclusion that

“1) ABC can and does produce results that are mathematically impossible; 2) the ‘posterior probabilities’ of ABC cannot possibly be true probability measures; and 3) ABC is statistically incoherent (incoherent methods can violate the constraints of formal logic)” Alan Templeton, July 2010.

is thus bringing no novelty to the debate. It is nonetheless mildly irritating to see that Alan Templeton is still advancing “mathematical errors” as his main argument, despite detailed rebuttals published by mathematicians and mathematical statisticians. As demonstrated by the repeated argument that BIC should replace ABC (!), or the decomposition of $P(A\cup B\cup C)$ in the PNAS reply, he is out of his depth on mathematical grounds. However, that he manages to publish a paper like the PNAS diatribe without the journal having a mathematician check the “mathematical flaws” is more of an issue.

## Regularisation

Posted in Statistics, University life on October 5, 2010 by xi'an

After a huge delay, since the project started in 2006 and was first presented in Banff in 2007 (as well as included in the Bayesian Core), Gilles Celeux, Mohammed El Anbari, Jean-Michel Marin, and myself have eventually completed our paper on using hyper-g priors for variable selection and regularisation in linear models. The writing of this paper was mostly delayed due to the publication of the 2007 JASA paper by Feng Liang, Rui Paulo, German Molina, Jim Berger, and Merlise Clyde, Mixtures of g-priors for Bayesian variable selection. We had indeed (independently) obtained very similar derivations based on hypergeometric function representations but, once the above paper was published, we needed to add material to our derivation and chose to run a comparison study between Bayesian and non-Bayesian methods for a series of simulated and real examples. It took a while for Mohammed El Anbari to complete this simulation study and even longer for the four of us to convene and agree on the presentation of the paper. The only difference between Liang et al.’s (2007) modelling and ours is that we do not distinguish between the intercept and the other regression coefficients in the linear model. On the one hand, this gives us one degree of freedom that allows us to pick an improper prior on the variance parameter. On the other hand, our posterior distribution is not invariant under location transforms, which was a point we heavily debated in Banff… The simulation part shows that all “standard” Bayesian solutions lead to very similar decisions and that they are much more parsimonious than regularisation techniques.
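As a rough illustration of how such priors drive variable selection, here is a sketch with a fixed g (not the hyper-g prior of the paper, which further integrates g out). It assumes the intercept is treated like any other coefficient, as described above, with $\beta\mid\sigma^2\sim\mathcal{N}(0, g\sigma^2(X^\text{T}X)^{-1})$ and the improper prior $\pi(\sigma^2)\propto 1/\sigma^2$ shared by all models. Under these assumptions the marginal likelihood of a model with a design matrix $X$ of $p$ columns is $\pi^{-n/2}\,\Gamma(n/2)\,(1+g)^{-p/2}\,\{y^\text{T}y - \tfrac{g}{1+g}\,y^\text{T}X(X^\text{T}X)^{-1}X^\text{T}y\}^{-n/2}$. The default g = n, the uniform prior over models, the simulated data and the function names are hypothetical choices made for this example only.

```python
import itertools
import math

import numpy as np


def log_marginal(y, X, g):
    """Log marginal likelihood under a fixed-g Zellner prior (assumed setting):
    beta | sigma^2 ~ N(0, g sigma^2 (X'X)^{-1}), pi(sigma^2) = 1/sigma^2."""
    n, p = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fit_ss = float(y @ (X @ coef))               # y' X (X'X)^{-1} X' y
    q = float(y @ y) - g / (1.0 + g) * fit_ss
    return (math.lgamma(n / 2) - (n / 2) * math.log(math.pi)
            - (p / 2) * math.log1p(g) - (n / 2) * math.log(q))


def model_posterior(y, X, g=None):
    """Posterior probabilities of all subsets of columns of X, uniform model prior.

    The intercept column is always included and, as an assumption of this
    sketch, it shares the g-prior with the other coefficients.
    """
    n, k = X.shape
    g = float(n) if g is None else g             # g = n is a common default
    models = [m for r in range(k + 1)
              for m in itertools.combinations(range(k), r)]
    logm = []
    for m in models:
        cols = X[:, np.array(m, dtype=int)]      # (n, 0) array for the null model
        design = np.hstack([np.ones((n, 1)), cols])
        logm.append(log_marginal(y, design, g))
    logm = np.array(logm)
    w = np.exp(logm - logm.max())                # stabilised normalisation
    return dict(zip(models, w / w.sum()))


def toy_regression(seed=0, n=100):
    """Simulated data where only the first of three covariates matters."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)
    return X, y


if __name__ == "__main__":
    X, y = toy_regression()
    post = model_posterior(y, X)
    for model, prob in sorted(post.items(), key=lambda kv: -kv[1])[:3]:
        print(model, round(prob, 3))
```

The $(1+g)^{-p/2}$ factor is what penalises every additional covariate, which gives some intuition for why such Bayesian solutions tend to be more parsimonious than penalised regression.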

Two other papers posted on arXiv today address the model choice issue. The first one by Bruce Lindsay and Jiawei Liu introduces a credibility index, and the second one by Bazerque, Mateos, and Giannakis considers group-lasso on splines for spectrum cartography.