Incoherent phylogeographic inference

“In statistics, coherent measures of fit of nested and overlapping composite hypotheses are technically those measures that are consistent with the constraints of formal logic. For example, the probability of the nested special case must be less than or equal to the probability of the general model within which the special case is nested. Any statistic that assigns greater probability to the special case is said to be incoherent. An example of incoherence is shown in human evolution, for which the approximate Bayesian computation (ABC) method assigned a probability to a model of human evolution that was a thousand-fold larger than a more general model within which the first model was fully nested. Possible causes of this incoherence are identified, and corrections and restrictions are suggested to make ABC and similar methods coherent.” Alan R. Templeton, PNAS, doi:10.1073/pnas.0910647107

Following the astounding publication of Templeton’s pamphlet against Bayesian inference in PNAS last March, Jim Berger, Steve Fienberg, Adrian Raftery and I met in Benidorm to polish a reply focussing on the foundations of statistical testing, and we submitted it as a letter to the journal. Here are the (500-word) contents.

Templeton (2010, PNAS) makes a broad attack on the foundations of Bayesian statistical methods, rather than on the purely numerical technique called Approximate Bayesian Computation (ABC), using incorrect arguments and selective references taken out of context. The most significant example is the argument “The probability of the nested special case must be less than or equal to the probability of the general model within which the special case is nested. Any statistic that assigns greater probability to the special case is incoherent. An example of incoherence is shown for the ABC (sic!) method.” This opposes both the basis and the practice of Bayesian testing.

The confusion seems to arise from misunderstanding the difference between scientific hypotheses and their mathematical representation. Consider vaccine testing, where in what follows we use VE to represent the vaccine efficacy measured on a scale from $-\infty$ to 100. Exploratory vaccines may be efficacious or not. Thus a real biological model corresponds to the hypothesis “VE=0”, that the vaccine is not efficacious. The alternative biological possibility, that the vaccine has an effect, is often stated mathematically as the alternative model “any allowed value of VE is possible,” making it appear that it contains “VE=0.” But Bayesian analysis assigns each model prior distributions arising from the background science; a point mass (e.g. probability 1/2) is assigned to “VE=0” and the remaining probability mass (e.g. 1/2) is distributed continuously over values of VE in the alternative model. Elementary use of Bayes’ theorem (see, e.g., Berger, 1985, Statistical Decision Theory and Bayesian Analysis) then shows that the simpler model can indeed have a much higher posterior probability. Mathematically, this is explained by the probability distributions residing in different dimensional spaces, and is elementary probability theory for which use of Templeton’s “Venn diagram argument” is simply incorrect.

Templeton also argues that Bayes factors are mathematically incorrect, and he backs his claims with Lavine and Schervish’s (1999, American Statistician) notion of coherence. These authors do indeed criticize the use of Bayes factors as stand-alone criteria but point out that, when combined with prior probabilities of models (as illustrated in the vaccine example above), the result is fully coherent posterior probabilities. Further, Templeton directly attacks the ABC algorithm.  ABC is simply a numerical computational technique; attacking it as incoherent is similar to calling calculus incoherent if it is used to compute the wrong thing.

Finally, we note that Templeton has already published essentially identical if more guarded arguments in the ecology literature; we refer readers to a related rebuttal to Templeton’s (2008, Molecular Ecology) critique of the Bayesian approach by Beaumont et al. (2010, Molecular Ecology) that is broader in scope, since it also covers the phylogenetic aspects of nested clade versus a model-based approach.
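
As an aside to the letter, the point-mass argument of the vaccine example can be checked numerically. The sketch below is not taken from the letter: it assumes, purely for illustration, a normal model for an observed mean effect `xbar` with known sampling variance `se2`, a point mass at zero under the simpler model, and a N(0, `tau2`) prior on the effect under the alternative (ignoring the upper bound of 100 on VE). The function names and all numerical values are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a N(mean, var) distribution evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_prob_null(xbar, se2, tau2, prior_null=0.5):
    """Posterior probability of the point-null model (effect = 0).

    Assumed setting (illustrative only):
      M0: xbar ~ N(0, se2)                     effect fixed at 0
      M1: xbar | theta ~ N(theta, se2), theta ~ N(0, tau2)
    Under M1 the marginal of xbar is N(0, tau2 + se2).
    """
    m0 = normal_pdf(xbar, 0.0, se2)          # marginal likelihood under M0
    m1 = normal_pdf(xbar, 0.0, tau2 + se2)   # marginal likelihood under M1
    return prior_null * m0 / (prior_null * m0 + (1.0 - prior_null) * m1)


# Hypothetical data one standard error away from zero, diffuse prior under M1
p0_close = posterior_prob_null(xbar=0.1, se2=0.01, tau2=100.0)

# Hypothetical data ten standard errors away from zero, same prior
p0_far = posterior_prob_null(xbar=1.0, se2=0.01, tau2=100.0)

print(f"P(M0 | data close to 0)   = {p0_close:.4f}")
print(f"P(M0 | data far from 0)   = {p0_far:.3e}")
```

With these made-up numbers the simpler model receives a posterior probability of roughly 0.98 when the data sit close to zero, and a negligible one when they do not. Nothing is incoherent here: the prior under the alternative is continuous and puts zero mass on the single point “VE=0”, so the two models are disjoint events whose posterior probabilities sum to one.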

The very first draft I had written on this paper, in conjunction with my post, has been posted on arXiv this morning.

8 Responses to “Incoherent phylogeographic inference”

  1. […] ABC in dozens of genetic papers. Further arguments are provided in the various replies to both of Templeton’s radical criticisms. That more empirical and model-based assessments also are available is quite correct, as […]

  2. […] not even though of”  (p.37). This sounds like a weak argument, although it was also used by Alan Templeton in his rebuttal of ABC, given that (a) it should also apply in the frequentist sense, in order to […]

  3. […] be. However, I removed completely the computational side and included instead some comments on the ABC controversy and from my recent review of Murray Aitkin’s Statistical […]

  4. […] that I will draw from my recent reading of Aitkin’s book, as well as from the controversy with […]

  5. […] had not noticed another reply to Templeton’s PNAS diatribe against ABC published by Csilléry, Blum, Gaggiotti and […]

  6. […] letter in PNAS about Templeton’s surprising diatribe on Bayesian inference has now appeared in the early […]

  7. […] phylogeographic inference [accepted] The letter we submitted to PNAS about Templeton’s surprising diatribe on Bayesian inference has now been accepted: […]

  8. FOULLEY Jean-Louis Says:

    Templeton’s attack, made with a complete misunderstanding of the principles of Bayesian statistics as you clearly explained, highlights how confused many minds are about the procedures and criteria employed for comparing competing models.

    What is probably difficult for non-professional statisticians to understand is that the probability of theta1=0 (i.e., the condition defining the reduced model) is zero under the larger model (continuous theta1) embedding the reduced one.

    Another reason why some people might believe Templeton’s incoherence argument that larger models are necessarily “better” is that the maximum value of the (classical) likelihood cannot decrease when moving from a reduced model to a more complete one in which it is nested.

    This, and the confusion between standard Bayes and ABC, also reminds me of a sentence by Bradley Efron, I think: “Applying Bayesian statistics does not make you a Bayesian”, indicating that there is a big gap between using numerical techniques and understanding their foundations.

    In addition, the domain of application tackled by Templeton in his PNAS article, i.e., the origin of man, is not at all neutral, either scientifically or philosophically, and this might be another reason why this field is especially sensitive to disputes over competing theories. I am not sure that a tough debate around the relevance of Bayesian statistics would have arisen from a paper on the origins of cows.
