Incoherent phylogeographic inference
“In statistics, coherent measures of fit of nested and overlapping composite hypotheses are technically those measures that are consistent with the constraints of formal logic. For example, the probability of the nested special case must be less than or equal to the probability of the general model within which the special case is nested. Any statistic that assigns greater probability to the special case is said to be incoherent. An example of incoherence is shown in human evolution, for which the approximate Bayesian computation (ABC) method assigned a probability to a model of human evolution that was a thousand-fold larger than a more general model within which the first model was fully nested. Possible causes of this incoherence are identified, and corrections and restrictions are suggested to make ABC and similar methods coherent.” Alan R. Templeton, PNAS, doi:10.1073/pnas.0910647107
Following the astounding publication of Templeton’s pamphlet against Bayesian inference in PNAS last March, Jim Berger, Steve Fienberg, Adrian Raftery and I polished, while in Benidorm, a reply focussing on the foundations of statistical testing, and submitted it as a letter to the journal. Here are the (500-word) contents.
Templeton (2010, PNAS) makes a broad attack on the foundations of Bayesian statistical methods, rather than on the purely numerical technique called Approximate Bayesian Computation (ABC), using incorrect arguments and selective references taken out of context. The most significant example is the argument “The probability of the nested special case must be less than or equal to the probability of the general model within which the special case is nested. Any statistic that assigns greater probability to the special case is incoherent. An example of incoherence is shown for the ABC (sic!) method.” This opposes both the basis and the practice of Bayesian testing.
The confusion seems to arise from misunderstanding the difference between scientific hypotheses and their mathematical representation. Consider vaccine testing, where in what follows we use VE to represent the vaccine efficacy measured on a scale from to 100. Exploratory vaccines may be efficacious or not. Thus a real biological model corresponds to the hypothesis “VE=0”, that the vaccine is not efficacious. The alternative biological possibility, that the vaccine has an effect, is often stated mathematically as the alternative model “any allowed value of VE is possible,” making it appear that it contains “VE=0.” But Bayesian analysis assigns each model prior distributions arising from the background science; a point mass (e.g. probability 1/2) is assigned to “VE=0” and the remaining probability mass (e.g. 1/2) is distributed continuously over values of VE in the alternative model. Elementary use of Bayes’ theorem (see, e.g., Berger, 1985, Statistical Decision Theory and Bayesian Analysis) then shows that the simpler model can indeed have a much higher posterior probability. Mathematically, this is explained by the probability distributions residing in different dimensional spaces, and is elementary probability theory for which use of Templeton’s “Venn diagram argument” is simply incorrect.
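To make the Bayes' theorem step concrete, here is a minimal sketch of the point-mass calculation. It is not part of the letter: the normal sampling model, the normal prior on VE under the alternative, the function names and all numerical values are illustrative assumptions chosen only to show the mechanics.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_prob_null(ve_hat, n, sigma=1.0, tau=1.0, prior_null=0.5):
    """Posterior probability of the point null "VE = 0".

    Assumed (hypothetical) setup, not taken from the letter:
      - ve_hat is an estimate of VE with sampling distribution N(VE, sigma^2 / n);
      - under the alternative, VE has a continuous N(0, tau^2) prior;
      - the null receives prior point mass prior_null, the alternative the rest.
    """
    se2 = sigma ** 2 / n
    # Likelihood of the estimate under the point null VE = 0.
    like_null = normal_pdf(ve_hat, 0.0, se2)
    # Marginal likelihood under the alternative: VE integrated out against
    # its prior, which gives a N(0, tau^2 + sigma^2 / n) density.
    marg_alt = normal_pdf(ve_hat, 0.0, tau ** 2 + se2)
    numerator = prior_null * like_null
    return numerator / (numerator + (1.0 - prior_null) * marg_alt)


if __name__ == "__main__":
    # An estimate at zero favours the simpler model well beyond its prior 1/2.
    print(posterior_prob_null(ve_hat=0.0, n=100))
    # An estimate far from zero (ten standard errors) favours the alternative.
    print(posterior_prob_null(ve_hat=1.0, n=100))
```

Under these assumptions, an estimate sitting at zero pushes the posterior probability of “VE=0” above 0.9 even though the alternative formally allows the value 0: the alternative spreads its prior mass over a continuum, so its marginal likelihood is diluted, while the null concentrates all of its mass on a single point.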
Templeton also argues that Bayes factors are mathematically incorrect, and he backs his claims with Lavine and Schervish’s (1999, American Statistician) notion of coherence. These authors do indeed criticize the use of Bayes factors as stand-alone criteria but point out that, when Bayes factors are combined with prior probabilities of models (as illustrated in the vaccine example above), the resulting posterior probabilities are fully coherent. Further, Templeton directly attacks the ABC algorithm. ABC is simply a numerical computational technique; attacking it as incoherent is similar to calling calculus incoherent if it is used to compute the wrong thing.
Finally, we note that Templeton has already published essentially identical, if more guarded, arguments in the ecology literature; we refer readers to a related rebuttal to Templeton’s (2008, Molecular Ecology) critique of the Bayesian approach by Beaumont et al. (2010, Molecular Ecology) that is broader in scope, since it also covers the phylogenetic aspects of the nested clade approach versus a model-based approach.