My description is clumsy because of my lack of familiarity with terms, but I do understand the algorithm. Your answer that this is not an apples-oranges comparison at the level of different (“large” and “small”) models is the key point for me. I need to do some more reading to understand mathematically why you are correct — the Molecular Ecology paper is unequivocal in this claim, but unfortunately doesn’t make it clear which references would help to explain why it is so. The references you cited in your review of Sober seem promising and I will consult those.

I have my own hypothesis on the last point — the “special” cases fail to “win” in the algorithmic treatment of the model that seems to contain them, because the authors assumed boundaries to the parameters that excluded the special cases. This is a problem with assumptions, not methods, and fortunately it’s testable.


The posterior probabilities of the models, or equivalently the marginal likelihoods of the summary statistic, integrate the whole Bayesian modelling, including the priors. As explained in our Molecular Ecology paper, this means they are (a) statistically comparable, not apples and oranges.
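A toy numerical sketch may make the "integrates the whole Bayesian modelling" point concrete. This is my own illustration, not a calculation from either paper: for a single observation x ~ N(θ, 1), the marginal likelihood m(x) = ∫ f(x|θ) π(θ) dθ depends on the prior π as much as on the likelihood, so two models with the same likelihood but different priors score differently.

```python
import math

def normal_pdf(x, mu, sigma=1.0):
    # Density of N(mu, sigma^2) at x.
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def marginal_likelihood(x, prior_lo, prior_hi, n=10_000):
    """m(x) = integral of f(x|theta) * pi(theta) d theta, with a uniform
    prior on [prior_lo, prior_hi], via simple midpoint quadrature."""
    width = (prior_hi - prior_lo) / n
    prior_density = 1.0 / (prior_hi - prior_lo)
    total = 0.0
    for i in range(n):
        theta = prior_lo + (i + 0.5) * width
        total += normal_pdf(x, theta) * prior_density * width
    return total

x_obs = 0.3
print(marginal_likelihood(x_obs, -1, 1))    # narrow prior
print(marginal_likelihood(x_obs, -10, 10))  # diffuse prior: smaller m(x)
```

Same data, same likelihood; only the prior range changed, and the marginal likelihood changed with it. That is why the model-level comparison already reflects the priors.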

As to the latest of your points, about

Thanks for taking the time to explain. I think this is the point that confuses me.

Each of their differently-parameterized models is run through simulations to find the parameter values that maximize the posterior probability, assuming uniform (or log-uniform) priors *within that parameterization*. That seems properly Bayesian to me, although I believe they constrained their parameter ranges in ways that systematically excluded relevant regions of the parameter space.

Now they have eight models, each representing the maximum posterior, given the data, from a particular parameterization.

So far, so good. But how to choose which of the eight parameterizations we should accept? The authors chose the one with the highest posterior probability. These are apples-oranges comparisons, to which the authors apply no priors at all; that is to say, identical priors.

I could imagine that the calculation of the posteriors had already penalized the models with more parameters. But this can’t be true in general, because *some* of the fewer-parameter models are logically special cases of the more-parameter models — so if a fewer-parameter model really had a higher posterior under the same priors, it should *already* have been chosen in the first round of the analysis!

I would hope that this submission by Templeton will finally convince PNAS or its contributors/readers that the journal cannot be held in high regard while great peer-reviewed work is forced to appear side by side with misleading/misled nonscience.

(b) In Fagundes et al. (2007) the number of parameters varies among the eight models, so the prior distributions on those parameters cannot be the same. (A uniform prior over one parameter space is not the same as a uniform prior over another space of a different dimension.) Putting the same prior weights on all eight models is another thing (which amounts to using the Bayes factor).
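A minimal numeric sketch of the parenthetical point, with made-up marginal-likelihood values and only three models shown for brevity (the paper has eight): when every model gets the same prior weight, the weights cancel, and ranking models by posterior probability is exactly ranking them by marginal likelihood, i.e. by Bayes factors.

```python
# Illustrative m_i(x) values for three hypothetical models; not from the paper.
marginals = {"M1": 0.38, "M2": 0.10, "M3": 0.22}
k = len(marginals)
prior_weight = 1.0 / k  # same prior weight on every model

# Posterior model probability: P(M_i | x) = m_i(x) * p(M_i) / sum_j m_j(x) * p(M_j)
total = sum(m * prior_weight for m in marginals.values())
posterior = {name: m * prior_weight / total for name, m in marginals.items()}
print(posterior)  # the ranking is identical to ranking by m_i(x) alone
```

The equal weights divide out of every term, so "applying no model priors at all" and "applying identical model priors" really are the same procedure, which is the Bayes-factor comparison.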

I am relatively unschooled in Bayesian methods and therefore may be missing something obvious. But Fagundes et al. (2007), which Templeton uses as his main example, did not claim to use a BIC criterion.

Nor do I see any sense in which the priors for the “large” and “small” models are anything but equal. The paper took the “large” model with the highest posterior among “larges”, and the “small” model with the highest posterior among “smalls”, and decided to accept the “small” model without any discussion of priors at all — which amounts to the assumption that their prior probability is identical. *Mathematically impossible* it may be, but that’s what the paper assumed!