Evidence and evolution (4)
“Darwinians would not be satisfied if all life on Earth derived from the same large slab of rock.” (E&E, p.269)
Thanks to Eyjafjallajökull, I used the three and a half hours in the train back from Marseille to finish reading Sober’s Evidence and Evolution: The Logic Behind the Science. The final chapter (apart from the concluding summary) is about “Common ancestry” and may be the most statistically oriented of the three chapters about evolution. This is not to say the chapter is without flaws, including in particular a certain tendency to repeat the same arguments, but this is somehow the chapter I appreciated the most. The chapter starts with a detailed analysis of how the hypothesis of common ancestry should be set, the main distinction being between one organism and several, while pointing out the confusing effect of lateral gene transfer. Inference about phylogenetic trees and the use of genetic sequences rather than simplistic traits get us closer to the true issues at stake. Another interesting feature of this chapter is the relation to Darwin’s reflections on the common origin of life on Earth, drawn through many quotes.
“If those prior probabilities are obscure, the same will be true of the posterior probabilities.” (E&E, p.277)
The statistical issue is thus one of testing for a common ancestor versus separate ancestors for a set of organisms. The nature of the information contained in the data is never made precise enough to understand whether this fits the principle of total evidence stressed throughout the book. The chapter also shows a more lenient disposition towards Bayesian solutions, but Section 4.3 ends up with an impossibility statement: no objective prior can be defined, because Sober wants prior probabilities that have some authority. This is a self-defeating constraint leading to empirically well-grounded priors.
“Those propositions suffice for similarity to be evidence for common ancestry, and they have broad applicability.” (E&E, p.283)
The part about Reichenbach’s (1956) sufficient condition for a common trait to induce a likelihood ratio larger than one in favour of the common ancestor hypothesis needs to be discussed, as this is the point I find the most puzzling in the chapter. Indeed, most of the nine assumptions of Reichenbach (1956) relate both models under comparison, i.e. common ancestry versus separate ancestry. This seems to me to be a weird thing to do, as models under comparison should not share all of their parameters! For instance, if we build a Bayesian model to compare those models, we would use a prior distribution on each group of parameters. Having a common parameter does not make sense since we end up selecting one of the two models. I wonder if this is the result of a reluctance to have true parameters as in a regular statistical analysis. (See, e.g., the lament that “until values for adjustable parameters are specified, we cannot talk about the probability of the data under different hypotheses”, p.338.) What is striking is the reliance of the whole chapter on this unnatural set of hypotheses, which keeps resurfacing throughout. Sober writes that Propositions 1-9 are not consequences of the axioms of probability. Neither are they necessary conditions for common ancestry to have a higher likelihood than separate ancestry (p.283). Nonetheless, this creates an unnecessary bias in the perception of the problem, which may induce critics of evolution to reject the whole approach.
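To make the remark about separate priors concrete, here is how the comparison would usually be written as a Bayes factor. This is a generic sketch and not a formula from the book: the symbols (data x, parameters θ_CA and θ_SA, likelihoods f and priors π) are my own notation, introduced only for illustration.

```latex
% Bayes factor of common ancestry (CA) against separate ancestry (SA).
% Each model has its own parameter and its own prior (notation assumed here).
B_{\mathrm{CA},\mathrm{SA}}(x) =
  \frac{\int f_{\mathrm{CA}}(x \mid \theta_{\mathrm{CA}})\,
             \pi_{\mathrm{CA}}(\theta_{\mathrm{CA}})\,
             \mathrm{d}\theta_{\mathrm{CA}}}
       {\int f_{\mathrm{SA}}(x \mid \theta_{\mathrm{SA}})\,
             \pi_{\mathrm{SA}}(\theta_{\mathrm{SA}})\,
             \mathrm{d}\theta_{\mathrm{SA}}}
```

Written this way, nothing forces θ_CA and θ_SA to coincide: each parameter is integrated out under the prior of its own model.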
“If there was no such common ancestor, what would alignment ever mean?” (E&E, p.291)
The theme of the missing model I have alluded to in the previous posts is also recurrent in this chapter. There are a lot of paragraphs about the choice of the representation of the difference between two species, from trait to gene sequence, and the author acknowledges that the difficulty in this choice has to do with a requirement for a more advanced theoretical representation (model) adapted to more complex data. This sounds rather obvious stated that way, but the book wanders around this point for pages! (An example is the above quote, which misses the point about sequence alignment: this is a perfectly well-defined measure of distance, common ancestor or not.) And the overall conclusion is a vague call for the principle of total evidence (which is a rephrasing of the likelihood principle). As illustrated in the section on multiple characters, the discussion is confusing without a proper model. It is only on page 300 of the book that a completely defined model for the evolution of a dichotomous trait (i.e. the simplest possible case) appears. This model is a rather crude tool, as it depends on arbitrary calibration factors like […] instead of 1 and, more importantly, on an unspecified time (as in “what time is it on the evolution clock?”). The corresponding likelihood ratio is then (under one of the selection schemes) […]
where the dependence on those factors is obvious. This illustrates the impossibility of reaching a satisfactory conclusion without first going through a statistical analysis of the problem.
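To give a feeling for how strongly such a ratio depends on rates and elapsed time, here is a minimal Python sketch. It is not Sober's model nor his formula: I assume a generic two-state continuous-time Markov chain with hypothetical rates `u` (0 to 1) and `v` (1 to 0), a hypothetical root probability `p_root` of starting in state 1, and two lineages that evolve independently for the same time `t`. The observation is that both species show the trait.

```python
import math


def p_end_in_one(start, t, u, v):
    """P(state 1 at time t | start state) for a two-state Markov chain
    with rate u for 0 -> 1 and rate v for 1 -> 0 (illustrative assumption)."""
    stationary = u / (u + v)           # long-run probability of state 1
    decay = math.exp(-(u + v) * t)     # memory of the starting state
    if start == 1:
        return stationary + (1.0 - stationary) * decay
    return stationary * (1.0 - decay)


def likelihood_ratio(t, u, v, p_root):
    """Likelihood ratio of common versus separate ancestry when both
    species show the trait (state 1). Requires 0 < p_root < 1."""
    from_one = p_end_in_one(1, t, u, v)
    from_zero = p_end_in_one(0, t, u, v)
    # Common ancestry: one shared root state, two independent lineages.
    common = p_root * from_one ** 2 + (1.0 - p_root) * from_zero ** 2
    # Separate ancestry: each lineage draws its own root state.
    separate = (p_root * from_one + (1.0 - p_root) * from_zero) ** 2
    return common / separate


# The ratio moves with the (unknown) elapsed time and the chosen rates.
for t in (0.0, 0.1, 1.0, 10.0):
    print(t, likelihood_ratio(t, u=1.0, v=1.0, p_root=0.5))
```

In this toy version the ratio never falls below one, but it shrinks towards one as `t` grows, so the strength of the evidence is entirely driven by quantities that have to be estimated or given a prior.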
“It is possible for data to discriminate among a set of hypotheses without saying anything about a proposition that is common to all the alternatives considered” (E&E, p.315)
The debate about phylogenetic tree reconstruction versus the test for common ancestry (Sections 4.7 and 4.8) lacks appeal for the very reason exposed above. The tree structure may be incorporated within the model(s) and integrated out in a Bayesian fashion to provide the marginal likelihood of the model(s). Although this seems to be an important issue, as illustrated by the controversy with Templeton, the opposition between likelihood inference and “cladistic” parsimony is not properly conducted in that, as a naïve reader, I cannot understand Sober’s presentation of the latter. This section is much more open to Bayesian processing by abstaining from the usual criticism about the lack of objectivity of the prior selection, but it entirely misses the ability of the Bayesian approach to integrate out the nuisance parameters, whether they are the tree topology (standard marginalisation) or the model index (model averaging). The debate about the limited meaning of statistical consistency makes the valid point that consistency only sheds light on the case when the hypothesised model is true, but extended consistency could have been considered as well, namely that the procedure will bring the hypothesised model as close as possible to the “true” model within the hypothesised family of models. What I gather from this final section is that cladistic parsimony tries to do without models (if not without assumptions), which seems to relate to Templeton’s views about Bayesian inference.
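For readers unfamiliar with the marginalisation I have in mind, it can be written as follows. Again, this is a generic sketch in my own notation and not something taken from the book: τ stands for the tree topology, θ for the remaining parameters (branch lengths, rates), M_k for the candidate models and Δ for any quantity of interest.

```latex
% Marginal likelihood of a model: the tree topology \tau is summed out
% and the remaining parameters \theta are integrated out.
m(x) = \sum_{\tau} \pi(\tau)
       \int f(x \mid \tau, \theta)\, \pi(\theta \mid \tau)\, \mathrm{d}\theta

% Model averaging: the model index k is itself integrated out.
p(\Delta \mid x) = \sum_{k} p(M_k \mid x)\, p(\Delta \mid x, M_k)
```

In both cases the nuisance quantity disappears from the final answer without having to be fixed at a single value.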
Again, this is certainly the most enjoyable chapter of the book from my point of view (besides the nice recap about methods of inference in Chapter 1), even though the lack of real illustrations makes it less potent than it could be. It also shows the limitation of a philosophical debate on simplistic idealisations of the real model. The book only acknowledges on page 334 that genealogical hypotheses are composite. Better late than never, but I think that an incorporation of the parameter estimation in the inferential process would not have hurt the quality of the debate.