## Review in Human Genomics

Posted in Books, Statistics on February 12, 2011 by xi'an

My review of Sober’s Evidence and Evolution: The Logic Behind the Science, first polished on the ‘Og, just got published in Human Genomics (vol. 5, number 2, pp. 130-136). This is my very first publication in this journal and I am very glad (and grateful to the book editor) to have had the opportunity to keep my review at its original seven pages in the journal. (Here is a copy on my webpage in case access to the journal is impossible.)

## Rehab!

Posted in Statistics, University life on June 24, 2010 by xi'an

Dear arXiv moderator,
I would like to ask for a reinstatement of my 1004.5074 within the stat.ME category as it has nothing to do with applied statistics. The book by Sober and hence my critical reading is about the philosophy of testing (or evidence) and hence relates to the foundations of statistics, which I think are within the stat.ME (or stat.TH?) category.
Sincerely, X.

and then yesterday I received this email

Our moderators have considered your appeal and have agreed to reclassify your article as stat.ME (Methodology) with cross-lists to stat.TH (Theory) and q-bio.PE (Populations and Evolution). No further action is required on your part.
arXiv moderation

Not that it mattered that much, but being put into the “wrong” group meant that fewer people were likely to take a look at this review. Overall, I keep being impressed by the efficiency of the arXiv service. (Maybe also because it is hosted by my alma mater, Cornell University.)

## Evidence and evolution (5)

Posted in Books, Statistics on April 29, 2010 by xi'an

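“Tout étant fait pour une fin, tout est nécessairement pour la meilleure fin. Remarquez bien que les nez ont été faits pour porter des lunettes, aussi avons-nous des lunettes.” [“Everything being made for an end, everything is necessarily for the best end. Note that noses were made to carry spectacles, and so we have spectacles.”] Voltaire, Candide, Chapitre 1.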

I am now done with my review of Sober’s Evidence and Evolution: The Logic Behind the Science. Posting about each chapter along the way helped me a lot in writing up the review over the past few days. Its conclusion is that

Evidence and Evolution is very well-written, with hardly any typos (the unbiasedness property of AIC is stated at the bottom of page 101 with the expectation symbol E on the wrong side of the equation, Figure 3.8c is used instead of Figure 3.7c on page 204, Figure 4.7 is used instead of Figure 4.8 on page 293, Simon Tavaré’s name is always spelled Taveré, vaules rather than values is repeated four times on page 339). The style is sometimes too light and often too verbose, with an abundance of analogies that I regard as sidetracking, but this makes for easier reading (except for the sentence “the key to answering the second question is that the observation that X = 1 and Y = 1 produces stronger evidence favoring CA over SA the lower the probability is that the ancestors postulated by the two hypotheses were in state 1”, on page 314, that still eludes me!). As detailed in this review, I have points of contention with the philosophical views about testing in Evidence and Evolution as well as with the methods exposed therein, but this does not detract from the appeal of reading the book. (The lack of completely worked out statistical hypotheses in realistic settings remains the major issue in my criticism of the book.) While the criticisms of the Bayesian paradigm are often shallow (like the one on page 97 ridiculing Bayesians drawing inference based on a single observation), there is nothing fundamentally wrong with the statistical foundations of the book. I therefore repeat my earlier recommendation in favour of Evidence and Evolution, Chapters 1 and (paradoxically) 5 being the easier entries. Obviously, readers familiar with Sober’s earlier papers and books will most likely find a huge overlap with those, but others will gather Sober’s viewpoints on the notion of testing hypotheses in a (mostly) unified perspective.

And, as illustrated by the above quote, I found the sentence from Voltaire’s Candide I wanted to include. Of course, this 12-page review may be overly long for the journal it was intended for, Human Genomics, in which case I will have to find another outlet for the current arXived version. But I enjoyed reading this book with a pencil and gathered enough remarks along the way to fill those twelve pages.

## Evidence and evolution (4)

Posted in Books, Statistics on April 26, 2010 by xi'an

“Darwinians would not be satisfied if all life on Earth derived from the same large slab of rock.” (E&E, p.269)

Thanks to Eyjafjallajökull, I used the three and a half hours in the train back from Marseille to finish my reading of Sober’s Evidence and Evolution: The Logic Behind the Science. The final chapter (apart from the concluding summary) is about “Common ancestry” and may be the most statistically oriented of the three chapters about evolution. This is not to say the chapter is without flaws, including in particular a certain tendency to repeat the same arguments, but this is somehow the chapter I appreciated the most. The chapter starts with a detailed analysis of how the hypothesis of common ancestry should be set, the main distinction being between one organism and several, while pointing out the confusing effect of lateral gene transfer. Inference about phylogenetic trees and the use of genetic sequences rather than simplistic traits get us closer to the true issues at stake. Another interesting feature of this chapter is the relation to Darwin’s reflections on the common origin of life on Earth through many quotes.

“If those prior probabilities are obscure, the same will be true of the posterior probabilities.” (E&E, p.277)

The statistical issue is thus of testing for a common ancestor versus separate ancestors for a set of organisms. The nature of the information contained in the data is never made precise enough to understand whether this fits the principle of total evidence stressed throughout the book. The chapter also shows a more lenient disposition towards Bayesian solutions but Section 4.3 ends up with an impossibility statement, due to the impossibility of defining an objective prior because Sober wants prior probabilities that have some authority. This is a self-defeating constraint leading to empirically well-grounded priors.

“Those propositions suffice for similarity to be evidence for common ancestry, and they have broad applicability.” (E&E, p.283)

The part about Reichenbach’s (1956) sufficient condition for a common trait to induce a likelihood ratio larger than one in favour of the common ancestor hypothesis needs to be discussed as this is the point I find the most puzzling in the chapter. Indeed, most of the nine assumptions of Reichenbach (1956) relate both models under comparison, i.e. common ancestry versus separate ancestry. This seems to me to be a weird thing to do as models under comparison should not share all of their parameters! For instance, if we build a Bayesian model to compare those models, we would use a prior distribution on each group of parameters. Having a common parameter does not make sense since we end up selecting one of the two models. I wonder if this is the result of a reluctance to have true parameters as in a regular statistical analysis. (See, e.g., the lament that “until values for adjustable parameters are specified, we cannot talk about the probability of the data under different hypotheses”, p.338.) What is striking is the reliance of the whole chapter on this unnatural set of hypotheses since it keeps resurfacing throughout the chapter. Sober writes that Propositions 1-9 are not consequences of the axioms of probability. Neither are they necessary conditions for common ancestry to have a higher likelihood than separate ancestry (p.283). Nonetheless, this creates an unnecessary bias in the perception of the problem, which may induce critics of evolution to reject the whole approach.

“If there was no such common ancestor, what would alignment ever mean?” (E&E, p.291)

The theme of the missing model I have alluded to in the previous posts is also recurrent in this chapter. There are a lot of paragraphs about the choice of the representation of the difference between two species, from trait to gene sequence, and the author acknowledges that the difficulty in this choice has to do with a requirement for a more advanced theoretical representation (model) adapted to more complex data. This sounds rather obvious stated that way but the book wanders around this point for pages! (An example is the above quote that misses the point about sequence alignment: this is a perfectly well-defined measure of distance, common ancestor or not.) And the overall conclusion is a vague call for the principle of total evidence (which is a rephrasing of the likelihood principle). As illustrated in the section on multiple characters, the discussion is confusing without a proper model. It is only on page 300 of the book that a completely defined model for the evolution of a dichotomous trait (i.e. the simplest possible case) appears. This model is a rather crude tool, as it depends on arbitrary calibration factors like $P(Z=0)=0.99$ instead of 1 and, more importantly, on an unspecified time (as in “what time is it on the evolution clock?”). The corresponding likelihood ratio is then (under one of the selection schemes)

$\dfrac{0.01b_t^2 + 0.99}{[0.01b_t+0.99]^2}$

where the dependence on those factors is obvious. This illustrates the impossibility of reaching a satisfactory conclusion without first going through a statistical analysis of the problem.
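To make this dependence concrete, here is a minimal Python sketch that evaluates the ratio displayed above over a few values of $b_t$. The function name `likelihood_ratio` and the argument `p` (standing for the 0.01 weight, so that 1 - p is the 0.99 calibration factor) are illustrative labels rather than notation from the book, and $b_t$ is assumed to be a probability in [0, 1].

```python
def likelihood_ratio(b_t, p=0.01):
    """Ratio (p*b_t^2 + (1-p)) / (p*b_t + (1-p))^2 from the display above.

    p is the 0.01 weight (an assumed name); b_t is assumed to lie in [0, 1].
    """
    if not 0.0 <= b_t <= 1.0:
        raise ValueError("b_t is assumed to be a probability")
    numerator = p * b_t ** 2 + (1.0 - p)
    denominator = (p * b_t + (1.0 - p)) ** 2
    return numerator / denominator


if __name__ == "__main__":
    # Evaluate the ratio on a small grid of values for b_t.
    for b in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"b_t = {b:4.2f}  ratio = {likelihood_ratio(b):.6f}")
```

The ratio never falls below 1 (the numerator is a second moment and the denominator the squared mean of the same two-point variable) and equals 1 at $b_t=1$. With the 0.99 calibration it stays within about one percent of 1, while another choice of `p` changes the scale of the evidence altogether.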

“It is possible for data to discriminate among a set of hypotheses without saying anything about a proposition that is common to all the alternatives considered” (E&E, p.315)

The debate about the phylogenetic tree reconstruction versus the test for common ancestry (Sections 4.7 and 4.8) lacks appeal for the very reason exposed above. The tree structure may be incorporated within the model(s) and integrated out in a Bayesian fashion to provide the marginal likelihood of the model(s). Although this seems to be an important issue, as illustrated by the controversy with Templeton, the opposition between likelihood inference and “cladistic” parsimony is not properly conducted in that, as a naïve reader, I cannot understand Sober’s presentation of the latter. This section is much more open to Bayesian processing by abstaining from the usual criticism about the lack of objectivity of the prior selection, but it entirely misses the ability of the Bayesian approach to integrate out the nuisance parameters, whether they are the tree topology (standard marginalisation) or the model index (model averaging). The debate about the limited meaning of statistical consistency makes the valid point that consistency only sheds light on the case when the hypothesised model is true, but extended consistency could have been considered as well, namely that the procedure will bring the hypothesised model as close as possible to the “true” model within the hypothesised family of models. What I gather from this final section is that cladistic parsimony tries to do without models (if not without assumptions), which seems to relate to Templeton’s views about Bayesian inference.
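As a minimal illustration of what integrating out a nuisance parameter means, the following Python sketch averages a likelihood over a discrete set of tree topologies and then turns the resulting marginal likelihoods into posterior model probabilities. Every number in it is a made-up placeholder and the function names are hypothetical; it is in no way a reconstruction of an analysis from the book.

```python
def marginal_likelihood(likelihoods, prior):
    """Sum over topologies of prior(topology) * likelihood(data | topology)."""
    if len(likelihoods) != len(prior):
        raise ValueError("one prior weight per topology is required")
    if abs(sum(prior) - 1.0) > 1e-9:
        raise ValueError("prior weights must sum to one")
    return sum(w * lik for w, lik in zip(prior, likelihoods))


def posterior_model_probs(marginals, model_prior):
    """Posterior model probabilities from marginal likelihoods (model averaging)."""
    joint = [m * w for m, w in zip(marginals, model_prior)]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    # Hypothetical likelihoods of the data under three topologies, per model.
    lik_model_a = [2e-4, 5e-5, 1e-5]
    lik_model_b = [3e-5, 3e-5, 3e-5]
    uniform = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform prior on topologies

    m_a = marginal_likelihood(lik_model_a, uniform)
    m_b = marginal_likelihood(lik_model_b, uniform)
    print("Bayes factor, A versus B:", m_a / m_b)
    print("posterior probabilities:", posterior_model_probs([m_a, m_b], [0.5, 0.5]))
```

The point of the sketch is only that the topology never needs to be fixed or estimated before comparing the models: it is summed out, and the comparison is conducted on the marginal likelihoods.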

Again, this is certainly the most enjoyable chapter of the book from my point of view (besides the nice recap about methods of inference in Chapter 1), even though the lack of real illustrations makes it less potent than it could be. It also shows the limitation of a philosophical debate on simplistic idealisations of the real model. The book only acknowledges on page 334 that genealogical hypotheses are composite. Better late than never, but I think that an incorporation of the parameter estimation in the inferential process would not have hurt the quality of the debate.

## Evidence and evolution (3)

Posted in Books, Statistics on April 17, 2010 by xi'an

“To test a theory, you need to test it against alternatives.” (E&E, p.190)

After a gruesome (!) trek through Chapter 3 of Sober’s Evidence and Evolution: The Logic Behind the Science, I am now done with this chapter entitled “Natural selection”. The chapter is difficult to read (for someone like me) in that it seems overly repetitive, using somewhat obvious arguments while missing clearcut conclusions and directions. This bent must be due to the philosophical priorities of the author but, despite opposing Brownian motion to Ornstein-Uhlenbeck processes at the beginning of the chapter (which would make for a neat parametric model comparison setting), there is no quantitative argument or illustration in this third chapter that would relate to statistics. This is unfortunate as the questions of interest (testing for natural selection versus pure drift or versus phylogenetic inertia or yet for tree structure in phylogenetics) could clearly be addressed at a numerical level as well, through the AIC criterion or through a Bayesian alternative. The aspects I found most interesting in this chapter may therefore be deemed marginalia by most readers, namely (a) the discussion of whether the outcome of a test should depend at all on the modelling assumptions (the author seems to doubt this, hence relegating Bayesian techniques to their dust-gathering shelves!), and (b) the point that parsimony is not a criterion per se.
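To give an idea of what such a numerical comparison could look like, here is a rough Python sketch opposing Brownian motion to an Ornstein-Uhlenbeck process through AIC. It rests on strong simplifying assumptions that are not in the book: a single trait path observed on a unit time grid (so that the Ornstein-Uhlenbeck process reduces to a Gaussian AR(1) model), likelihoods conditional on the first observation, and simulated data with arbitrary parameter values. All function names are hypothetical.

```python
import math
import random


def gaussian_loglik(residuals, sigma2):
    """Log-likelihood of i.i.d. N(0, sigma2) residuals."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - rss / (2.0 * sigma2)


def aic_brownian(x):
    """AIC of driftless Brownian motion on a unit time grid (1 parameter)."""
    inc = [b - a for a, b in zip(x, x[1:])]
    sigma2 = sum(d * d for d in inc) / len(inc)  # MLE of the increment variance
    return 2 * 1 - 2.0 * gaussian_loglik(inc, sigma2)


def aic_ou(x):
    """AIC of the AR(1) form x[t+1] = c + phi * x[t] + noise (3 parameters)."""
    prev, nxt = x[:-1], x[1:]
    n = len(prev)
    mean_prev, mean_nxt = sum(prev) / n, sum(nxt) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (q - mean_nxt) for p, q in zip(prev, nxt))
    phi = sxy / sxx  # least-squares slope, i.e. the conditional MLE
    c = mean_nxt - phi * mean_prev
    res = [q - c - phi * p for p, q in zip(prev, nxt)]
    sigma2 = sum(r * r for r in res) / n
    return 2 * 3 - 2.0 * gaussian_loglik(res, sigma2)


def simulate_ou(n, phi=0.5, optimum=10.0, sd=1.0, seed=0):
    """Simulated trait path pulled towards `optimum` (all values hypothetical)."""
    rng = random.Random(seed)
    x = [optimum]
    for _ in range(n - 1):
        x.append(optimum + phi * (x[-1] - optimum) + rng.gauss(0.0, sd))
    return x


if __name__ == "__main__":
    path = simulate_ou(500)
    print("AIC, Brownian motion:", round(aic_brownian(path), 1))
    print("AIC, Ornstein-Uhlenbeck:", round(aic_ou(path), 1))
```

The model with the smaller AIC is preferred. A Bayesian alternative would replace the two maximised likelihoods with marginal likelihoods under priors on the parameters of each model.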

“‘Data! Data! Data!’ he cried impatiently. ‘I cannot make bricks without clay!’” (Sherlock Holmes, The Adventure of the Copper Beeches)

About the first point, the philosophical stance of the author is not completely foolproof in that he concedes that testing hypotheses without accounting for the alternative is not acceptable. My impression is that he looks at the problem from a purely dichotomous perspective, the hypothesis or [exclusive OR] the alternative being true. This is a bit of a caricature as he integrates the issue of calibrating parameters under the different hypotheses, but there is a sort of logical discrepancy lurking in the background of the argument. Again, working out a fully Bayesian analysis of, say, a phylogenetic tree would have clarified the issue immensely! And rejecting Bayesianism (sic!) because “there is no objective basis for producing an answer” (p.239) is a wee bit limited on the epistemological side! Even though I understand that the book is not trying to debate the support for a specific evolutionary hypothesis but rather the methods used to test such hypotheses and the logic behind these, a completely worked-out example would have made my appreciation (and maybe other readers’) of Sober’s points much easier. And, again, I fail to see who could benefit from reading this chapter. A biologist will most likely integrate the arguments and illustrations provided by Sober but could leave the chapter with a feeling of frustration at the apparent lack of conclusion. (As a statistician, I fail to understand how the likelihoods repeatedly mentioned by Sober can be computed because they never involve any parameter.)

“Parsimony does not provide a justification for ignoring the data.” (E&E, p.250)

Since I am interested in general in the negative impact of the “Ockham’s razor” argument, I find the warning signals about parsimony (given in the last third of the chapter) more palatable. Parsimony being an ill-defined concept, especially from a statistical perspective (where even the dimension of the parameter space is debatable), no model selection is acceptable if only based on this argument.

“Instead of evaluating hypotheses in terms of how probable they say the data are, we evaluate them by estimating how accurately they’ll predict new data when fitted to old.” (E&E, p.229)

The chapter also presents the distinction between hypothesis testing and model selection as paramount (a point I subscribed to for a long while before being convinced of the opposite by Peter Green and Jean-Michel Marin), but I cannot get to the core of this argument. It seems Sober sees model selection through the predictive performance of the models under comparison, if the above quote is representative of his thesis. (Overall, I find the style of the chapter slightly uneven, as if the fact that some sections are adapted from earlier papers made for different levels of depth.)

Statistically speaking, this chapter also has a difficulty with the continuity assumption. To make this point more precise, I notice there is a long discussion about reaching the optimum configuration (for polar bear fur length) under the SPD hypothesis, but I think evolution happens in discontinuous moves. The case about the local minimum in Section 3.4 is characteristic of this difficulty as a “valley” on a “fitness curve” that in essence takes three possible values over the three different types of eye designs does not really constitute a bottleneck in the optimisation process. Similarly, the temporal structure of the statistical models in Sections 3.3 and 3.5 is never mentioned, even though it needs to be defined for the tests to take place. The past versus current convergence to stationarity or equilibrium and hence to optimality under the SPD hypothesis is an issue (are we there yet?!) and so is the definition of time in the very simple 2×2 Markov chain example… And given a 2×2 contingency table like

$\begin{matrix} &\text{fixed} &\text{polymorphic}\\ \text{synonymous} &17 &42 \\ \text{nonsynonymous} &7 &2\\ \end{matrix}$

testing for independence between both factors is about as standard as it gets: I thus fail to understand the lengthy and inconclusive discussion of pp.240-243.
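For the record, here is a minimal Python sketch of two textbook tests of independence applied to the table above, namely Pearson's chi-square test (without continuity correction) and Fisher's exact test. The function names are hypothetical, the implementation only covers 2×2 tables, and this is not the analysis discussed in the book.

```python
import math

TABLE = [[17, 42],  # synonymous:    fixed, polymorphic
         [7, 2]]    # nonsynonymous: fixed, polymorphic


def chi_square_independence(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (1 df)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) survival function
    return stat, p_value


def fisher_exact_two_sided(table):
    """Two-sided Fisher p-value: total mass of tables no more likely than observed."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):  # hypergeometric probability of k in the top-left cell
        return math.comb(col1, k) * math.comb(n - col1, row1 - k) / math.comb(n, row1)

    observed = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= observed * (1 + 1e-9))


if __name__ == "__main__":
    stat, p_chi2 = chi_square_independence(TABLE)
    print(f"chi-square = {stat:.3f}, p-value = {p_chi2:.4f}")
    print(f"Fisher exact p-value = {fisher_exact_two_sided(TABLE):.4f}")
```

Since one expected count is close to 3, the exact test is the safer of the two here; both p-values fall below 0.01, so independence between the two factors is rejected at the usual levels.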