A book that reached me completely unannounced for review in CHANCE is Andris Abakuks’ The Synoptic Problem and Statistics. “Unannounced” in that I had not heard of the synoptic problem until then. This problem is one of ordering and connecting the gospels in the New Testament, more precisely the “synoptic” gospels attributed to Mark, Matthew and Luke, since the fourth canonical gospel, attributed to John, is considered by experts to be posterior to those three. By considering overlaps between those texts, some statistical inference can be conducted, and the book covers (some of?) those statistical analyses for different orderings of ancestry in authorship. My overall reaction after a quick perusal of the book over breakfast (sharing bread and fish, of course!) was to wonder why no mention was made of a more global, if potentially infeasible, approach via a phylogenetic tree, considering the three (or more) gospels as current observations and tracing their unknown ancestry back, just as in population genetics. Not because ABC could then be brought into the picture. Rather because it sounds to me (and to my complete lack of expertise in this field!) more realistic to postulate that those gospels were not written by a single person. Or at a single period in time. But rather that they evolved, like genetic mutations, across copies and transmission until they reached a sort of official status.
“Given the notorious intractability of the synoptic problem and the number of different models that are still being advocated, none of them without its deficiencies in explaining the relationships between the synoptic gospels, it should not be surprising that we are unable to come up with more definitive conclusions.” (p.181)
The book by Abakuks instead goes through several modelling directions, from logistic regression using variable length Markov chains [to predict agreement between two of the three texts by regressing on earlier agreement], to hidden Markov models [representing, e.g., Matthew’s use of Mark], to various independence tests on contingency tables, sometimes bringing into the model an extra source denoted by Q. It also includes some R code for hidden Markov models. Once again, from my outsider viewpoint, this fragmented approach to the problem sounds problematic and inconclusive. And rather verbose in its extensive discussions of descriptive statistics. Not that I was expecting a sudden Monty Python-like ray of light and booming voice to disclose the truth! Or that I crave more p-values (some may be found hiding within the book). But I still wonder about the phylogeny… Especially since phylogenies are used in text authentication, as pointed out to me by Robin Ryder for Chaucer’s Canterbury Tales.
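To make the contingency-table direction a bit more concrete, here is a minimal sketch in Python of what such an independence test could look like. This is my own illustration, not the book's R code and not its data: the two 0/1 sequences are made-up toy values standing for whether each word of Mark is retained by Matthew and by Luke, and the function names are hypothetical.

```python
from math import erfc, sqrt


def contingency_table(x, y):
    """Build the 2x2 table of counts for two equal-length 0/1 sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    table = [[0, 0], [0, 0]]
    for a, b in zip(x, y):
        table[a][b] += 1
    return table


def chi2_independence(table):
    """Pearson chi-square test of independence on a 2x2 table.

    Returns the statistic and its p-value (1 degree of freedom,
    no continuity correction).
    """
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    if 0 in row or 0 in col:
        raise ValueError("every row and column needs a positive total")
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Toy, made-up indicators (NOT taken from the gospels or the book):
# 1 if the word of Mark is retained, 0 otherwise.
matthew_keeps = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0]
luke_keeps = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]

table = contingency_table(matthew_keeps, luke_keeps)
stat, p_value = chi2_independence(table)
print(table, round(stat, 3), round(p_value, 3))
```

With so few observations the chi-square approximation is of course unreliable, and treating successive words as independent draws ignores exactly the serial dependence that the Markov chain and hidden Markov models are meant to capture, so this is only a sketch of the mechanics.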