This 2016 book by Howard Wainer has been sitting (!) on my desk for quite a while and it took a long visit to Warwick to find a free spot to quickly read it and write my impressions. The subtitle is, as shown on the picture, “Distinguishing fact from fiction by learning to think like a data scientist”. With all due respect to the book, which illustrates quite pleasantly the dangers of (pseudo-)data mis- or over- (or eve under-)interpretation, and to the author, who has repeatedly emphasised those points in his books and
tribunes opinion columns, including those in CHANCE, I do not think the book teaches how to think like a data scientist. In that an arbitrary neophyte reader would not manage to handle a realistic data centric situation without deeper training. But this collection of essays, some of which were tribunes, makes for a nice reading nonetheless.
I presume that in this post-truth and alternative facts [dark] era, the notion of truthiness is familiar to most readers! It is often based on a misunderstanding or a misappropriation of data leading to dubious and unfounded conclusions. The book runs through dozens of examples (some of them quite short and mostly appealing to common sense) to show how this happens and to some extent how this can be countered. If not avoided as people will always try to bend, willingly or not, the data to their conclusion.
There are several parts and several themes in Truth or Truthiness, with different degrees of depth and novelty. The more involved part is in my opinion the one about causality, with illustrations in educational testing, psychology, and medical trials. (The illustration about fracking and the resulting impact on Oklahoma earthquakes should not be in the book, except that there exist officials publicly denying the facts. The same remark applies to the testing cheat controversy, which would be laughable had not someone ended up the victim!) The section on graphical representation and data communication is less exciting, presumably because it comes after Tufte’s books and message. I also feel the 1854 cholera map of John Snow is somewhat over-exploited, since he only drew the map after the epidemic declined. The final chapter
Don’t Try this at Home is quite anecdotal and at the same time this may the whole point, namely that in mundane questions thinking like a data scientist is feasible and leads to sometimes surprising conclusions!
“In the past a theory could get by on its beauty; in the modern world, a successful theory has to work for a living.” (p.40)
The book reads quite nicely, as a whole and a collection of pieces, from which class and talk illustrations can be borrowed. I like the “learned” tone of it, with plenty of citations and witticisms, some in Latin, Yiddish and even French. (Even though the later is somewhat inaccurate! Si ça avait pu se produire, ça avait dû se produire [p.152] would have sounded more vernacular in my Gallic opinion!) I thus enjoyed unreservedly Truth or Truthiness, for its rich style and critical message, all the more needed in the current times, and far from comparing it with a bag of potato chips as Andrew Gelman did, I would like to stress its classical tone, in the sense of being immersed in a broad and deep culture that seems to be receding fast.