truth or truthiness [book review]

Posted in Books, Kids, pictures, Statistics, University life on March 21, 2017 by xi'an

This 2016 book by Howard Wainer has been sitting (!) on my desk for quite a while and it took a long visit to Warwick to find a free spot to quickly read it and write my impressions. The subtitle is, as shown on the picture, “Distinguishing fact from fiction by learning to think like a data scientist”. With all due respect to the book, which illustrates quite pleasantly the dangers of (pseudo-)data mis- or over- (or even under-)interpretation, and to the author, who has repeatedly emphasised those points in his books and opinion columns, including those in CHANCE, I do not think the book teaches how to think like a data scientist, in that an arbitrary neophyte reader would not manage to handle a realistic data-centric situation without deeper training. But this collection of essays, some of which were opinion columns, makes for nice reading nonetheless.

I presume that in this post-truth and alternative facts [dark] era, the notion of truthiness is familiar to most readers! It is often based on a misunderstanding or a misappropriation of data leading to dubious and unfounded conclusions. The book runs through dozens of examples (some of them quite short and mostly appealing to common sense) to show how this happens and to some extent how this can be countered, if not avoided, as people will always try to bend, willingly or not, the data to their conclusion.

There are several parts and several themes in Truth or Truthiness, with different degrees of depth and novelty. The more involved part is in my opinion the one about causality, with illustrations in educational testing, psychology, and medical trials. (The illustration about fracking and the resulting impact on Oklahoma earthquakes should not be in the book, except that there exist officials publicly denying the facts. The same remark applies to the testing cheat controversy, which would be laughable had someone not ended up the victim!) The section on graphical representation and data communication is less exciting, presumably because it comes after Tufte’s books and message. I also feel the 1854 cholera map of John Snow is somewhat over-exploited, since he only drew the map after the epidemic declined. The final chapter, Don’t Try this at Home, is quite anecdotal and at the same time this may be the whole point, namely that in mundane questions thinking like a data scientist is feasible and sometimes leads to surprising conclusions!

“In the past a theory could get by on its beauty; in the modern world, a successful theory has to work for a living.” (p.40)

The book reads quite nicely, as a whole and as a collection of pieces, from which class and talk illustrations can be borrowed. I like the “learned” tone of it, with plenty of citations and witticisms, some in Latin, Yiddish and even French. (Even though the latter is somewhat inaccurate! Si ça avait pu se produire, ça avait dû se produire [p.152] would have sounded more vernacular in my Gallic opinion!) I thus enjoyed unreservedly Truth or Truthiness, for its rich style and critical message, all the more needed in the current times, and far from comparing it with a bag of potato chips as Andrew Gelman did, I would like to stress its classical tone, in the sense of being immersed in a broad and deep culture that seems to be receding fast.

stochastic project

Posted in Mountains, pictures, Travel, Wines on March 20, 2017 by xi'an


latest issue of Significance

Posted in Statistics on March 20, 2017 by xi'an

The latest issue of Significance is bursting with exciting articles and it is a shame I do not receive it any longer (not that I stopped subscribing to the RSS or the ASA, but it simply does not get delivered to my address!). For instance, there is an opinion column by Tom Nichols (from whom I borrowed this issue for the weekend!) on his recent assessment of false positives in brain imaging [which I covered in a blog entry a few months ago] when checking the cluster inference and the returned p-values. And there is the British equivalent of the Gelman et al. book cover on the seasonality of births in England and Wales, albeit without a processing of the raw data and without mention being made of the Gelmanesque analysis: the only major gap in the frequency is around Christmas and New Year, while there is a big jump around September (also there in the New York data).

A neat graph on the visits to four feeders by five species of birds. A strange figure in Perils of Perception that [which?!] French people believe 31% of the population is Muslim and that they are lagging behind many other countries in terms of statistical literacy. And a rather shallow call to Popper for running decision-making in business statistics.

a little gentleness in a world of brutes…

Posted in Kids, pictures on March 19, 2017 by xi'an

God save the Queen, the fascist Marine

Posted in Statistics on March 19, 2017 by xi'an

“In a few weeks, this political power will have been swept away by the election. But its civil servants will have to bear the weight of these illegal methods, for they are totally illegal. They are putting their own responsibility at stake. The State we want will be patriotic.” 26 Feb. 2017

“If you come to our country, do not expect to be taken care of, to receive medical treatment, or to have your children educated for free; that is over now!” 8 Dec. 2016

“Once I am in power, I will abolish marriage for all [same-sex marriage].” 18 May 2013

Jonathan Strange & Mr Norrell [BBC One]

Posted in Books, pictures, Travel on March 18, 2017 by xi'an

After discussing Jonathan Strange & Mr Norrell with David Frazier in Banff, where I spotted him reading this fabulous book, I went for a look at the series BBC One made out of this great novel. And got so hooked on it that I binge-watched the whole series of 7 episodes over three days! I am utterly impressed at the BBC investing so much into this show, rendering most of the spirit of the book and not only the magical theatrics. The complex [and nasty] personality of Mr Norrell and his petit-bourgeois quest for respectability are beautifully exposed, leading him to lie and steal and come close to murder [directly or by proxy], in a pre-Victorian and anti-Romantic urge to get away from magical things from the past, “more than 300 years ago”. Meanwhile, Jonathan Strange’s own Romantic inclinations are obvious, including the compulsory trip to Venezia [even though the BBC could only afford Croatia, it seems!]. The series actually made clear some points I had missed in the novel, presumably by rushing through it, like the substitution of Strange’s wife by the moss-oak doppelganger created by the fairy king. The enslavement of Stephen, servant of Lord Pole and once and future king, by the same fairy is also superbly rendered.

While not everything in the series is perfect, with in particular the large-scale outdoor scenes being too close to a video-game rendering (as in the battle of Waterloo that boils down to a backyard brawl!), the overall quality of the show [the Frenchmen there parlent vraiment français, with no accent!] and its adherence to the spirit of Susanna Clarke’s novel make it an example of the tradition of excellence of the BBC. (I just wonder at the perspective of a newcomer who would watch the series with no prior exposure to the book!)

how large is 9!!!!!!!!!?

Posted in Statistics on March 17, 2017 by xi'an

This may sound like an absurd question [and in some sense it is!], but it came out of a recent mathematical riddle on The Riddler, asking for the largest number one could write with ten symbols. The difficulty with this riddle is the definition of a symbol, as the collection of available symbols is a very relative concept. For instance, if one takes the symbols available on a basic pocket calculator, besides the 10 digits and the decimal point, there should be the four basic operations plus square root and square, which means that presumably 999999999² is the largest one can write. On a cell phone, there are already many more operations, for instance my phone includes the factorial operator and hence 9!!!!!!!!! is a good guess. When moving to a computer, the problem becomes somewhat meaningless, both because very few software packages handle infinite precision computing, hence very large numbers are not achievable without additional coding, and because it very much depends on the environment and on which numbers and symbols are already coded in the local language. This is illustrated by this X validated answer, this link from The Riddler, and the xkcd entry below. (The solution provided by The Riddler itself is not particularly relevant as it relies on a particular collection of symbols, which means Rado’s number BB(9999!) is also a solution within the right frame of reference.)
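
To give a rough sense of the scales involved, here is a minimal Python sketch comparing the pocket-calculator candidate with the first two factorial levels only. It assumes that 9!!!!!!!!! is read as an iterated factorial, ((9!)!)!..., rather than as a multifactorial, and the helper `log10_factorial` is a name made up for this illustration. The digit count relies on the log-gamma function, so it is a floating-point approximation rather than an exact computation.

```python
import math


def log10_factorial(n: int) -> float:
    """Return log10(n!) through the identity ln(n!) = lgamma(n + 1)."""
    return math.lgamma(n + 1) / math.log(10)


# Pocket-calculator candidate: nine 9s followed by a square (ten symbols).
calculator_guess = 999_999_999 ** 2

# First factorial level: 9! is small enough to compute exactly.
level1 = math.factorial(9)

# Second level: (9!)! is an integer with far too many digits to print
# comfortably, so we only estimate how many digits it has.
digits_level2 = math.floor(log10_factorial(level1)) + 1

print("999999999^2 =", calculator_guess)
print("9! =", level1)
print("approximate number of digits of (9!)! =", digits_level2)
```

With only two of the nine factorial signs used, the result already has well over a million digits, against 18 digits for the squared candidate. From the third level on, even the number of digits no longer fits in a standard double-precision float, which is one concrete way to see why additional coding is needed.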