Archive for popular science

selection bias [xkcd]

Posted in Books, Kids, pictures, Statistics on May 29, 2022 by xi'an

The CS detective

Posted in Books, Kids, Travel, University life on October 29, 2016 by xi'an

A few weeks ago, I received a generic email from No Starch Press promoting The CS Detective, and as I had liked their earlier Statistics Done Wrong, I requested a review copy of the book. I received it in Warwick while I was there last week, and read it on my trip back to Paris, as it is a very quick read.

“The trouble with having an open mind, of course, is that people will insist on coming along and trying to put things in it.” T. Pratchett

The idea of the book is to introduce some concepts of tree searching algorithms through a detective-cum-magic story, a very shallow story if somewhat à la Terry Pratchett. (While this reference does not appear in the book, there are enough mentions made of turtles to suspect the filiation. Even though it is turtles all the way down. Hence I could not swear Frank Runtime was 100% inspired by Sam Vimes. But it rhymes.) I cannot say I am a big fan of this approach, as the story is a hindrance rather than a help: I do not find it particularly funny or enticing, and I keep wishing for the next concept to appear to end the current chapter and its inane plot. Of course, once the story is set aside, the book contains not that much in terms of search algorithms, because they all are limited to discrete tree structures. Namely, exhaustive, binary, breadth- and depth-first, iterative deepening, and best-first search algorithms, along with the notions of arrays, queues, stacks, and heaps. This fills about 50 pages of technical vignettes found at the end of each chapter…
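To give a flavour of two of the algorithms listed above, here is a minimal sketch of breadth-first and depth-first search in Python. It is not taken from the book: the toy tree `TREE` and the function names are made up for illustration. The only difference between the two searches is the container holding the nodes still to visit, a queue (first in, first out) versus a stack (last in, first out).

```python
from collections import deque

# Hypothetical toy tree (not from the book), stored as node -> list of children.
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [],
    "E": [],
    "F": [],
}


def breadth_first(tree, root, target):
    """Explore level by level with a FIFO queue; return nodes in visiting order."""
    queue = deque([root])
    visited = []
    while queue:
        node = queue.popleft()      # oldest pending node first
        visited.append(node)
        if node == target:
            break
        queue.extend(tree[node])    # children wait behind everything already queued
    return visited


def depth_first(tree, root, target):
    """Explore one branch to the bottom with a LIFO stack before backtracking."""
    stack = [root]
    visited = []
    while stack:
        node = stack.pop()          # most recently added node first
        visited.append(node)
        if node == target:
            break
        # Push children in reverse so the leftmost child is explored first.
        stack.extend(reversed(tree[node]))
    return visited


print(breadth_first(TREE, "A", "E"))  # ['A', 'B', 'C', 'D', 'E']
print(depth_first(TREE, "A", "E"))    # ['A', 'B', 'D', 'E']
```

On this toy tree, the depth-first search reaches E after visiting four nodes while the breadth-first search needs five, but the ranking would be reversed for a target sitting near the root on the right-hand branch.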

So I end up wondering at what age this book would appeal to a young reader. Trying to remember from my own experience with summer vacation riddle and puzzle books, I would think the range 10-12 could be most appropriate, although mileage will vary. Since the author, Jeremy Kubica, animates the Computational Fairy Tales blog with stories of the same flavour, you may start by tasting and testing this approach to popular science before getting the entire book.

numbersense (book review)

Posted in Books, Statistics on August 22, 2013 by xi'an

While I got an advance reader’s copy of numbersense, Kaiser Fung’s latest book, sent to me by the publisher McGraw-Hill, I did not manage to write a review until the book had been out for two months. The title of the book is clear enough about the purpose of the author, but the subtitle “How to use Big Data to your advantage” stresses it even further. And includes the sesame “Big Data”, much more likely to appeal to the general reader than “statistics”…!

“I wouldn’t blame you if you are ready to burn this book, and vow never to talk to the lying statisticians ever again.” (p.4)

So why did it take me such a long while to compose this review?! Besides the break induced by The Accident (I took the book to the hospital but ended up reviewing R for Dummies instead!), I figure I got rather taken aback by the style and intended audience of numbersense, given that I had earlier read and enjoyed Numbers rule your world. While the book remains of interest for statisticians (and other CHANCE readers!), providing examples to use in the classroom, the statistical connection is all but invisible to the casual reader, who may well conclude that numbersense is a form of numerical common sense, or is about fighting innumeracy, rather than about modelling uncertainty through statistical models.

“In analyzing data, there is no way to avoid having theoretical assumptions (…) The world has never run out of theoreticians; in the era of Big Data, the bar of evidence is reset lower, making it tougher to tell right from wrong.” (p.11)

Overall, the intended audience of numbersense seems even further away from statistically savvy readers than Numbers rule your world. The book is divided into four sections: social data (Chap. 1 & 2), marketing data (Chap. 3-5), economic data (Chap. 6 & 7), and sport data (Chap. 8). Plus a prologue on the Simpson paradox (in marketing), involving Howard Wainer whose Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies I reviewed a while ago. The first (more marketing than social) chapter is about doctoring admission policies against GPA and LSAT scores (whatever that means!) to improve the ranking of a school. This does not sound such a major numerical issue (once the trick is uncovered) and the chapter meanders too much to my taste. The second chapter goes back to Quetelet’s impossible average man. Asking the reader to question the role of indices in definitions (like obesity). And mentioning the “significant result” bias in medical journals in passing. As well as causality. As in the previous chapter, I finished it waiting for a conclusion that never came. Chapters 3 and 4 focus on Groupon, without much of a statistical model (except maybe a second-order Simpson paradox?). Chapter 6 is about how companies like Amazon target their suggestions to customers. Not elaborating on the logit or whatever model is behind, though, and drifting aside on the breach of data secrecy by most of “those” companies. The economics chapters are more to my liking, presumably because they are more standard, covering the subtleties of unemployment and inflation (official) statistics. They fall into what I call the Gini index branch of statistics. At last, the sport chapter is about fantasy football (FF) and not about Moneyball (even though it has links, obviously). I did not go farther than a quick perusal of the chapter as I did not understand (most of) the point of the chapter (or of playing FF). For instance, the conclusion seemed quite distanced from the actual story…

“today’s computers do not understand languages. All they do is match text: they can tell me whether the words “empirical Bayes model” are found on a specific Web page.” (p. 209)

The epilogue is of a different nature as it describes two examples of the tasks undertaken by Kaiser Fung as a data analyst. A nasty data transfer. And a manual classification of some Google queries. This may be the part of numbersense that I enjoyed the most. Again, let me stress I have no scientific complaint about the book: it just sounds too low-tech for my taste. And I find it does not help readers to go beyond the first level of scepticism about raw and processed data, because they are not data analysts.

when the Earth was flat

Posted in Books on January 22, 2013 by xi'an

I received yet another popular science book to review (for Significance), When the Earth was flat by Graeme Donald. The subtitle is “All the bits of Science we got wrong”, which is both very ambitious (“All”, really?!) and modest (in that most scientific theories are approximations waiting to be invalidated and improved by the next theory). (I wrote this review during my trip to Gainesville, maybe too quickly!)

The themes processed and debunked in this book are wide-ranging. In fact they do not necessarily fall under my definition of science. They often are related to commercial swindles and political agendas loosely based on plainly wrong scientific theories. The book is thus more about the uses of (poor) science than about Science itself.

Numbers rule your world

Posted in Books, Statistics on February 22, 2010 by xi'an

Andrew Gelman gave me a copy of the recent book Numbers rule your world by Kaiser Fung, along with the comment that it was a nice book but not for us. I spent my “lazy Sunday” morning reading the book at the breakfast table and agree with Andrew on his assessment (waiting for the incoming blog review!). Numbers rule your world is unlikely to bring enlightenment to professional or academic statisticians, but it provides a nice and soft introduction to the use of statistics in everyday life, to the point that I would encourage my second and third year students to read it. It covers a few topics that are central to Statistics via ten newspaper-ised stories that make for a very light read, but nonetheless make the point. The themes in Numbers rule your world are

  • variability matters more than average, as illustrated by queuing phenomena;
  • correlation is not causation, but is often good enough to uncover patterns, as illustrated by epidemiology and credit scoring;
  • Simpson’s paradox accounts for apparent bias in group differences, as illustrated by SAT score differences between black students and white students;
  • false positives and false negatives have different impacts on the error (here comes Bayes’ theorem!), depending on population sizes and settings, as illustrated by the (great!) case of cheating athletes and polygraph tests (with a reference to Steve Fienberg’s work);
  • extreme events may exhibit causes, or not, as illustrated by a cheating lottery case (involving Jeff Rosenthal as the expert, not the cheater!) and a series of air crashes.
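As a quick illustration of the false positive theme above, here is a minimal Python sketch of Bayes’ theorem applied to a screening test. The numbers are invented for the sake of the example and do not come from the book: they only show how the same test can be trusted or not depending on the proportion of actual cheaters in the tested population.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(cheater | positive test), by Bayes' theorem.

    prevalence  : P(cheater) in the tested population
    sensitivity : P(positive | cheater)
    specificity : P(negative | not a cheater)
    """
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical test with 90% sensitivity and 90% specificity (made-up figures).
for prevalence in (0.10, 0.01):
    ppv = positive_predictive_value(prevalence, sensitivity=0.9, specificity=0.9)
    print(f"prevalence {prevalence:.0%}: P(cheater | positive) = {ppv:.3f}")
```

With these made-up figures, a positive result means a one-in-two chance of facing an actual cheater when 10% of the population cheats, and only about one in twelve when 1% does, even though the test itself has not changed.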

The overall tone of Numbers rule your world is pleasant and engaging, at the other end of the stylistic spectrum from Taleb’s Black Swan. Fung’s point is obviously the opposite of Taleb’s: he is showing the reader how well statistical modelling can account for apparently paradoxical behaviour. Fung is also adopting a very neutral tone, again a major change from Taleb, maybe being even too positive (the only mention made of the current housing crisis in the pages Numbers rule your world dedicates to credit scoring comes in the conclusion, pp. 176-7). Now, in terms of novelty, I cannot judge the amount of innovation when compared with (numerous) other popular science books on the topic. For instance, I think Jeff Rosenthal’s Struck by Lightning brings a rather deeper perspective, but maybe thus restricts the readership further…
