Archive for Florence Nightingale

AIQ [book review]

Posted in Books, Statistics on January 11, 2019 by xi'an

AIQ was my Christmas day read, which I mostly read while the rest of the household was still sleeping. The book, written by two Bayesians, Nick Polson and James Scott, was published before the ISBA meeting last year, but I only bought it on my last trip to Warwick [as a Xmas present]. This is a pleasant book to read, especially while drinking tea by the fire!, well-written and full of facts and anecdotes I did not know or had forgotten (more below). Intended for a general audience, it is also quite light, from a technical side, rather obviously, but also from a philosophical side. While strongly positive about the potential of AIs for the general good, it cannot be seen as an antidote to the doomlike Superintelligence by Nick Bostrom or the more factual Weapons of Math Destruction by Cathy O’Neil. (Both commented on the ‘Og.)

Indeed, I find the book quite benevolent and maybe a wee bit too rosy in its assessment of AIs, and the discussion on how Facebook and Russian intervention may have significantly helped to turn the White House Orange is missing [imho] the viral nature of the game, when endless loops of highly targeted posts can cut people from the most basic common sense. While the authors are “optimistic that, given the chance, people can be smart enough”, I do reflect on the sheer fact that the hoax that Hillary Clinton was involved in a child sex ring was ever considered seriously by people. To the point of someone shooting at the pizza restaurant. And I hence am much less optimistic about the ability of a large enough portion of the population, not even the majority, to keep a critical distance from the message carried by AI-driven media. Similarly, while Nick and James point out (rather late in the book) that big data (meaning large data) is not necessarily good data, for being unrepresentative of the population at large, they do not propose (in the book) highly convincing solutions to battle bias in existing and incoming AIs. Leading to a global worry that AIs may do well for a majority of the population and discriminate against a minority by the same reasoning. As described in Cathy O’Neil’s book, and elsewhere, proprietary software does not even have to explain why it discriminates. More globally, the business school environment of the authors may have prevented them from stating a worry on the massive power grab by the AI-based companies, which genetically grow with little interest in democracy and states, as shown (again) by the recent election or their systematic fiscal optimisation. Or by the massive recourse to machine learning by Chinese authorities towards a social credit system grade for all citizens.

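“La rage de vouloir conclure est une des manies les plus funestes et les plus stériles qui appartiennent à l’humanité. Chaque religion et chaque philosophie a prétendu avoir Dieu à elle, toiser l’infini et connaître la recette du bonheur.” Gustave Flaubert [“The rage for wanting to conclude is one of the most fatal and most sterile manias that belong to humanity. Every religion and every philosophy has claimed to have God to itself, to size up the infinite, and to know the recipe for happiness.”]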

I did not know about Henrietta Leavitt’s prediction rule for pulsating stars, behind Hubble’s discovery, which sounds like an astronomy dual to Rosalind Franklin’s DNA contribution. The use of Bayes’ rule for locating lost vessels is also found in The Theory That Would Not Die. Although I would have also mentioned its failure in locating Malaysia Airlines Flight 370. I had also never heard the great expression of “model rust”. Nor the above quote from Flaubert. It seems I have recently spotted the story on how a 180° switch in perspective on language understanding by machines brought the massive improvement that we witness today. But I cannot remember where. And I have also read about Newton missing the boat on the precision of the coinage accuracy (was it in Bryson’s book on the Royal Society?!), but with less neutral views on the role of Newton in the matter, as the Laplace of England would have benefited from keeping the lax measures of assessment.

Great to see friendly figures like Luke Bornn and Katherine Heller appearing in the pages. Luke for his work on the statistical analysis of basketball games, Katherine for her work on predictive analytics in medicine. Reflecting on the missed opportunities represented by the accumulation of data on any patient throughout their life that is as grossly ignored nowadays as it was at Nightingale’s time. The message of the chapter [on “The Lady with the Lamp”] may again be somewhat over-optimistic: while AI and health companies see clear incentives in developing more encompassing prediction and diagnostic techniques, this will only benefit patients who can afford the ensuing care. Which, given the state of health care systems in the most developed countries, is a decreasing proportion. Not to mention the less developed countries.

Overall, a nice read for the general public, de-dramatising the rise of the machines!, and mixing statistics and machine learning to explain the (human) intelligence behind the AIs. Nothing on the technical side, to be sure, but this was not the intention of the authors.

The Seven Pillars of Statistical Wisdom [book review]

Posted in Books, pictures, Statistics, University life on June 10, 2017 by xi'an

I remember quite well attending the ASA Presidential address of Stephen Stigler at JSM 2014, Boston, on the seven pillars of statistical wisdom. In connection with T.E. Lawrence’s 1926 book. Itself in connection with Proverbs IX:1. Unfortunately wrongly translated as seven pillars rather than seven sages.

As pointed out in the Acknowledgements section, the book came prior to the address by several years. I found it immensely enjoyable, first for putting the field in a (historical and) coherent perspective through those seven pillars, second for exposing new facts and curios about the history of statistics, third because of a literary style one would wish to see more often in scholarly texts and of a most pleasant design (and the list of reasons could go on for quite a while, one being the several references to Jorge Luis Borges!). But the main reason is to highlight the unified nature of Statistics and the reasons why it does not constitute a subfield of either Mathematics or Computer Science. In these days where centrifugal forces threaten to split the field into seven or more disciplines, the message is welcome and urgent.

Here are Stephen’s pillars (some comments being already there in the post I wrote after the address):

  1. aggregation, which leads to gaining information by throwing away information, aka the sufficiency principle. One (of several) remarkable stories in this section is the attempt by Francis Galton, never lacking in imagination, to visualise the average man or woman by superimposing the pictures of several people of a given group. In 1870!
  2. information accumulating at the √n rate, aka precision of statistical estimates, aka CLT confidence [quoting de Moivre at the core of this discovery]. Another nice story is Newton’s wardenship of the English Mint, with musings about [his] potentially exploiting this concentration to cheat the Mint and remain undetected!
  3. likelihood as the right calibration of the amount of information brought by a dataset [including Bayes’ essay as an answer to Hume and Laplace’s tests] and by Fisher in possibly the most impressive single-handed advance in our field;
  4. intercomparison [i.e. scaling procedures from variability within the data, sample variation], from Student’s [a.k.a., Gosset’s] t-test, better understood and advertised by Fisher than by the author, and eventually leading to the bootstrap;
  5. regression [linked with Darwin’s evolution of species, albeit paradoxically, as Darwin claimed to have faith in nothing but the irrelevant Rule of Three, a challenging consequence of this theory being an unobserved increase in trait variability across generations] exposed by Darwin’s cousin Galton [with a detailed and exhilarating entry on the quincunx!] as conditional expectation, hence as a true Bayesian tool, the Bayesian approach being more specifically addressed in (on?) this pillar;
  6. design of experiments [re-enters Fisher, with his revolutionary vision of changing all factors in Latin square designs], with a fascinating insert on the 18th Century French Loterie, which by 1811, i.e., during the Napoleonic wars, provided 4% of the national budget!;
  7. residuals which again relate to Darwin, Laplace, but also Yule’s first multiple regression (in 1899), Fisher’s introduction of parametric models, and Pearson’s χ² test. Plus Nightingale’s diagrams that never cease to impress me.
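As an aside on the second pillar, here is a minimal sketch of the √n rate, not taken from the book: it assumes Uniform(0, 1) observations, and the function name, replication count, and seed are hypothetical choices made for illustration only. It compares the simulated spread of the sample mean with the theoretical value σ/√n, where σ² = 1/12 for this distribution.

```python
import math
import random
import statistics


def sd_of_sample_mean(n, replications=2000, seed=0):
    """Empirical standard deviation of the mean of n Uniform(0, 1) draws.

    The uniform distribution, replication count, and seed are illustrative
    assumptions, not anything prescribed by the book under review.
    """
    rng = random.Random(seed)
    means = [
        sum(rng.random() for _ in range(n)) / n
        for _ in range(replications)
    ]
    return statistics.stdev(means)


if __name__ == "__main__":
    sigma = math.sqrt(1 / 12)  # standard deviation of a Uniform(0, 1) draw
    for n in (25, 100, 400):
        # Simulated spread next to the theoretical sigma / sqrt(n).
        print(n, round(sd_of_sample_mean(n), 4), round(sigma / math.sqrt(n), 4))
```

Under these assumptions, quadrupling the sample size should roughly halve the spread of the estimate, which is the sense in which information accumulates at the √n rate rather than at the rate n.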

The conclusion of the book revisits the seven pillars to ascertain the nature and potential need for an eighth pillar. It is somewhat pessimistic, at least my reading of it was, as it cannot (and presumably does not want to) produce any direction about this new pillar and hence about the capacity of the field of statistics to handle in-coming challenges and competition. With some amount of exaggeration (!) I do hope the analogy of the seven pillars that raises in me the image of the beautiful ruins of a Greek temple atop a Sicilian hill, in the setting sun, with little known about its original purpose, remains a mere analogy and does not extend to predict the future of the field! By its very nature, this wonderful book is about foundations of Statistics and therefore much more set in the past and on past advances than on the present, but those foundations need to move, grow, and be nurtured if the field is not to become a field of ruins, a methodology of the past!

in the time of cholera

Posted in Books, Kids, pictures, Travel on April 6, 2014 by xi'an

Medical illuminations [book review]

Posted in Books, pictures, Statistics on September 27, 2013 by xi'an

Howard Wainer wrote another book, about to be published by Oxford University Press, called Medical Illuminations. (The book is announced for January 2 on amazon. A great New Year gift to be sure!) While I attended WSC 2013 in Hong Kong and then again at the RSS Annual Conference in Newcastle, I saw a preliminary copy of the book and asked the representative of OUP if I could get a copy for CHANCE (by any chance?!)… And they kindly sent me a copy the next day!

 “This is an odd book (…) gallop[ing] off in all directions at once.” (p.152)

As can be seen from the cover, which reproduces the great da Vinci’s notebook page above (and seen also from the title where illuminations flirts with illuminated [manuscript]), the book focuses on visualisation of medical data to “improve healthcare”. Its other themes are using evidence and statistical thinking towards the same goal. Since I was most impressed by the graphical part, I first thought of entitling the post as “Howard does his Tufte” (before wondering at the appropriateness of such a title)!

“As hard as this may be to believe, this display is not notably worse than many of the others contained in this remarkable volume.” (p.78)

In fact, this first section is very much related to CHANCE in that a large sequence of graphics were submitted by CHANCE readers when Howard Wainer launched a competition in the magazine for improving upon a Nightingale-like representation by Burtin of antibiotics efficiency. It starts from an administrative ruling that the New York State Health Department had to publish cancer maps overlaid with potentially hazardous sites without any (interpretation) buffer. From there, Wainer shows how the best as well as the worst can be made of graphical representations of statistical data. It reproduces (with due mention) Tufte’s selection of Minard’s rendering of the Napoleonic Russian campaign as the best graph ever… The corresponding chapters of the book keep their focus on medical data, with some commentaries on the graphical quality of the 2008 National Healthcare Quality Report (ans.: could do better!). While this is well-done and with a significant message, I would still favour Tufte for teaching data users to present their findings in the most effective way. An interesting final chapter for the section is about “controlling creativity” where Howard Wainer follows in the steps of John Tukey about the Atlas of United States Mortality. And then shows a perfectly incomprehensible chart taken from Understanding USA, a not very premonitory title… (Besides Howard’s conclusion quoted above, you should also read the one-star comments on amazon!)

“Of course, it is impossible to underestimate the graphical skills of the mass media.” (p.164)

Section II is about a better use of statistics and of communicating those statistics towards improving healthcare, from fighting diabetes, to picking the right treatment for hip fractures (from an X-ray), to re-evaluating detection tests (for breast and prostate cancers) as possibly very inefficient, and to briefly wondering about accelerated testing. And Section III tries to explain why progress (by applying the previous recommendations) has not been more steady. It starts with a story about the use of check-lists in intensive care and the dramatic impact on their effectiveness against infections. (The story hit home as I lost my thumb due to an infection while in intensive care! Maybe a check-list would have helped. Maybe.) The next chapter contrasts the lack of progress in using check-lists with the adoption of the Korean alphabet in Korea, a wee unrelated example given the focus of the book. (Overall, I find most of the final chapters on the weak side of the book.)

This is indeed an odd book, with a lot of clever remarks and useful insights, but not so much with a driving line that would have made Wainer’s Medical Illuminations more than the sum of its components. Each section and most chapters (!) contain sensible recommendations for improving the presentation and exploitation of medical data towards practitioners and patients. I however wonder how much the book can impact the current state of affairs, like producing better tools for monitoring one’s own diabetes. So, in the end, I recommend the reading of Medical Illuminations as a very pleasant moment, from which examples and anecdotes can be borrowed for courses and friendly discussions. For non-statisticians, it is certainly a worthy entry on the relevance of statistical processing of (raw) data.