Archive for David Spiegelhalter

Xs Xplain’d

Posted in Statistics on January 31, 2021 by xi'an

David Spiegelhalter is starting a column in The Guardian about COVID-19, the first installment being about excess death statistics. He argues, rightly, that it is “fairer to look at what has happened to the total number of deaths”, since this is an objective quantity (in countries with trustworthy death statistics). The discussion of how many of the excess deaths can be attributed to the pandemic is somewhat confusing, though, as little can be said with enough confidence, between the positive impacts (flu deaths have plummeted, 30% fewer traffic deaths in France, &tc.) and the negative ones (stress, harsher economic or social conditions, &tc.) A worthy warning: the deficit in “other” deaths during the second wave is partly due to the extra deaths during the first wave, esp. among fragile and elderly persons.
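As a side note, the excess-death computation itself is elementary: observed deaths minus an expected baseline, typically the average of the same period over previous years. A minimal sketch, with entirely invented weekly figures:

```python
# Illustrative only: excess deaths = observed deaths minus an expected
# baseline (here, a hypothetical average of the same weeks in 2015-2019).
# All numbers below are made up for the sake of the example.
baseline_weekly = [10200, 10150, 10300, 10250]  # expected deaths per week
observed_weekly = [10400, 12900, 14100, 13200]  # observed deaths per week

excess = [obs - base for obs, base in zip(observed_weekly, baseline_weekly)]
total_excess = sum(excess)
print(excess)        # per-week excess deaths: [200, 2750, 3800, 2950]
print(total_excess)  # cumulative excess over the period: 9700
```

The appeal of this quantity, as the column argues, is that the subtraction requires no attribution of individual deaths to any cause; only the baseline choice involves modelling judgment.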

politics coming [too close to] statistics [or the reverse]

Posted in Books, pictures, Statistics, University life on May 9, 2020 by xi'an

On 30 April, David Spiegelhalter wrote an opinion column in The Guardian, Coronavirus deaths: how does Britain compare with other countries?, where he pointed out the difficulty, even “for a bean-counting statistician to count deaths”, as the reported figures are undercounts, and stated that “many feel that excess deaths give a truer picture of the impact of an epidemic”. Which, as an aside, I indeed believe to be more objective material, as also reported by INSEE and INED in France.

“…my cold, statistical approach is to wait until the end of the year, and the years after that, when we can count the excess deaths. Until then, this grim contest won’t produce any league tables we can rely on.” D. Spiegelhalter

My understanding of the column is that the quick accumulation of raw numbers, even for deaths, and their use in comparing procedures and countries does not help in understanding the impact of policies and of actions and reactions from a week earlier. Starting with the delays in reporting death certificates, as again illustrated by the ten-day lag in the INSEE reports. And accounting for covariates such as population density, economic and health indicators. (The graph below for instance relies on deaths so far attributed to COVID-19 rather than on excess deaths, while these attributions depend on each country's policy and official statistical capacities.)

“Polite request to PM and others: please stop using my Guardian article to claim we cannot make any international comparisons yet. I refer only to detailed league tables—of course we should now use other countries to try and learn why our numbers are high.” D. Spiegelhalter

However, when on 6 May Boris Johnson used this Guardian article during Prime Minister's Questions in the UK Parliament to defuse a question from the Labour leader, Keir Starmer, David Spiegelhalter reacted with the above tweet, the point being that even with poor and undercounted data the total number of cases is much worse than predicted by the earlier models and deadlier than in neighbouring countries. Anyway, three other fellow statisticians, Phil Brown, Jim Smith (Warwick), and Henry Wynn, also reacted to David's column by complaining about the lack of statistical modelling behind it and the fatalistic message it carries, advocating for model-based decision-making, which would be fine if the data were not so unreliable… or if the proposed models were equipped with uncertainty bumpers accounting for misspecification and erroneous data.

RSS honours recipients for 2020

Posted in Statistics on March 16, 2020 by xi'an

Just read the news that my friend [and co-author] Arnaud Doucet (Oxford) is the winner of the 2020 Guy Medal in Silver from the Royal Statistical Society. I was also pleased to learn about David Spiegelhalter‘s Guy Medal in Gold (I first met David at the fourth Valencia Bayesian meeting in 1991, where he had a poster on the very early stages of BUGS) and Byron Morgan‘s Barnett Award for his indeed remarkable work on statistical ecology and in particular Bayesian capture-recapture models. Congrats to all six recipients!

we have never been unable to develop a reliable predictive model

Posted in Statistics on November 10, 2019 by xi'an

An alarming entry in The Guardian about the huge proportion of councils in the UK using machine-learning software to allocate benefits, detect child abuse, or flag claims fraud. And relying blindly on the outcome of such software, despite its well-documented lack of reliability, uncertainty assessment, and warnings. Blindly in the sense that the impact of the (implemented) decisions was not even reviewed, even though a fraction of the councils is not considering renewing the contracts. With the appalling statement of the CEO of one software company reported in the title. The company further blames the lack of accessibility [to itself] of the data used by the councils for the impossibility [for the company] of providing risk factors and identifying bias, in an unbelievable newspeak inversion… As pointed out by David Spiegelhalter in the article, the openness should go the other way, namely that the algorithms behind the suggestions (read: decisions) should be available, to understand why these decisions were made. (A whole series of Guardian articles relates to this as well, under the heading “Automating poverty”.)

Statistics and Health Care Fraud & Measuring Crime [ASA book reviews]

Posted in Books, Statistics on May 7, 2019 by xi'an

From the recently started ASA book series on statistical reasoning in science and society (of which I already reviewed a sequel to The Lady Tasting Tea), a short book, Statistics and Health Care Fraud, which I read at the doctor's while waiting for my appointment, with no chance of cheating! While it made me realise that there is a significant amount of health care fraud in the US, which I had never thought about before (!), with possibly specific statistical features to the problem, besides the use of extreme value theory, I did not find much insight there on the techniques used to detect these frauds, besides the accumulation of Florida and Texas examples. As such this is a very light introduction to the topic, whose intended audience remains unclear to me. It stops short of making a case for statistics and modelling against more machine-learning options. And does not seem to mention false positives… That is, the inevitable occurrence of some doctors or hospitals being above the median costs! (A point I remember David Spiegelhalter making a long while ago, during a memorable French statistical meeting in Pau.) The book also illustrates the use of a free auditing software called Rat-stats for multistage sampling, which apparently does not go beyond selecting claims at random according to their amount. Without learning from past data. (I also wonder if criminals can reduce their chances of being caught by using this software.)
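For illustration only (the book does not detail Rat-stats' actual procedure, and none of the figures below come from it), here is what "selecting claims at random according to their amount" might look like as a stratified draw, with invented claim data and arbitrary strata bounds:

```python
import random

# Hedged sketch of an audit-style draw: stratify claims by dollar amount,
# then take a simple random sample within each stratum, sampling more
# heavily from high-value claims. All data are synthetic.
random.seed(42)
claims = [{"id": i, "amount": random.uniform(10, 5000)} for i in range(1000)]

strata = {"low": [], "mid": [], "high": []}
for c in claims:
    if c["amount"] < 500:
        strata["low"].append(c)
    elif c["amount"] < 2000:
        strata["mid"].append(c)
    else:
        strata["high"].append(c)

# larger sample fractions where the money is
sample = (random.sample(strata["low"], 10)
          + random.sample(strata["mid"], 20)
          + random.sample(strata["high"], 30))
print(len(sample))  # 60 claims selected for audit
```

The limitation noted in the review is visible here: the draw uses only the claim amounts, never past audit outcomes, so nothing is learned from previously detected fraud.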

A second book on the “same” topic!, Measuring Crime, I read, not waiting at the police station, but while flying to Venezia. As indicated by the title, this is about measuring crime, with a lot of emphasis on surveys and censuses and the potential measurement errors at different levels of surveying or censusing… Again very little on statistical methodology, apart from questioning the data, the mode of surveying, crossing different sources, and establishing the impact of the way questions are stated, but also little on bias and the impact of policing and crime-prevention AIs, as discussed in Weapons of Math Destruction and in some of Kristin Lum's papers. Except for the almost obligatory reference to Minority Report. The book also concludes with a history chapter centred on Edith Abbott laying the groundwork for serious crime data collection in the 1920s.

[And the usual disclaimer applies, namely that this bicephalic review is likely to appear later in CHANCE, in my book reviews column.]