Archive for official statistics

counting COVID-19 deaths (or not)

Posted in Statistics on September 7, 2020 by xi'an

Two COVID-19 articles in the recent issue of Nature relate to data-gathering issues. One is on the difficulty of separating direct COVID deaths from indirect ones within the excess deaths, which, "to many scientists, it's the most robust way to gauge the impact of the pandemic" (a position I support). Indeed, the COVID pandemic reduced people's access to health care, both because health structures were overwhelmed and because people were scared of catching the virus when visiting these structures. The article [by Giuliana Viglione] supports the direct exploitation of death certificates to improve the separation, quoting Natalie Dean from the University of Florida in Gainesville, although this creates a strong lag in the reporting and hence in health policy decisions. (Assuming the overall death reporting is to be trusted, which is not the case for all countries.)
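As an aside, the excess-deaths measure the article favours is plain arithmetic once a baseline is agreed upon: observed deaths minus the deaths expected for the same period. A minimal sketch, with entirely made-up weekly counts and with the baseline taken (by assumption) as a five-year average:

```python
# Hypothetical deaths in week 12 of the five previous years
# (illustrative numbers only, not actual data)
baseline_years = {
    2015: 11800, 2016: 12100, 2017: 11950, 2018: 12300, 2019: 12050,
}
observed_2020 = 16400  # hypothetical deaths reported for week 12 of 2020

# Expected deaths: average over the baseline years
expected = sum(baseline_years.values()) / len(baseline_years)

# Excess deaths capture both direct COVID deaths and indirect ones
# (overwhelmed hospitals, avoided care), without relying on the cause
# attribution written on death certificates
excess = observed_2020 - expected
print(f"expected: {expected:.0f}, excess: {excess:.0f}")
```

The hard part, of course, is not this subtraction but the choice of baseline and the reporting delays discussed above.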

“This long-standing neglect has been exacerbated by the lack of national leadership during the pandemic.”

The other article is about the reasons why the COVID-19 crisis in the US is compounded by a COVID-19 data crisis. It mentions "political meddling, privacy concerns and years of neglect of public-health surveillance systems" as some of the sources of unreliable data on the pandemic's extent and evolution. Hardly any contact tracing (as opposed to South Korea or Vietnam), a tangle of local, state, and federal structures, data diverted, and hence delayed (or worse), to a new system launched by the US Department of Health and Human Services (HHS) at an ill-spent $10 million. And data often shared (or lost) by fax! "Lack of leadership," to state the obvious…

updated (over)mortality curves

Posted in Statistics on May 26, 2020 by xi'an

politics coming [too close to] statistics [or the reverse]

Posted in Books, pictures, Statistics, University life on May 9, 2020 by xi'an

On 30 April, David Spiegelhalter wrote an opinion column in The Guardian, Coronavirus deaths: how does Britain compare with other countries?, where he pointed out the difficulty, even "for a bean-counting statistician to count deaths", as the reported figures are undercounts, and stated that "many feel that excess deaths give a truer picture of the impact of an epidemic". Which, as an aside, I indeed believe is more objective material, as also reported by INSEE and INED in France.

“…my cold, statistical approach is to wait until the end of the year, and the years after that, when we can count the excess deaths. Until then, this grim contest won’t produce any league tables we can rely on.” D. Spiegelhalter

My understanding of the tribune is that the quick accumulation of raw numbers, even for deaths, and their use in comparing procedures and countries, do not help in understanding the impact of policies and of the actions and reactions taken a week earlier. Starting with the delays in reporting death certificates, as again illustrated by the ten-day lag in the INSEE reports, and with the need to account for covariates such as population density and economic and health indicators. (The graph below, for instance, relies on deaths so far attributed to COVID-19 rather than on excess deaths, while these attributions depend on each country's policy and official statistics capacities.)

“Polite request to PM and others: please stop using my Guardian article to claim we cannot make any international comparisons yet. I refer only to detailed league tables—of course we should now use other countries to try and learn why our numbers are high.” D. Spiegelhalter

However, when on 6 May Boris Johnson used this Guardian article during Prime Minister's Questions in the UK Parliament to defuse a question from the Labour leader, Keir Starmer, David Spiegelhalter reacted with the above tweet, pointing out that even with poor and undercounted data, the total number of cases is much worse than predicted by the earlier models and deadlier than in neighbouring countries. Anyway, three other fellow statisticians, Phil Brown, Jim Smith (Warwick), and Henry Wynn, also reacted to David's tribune, complaining about the lack of statistical modelling behind it and the fatalistic message it carries, and advocating for model-based decision-making, which would be fine if the data were not so unreliable… or if the proposed models were equipped with uncertainty bumpers accounting for misspecification and erroneous data.

the other end of statistics

Posted in Books, pictures, Statistics on February 8, 2017 by xi'an

A coincidence [or not] saw very similar papers appear in Le Monde and The Guardian within days. I already reported on the Doomsday tone of The Guardian tribune. The point of the other paper is essentially the same, namely that the public has lost trust in quantitative arguments: from the explosion of statistical entries in political debates; to the general defiance towards experts, media, government, and parties, including the Institute of Official Statistics (INSEE); to a feeling of disconnection between statistical entities and the daily problems of the average citizen; to the lack of guidance and warnings in the publication of such statistics; to the rejection of anything technocratic… With the missing addendum that politicians and governments too readily correlate good figures with their own policies and poor ones with their opponents'. (Just no blame for big data analytics in this case.)

the end of statistics [not!]

Posted in Statistics on January 31, 2017 by xi'an

Last week I spotted this tribune in The Guardian, with the witty title of statistics losing its power, and sort of over-reacted by trying to gather enough momentum from colleagues towards writing a counter-column. After a few days of decantation and a few more readings (reads?) of the tribune, I cooled down towards a more lenient perspective, even though I still dislike the [catastrophic and journalistic] title. The paper is actually mostly right (!), from its historical recap of the evolution of (official) statistics across centuries, to the different nature of "big data" statistics. (The author is "William Davies, a sociologist and political economist. His books include The Limits of Neoliberalism and The Happiness Industry.")

“Despite these criticisms, the aspiration to depict a society in its entirety, and to do so in an objective fashion, has meant that various progressive ideals have been attached to statistics.”

A central point is that public opinion has less confidence in (official) statistics than it used to. (Warning: major understatement here!) For many reasons: from numbers being used to support any argument and its opposite; to statistics (and statisticians) being associated with experts, found at every corner of the news and media, hence with the "elite" arch-enemy; to a growing innumeracy of both the general public and the said "elites" (like this "expert" in a debate about the 15th anniversary of the Euro currency on the French NPR last week, equating a rise from 2.4 Francs to 6.5 Francs with 700%…), favouring rhetoric over facts; to a disintegration of the social structure that elevates one's community over others and dismisses arguments from those others, especially those addressed at the entire society. The current debate (and the very fact there can even be a debate about it!) about post-truths and alternative facts is a sad illustration of this regression in the public discourse. The overall perspective in the tribune is that of a sociologist on statistics, but there is nothing to strongly object to.
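The innumeracy example above takes two lines of arithmetic to debunk: going from 2.4 to 6.5 francs is a factor of about 2.7, i.e. roughly a 171% increase, nowhere near 700% (which would mean multiplying by 8). A quick sanity check:

```python
old, new = 2.4, 6.5  # francs, as quoted in the debate

factor = new / old                         # about 2.71
percent_increase = (new / old - 1) * 100   # about 171%

# A 700% increase would instead mean multiplying by 8:
wrong_claim = old * (1 + 700 / 100)        # 19.2 francs, not 6.5
print(f"factor {factor:.2f}, increase {percent_increase:.0f}%, "
      f"700% would give {wrong_claim:.1f} francs")
```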

“These data analysts are often physicists or mathematicians, whose skills are not developed for the study of society at all.”

The second part of the paper is about the perceived shift from (official) statistics to another, much more dangerous, type of data analysis. This is not a new view on the field, as shown by Weapons of Math Destruction. I tend to disagree with the perception that data handled by private companies for private purposes is inherently evil. The reticence in trusting the conclusions drawn from such datasets also extends to publicly available datasets, and is not primarily linked to the lack of reproducibility of such analyses (which would be a perfectly rational argument!). Nor is it due to physicists or mathematicians running these analyses instead of quantitative sociologists! The roots of the mistrust are rather to be found in an anti-scientism that has been growing over the past decades, paradoxically within an equally growing technological society fuelled by scientific advances. Hence, calling for a governmental office of big data or some similar institution is very unlikely to solve the issue. I do not know what could, actually, but continuing to develop better statistical methodology cannot hurt!