Archive for epidemiology

Sousaphonic graph!

Posted in Books, pictures, Statistics on January 17, 2022 by xi'an

the limits of R

Posted in Books, pictures, R, Statistics on August 10, 2020 by xi'an

It has been repeated many times on many platforms: the R (or R₀) number is not a great summary of the COVID-19 pandemic, see e.g. Rossman’s warning in The Conversation. But Nature chose to stress it one more time (in its 16 Jul edition), or twice when considering a similar piece in Nature Physics, as Boris Johnson made it a central tool of his governmental communication policy. And some mayors started asking for their own local R numbers! It is obviously tempting to turn the messy and complex reality of this planetary crisis into a single number and even a single indicator R<1, but it is unhelpful and worse, from the epidemiology models being wrong (or at least oversimplifying) to the data being wrong (i.e., incomplete, biased and late), to the predictions being wrong (except for predicting the past). Nothing outrageous in the said Nature article, which points out diverse degrees of uncertainty and variability and stresses the need to immediately address clusters rather than using the dummy R. As an aside, the repeated use of nowcasting instead of forecasting sounds like a perfect journalistic fad, given that it does not seem to be based on a different model of infection or on a different statistical technique. (There is a nowcasting package in R, though!) And a wee bit later I was pointed to an extended discussion of an R estimation paper on Radford Neal’s blog.

[Nature on] simulations driving the world’s response to COVID-19

Posted in Books, pictures, Statistics, Travel, University life on April 30, 2020 by xi'an

Nature of 02 April 2020 has a special section on simulation methods used to assess and predict the pandemic evolution. It calls for caution, as the models used therein, like the standard ODE S(E)IR models, rely on assumptions about the spread of the disease and very rarely on data, especially in the early stages of the pandemic. One epidemiologist is quoted as stating “We’re building simplified representations of reality” but this is not dire enough, as “simplified” evokes “less precise” rather than “possibly grossly misleading”. (The graph above is unrelated to the Nature cover and appears to me as particularly appalling in mixing different types of data, time-scales, populations at risk, and discontinuous updates, and in essentially returning no information whatsoever.)

“[the model] requires information that can be only loosely estimated at the start of an epidemic, such as the proportion of infected people who die, and the basic reproduction number (…) rough estimates by epidemiologists who tried to piece together the virus’s basic properties from incomplete information in different countries during the pandemic’s early stages. Some parameters, meanwhile, must be entirely assumed.”
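To make the dependence on loosely estimated parameters concrete, here is a minimal sketch of a deterministic SEIR model integrated by Euler steps. It is not the model used by any of the teams discussed in the Nature section: the function name `simulate_seir`, the population size, the incubation and infectious periods, and the R₀ values are all illustrative assumptions chosen only to show how strongly the output moves with the assumed reproduction number.

```python
def simulate_seir(r0, incubation_days=5.0, infectious_days=5.0,
                  population=1_000_000, initial_infected=10,
                  days=365, dt=0.1):
    """Euler integration of a deterministic SEIR model.

    All default values are illustrative assumptions, not estimates.
    Returns (fraction of the population ever infected, peak number infectious).
    """
    sigma = 1.0 / incubation_days   # rate of leaving the exposed compartment
    gamma = 1.0 / infectious_days   # rate of leaving the infectious compartment
    beta = r0 * gamma               # transmission rate implied by the assumed R0

    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    peak = i

    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_removed = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        peak = max(peak, i)

    attack_rate = (population - s) / population
    return attack_rate, peak


# Same model, three assumed values of R0: the conclusions differ widely.
for r0 in (1.5, 2.5, 3.5):
    attack_rate, peak = simulate_seir(r0)
    print(f"R0={r0}: attack rate {attack_rate:.2f}, peak infectious {peak:,.0f}")
```

Even in this toy setting, moving the assumed R₀ within a range that early estimates could not exclude changes the projected size and peak of the epidemic substantially, which is the point made in the quote.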

The report mentions that the team at Imperial College, whose predictions impacted the UK Government decisions, also used an agent-based model, with more variability or stochasticity in individual actions, which requires even more assumptions or much more refined, representative, and trustworthy data.

“Unfortunately, during a pandemic it is hard to get data — such as on infection rates — against which to judge a model’s projections.”

Unfortunately, the paper was written in the early days of the rise of cases in the UK, which means predictions could not yet be much compared with actual numbers of deaths and hospitalisations. The following quote shows how far off they can fall from reality:

“the British response, Ferguson said on 25 March, makes him “reasonably confident” that total deaths in the United Kingdom will be held below 20,000.”

since the total number as of April 29 is above 21,000 (later updated to 24,000, then 29,750) and showing no sign of quickly slowing down… A quite useful general public article, nonetheless.

Monsieur le Président [reposted]

Posted in Books, Statistics, University life on April 11, 2020 by xi'an

Let us carry out screening campaigns on representative samples of the population!

Mr President of the Republic, as you rightly indicated, we are at war and everything must be done to combat the spread of COVID-19. You had the wisdom to surround yourself with a Scientific Council and an Analysis, Research and Expertise Committee, both competent, and, as you know, applied mathematicians and statisticians have a role to play in this battle. Yes, to predict the evolution of the epidemic, mathematical models are used at different scales. This allows us to estimate the number of people infected in the coming weeks and months. We are at war and these predictions are essential to the development of the best control strategy. They inform political decisions. It is notably with the help of these items of information that the confinement of the French population has been decided and renewed.

Mr President, we are at war and these predictions must be as robust as possible. The more precise they are, the better the decisions they will guide. Mathematical models include a number of unknown parameters whose values should be set based on expert advice or data. These include the transmission rate, incubation time, contagion time, and, of course, to initialize dynamic mathematical models, the number of covered individuals. To obtain more reliable predictions, it is necessary to better estimate such crucial quantities. The proportion of healthy carriers appears to be a particularly critical parameter.

Mr President, we are at war and we must assess the proportions of healthy carriers by geographic area. We do not currently have the means to implement massive screenings, but we can carry out surveys. This means, for a well-defined geographic area, running biological tests on samples of individuals that are drawn at random and are representative of the total population of the area. Such data would supplement those already available and would considerably reduce the uncertainty in model predictions.

Mr. President, we are at war, let us give ourselves the means to fight effectively against this scourge. Thanks to a significant effort, the number of individuals that can be tested daily is increasing significantly; let us devote some of these available tests to representative samples. For each individual drawn at random, let us perform a nasal swab and a blood test, and let us collect clinical data and other items of information on their compliance with barrier measures. This would provide important information on the percentage of immunized French people. These data would open the possibility to feed mathematical models wisely, and hence to make informed decisions about the different strategies of deconfinement.

Mr. President, we are at war. This strategy, which could at first be deployed only in the most affected sectors, is, we believe, essential. It is doable: designing the survey and determining a representative sample is not an issue, and going to the homes of the people in the sample to take samples and have them fill out a questionnaire is also perfectly achievable if we give ourselves the means to do so. You only have to decide that a few of the available PCR tests and serological tests will be devoted to these statistical studies. In Paris and in the Grand Est, for instance, a mere few thousand tests on a properly selected representative population of individuals could better assess the situation and help in taking informed decisions.

Mr. President, a proposal to this effect has been presented to the Scientific Council and to the Analysis, Research and Expertise Committee that you have set up, by a group of mathematicians at École Polytechnique with Professor Josselin Garnier at their head. You will realise by reading this tribune that the statistician that I am supports it very strongly. I am in no way disputing the competence of the councils which support you, but you have to act quickly and, I repeat, only dedicate a few thousand tests to statistical studies. Emergency is everywhere; assistance to the patients, to people in intensive care, must of course be the priority, but let us attempt to anticipate as well. We do not have the means to massively test the entire population, so let us run polls.

Jean-Michel Marin
Professeur à l’Université de Montpellier
Président de la Société Française de Statistique
Directeur de l’Institut Montpelliérain Alexander Grothendieck
Vice-Doyen de la Faculté des Sciences de Montpellier
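As a back-of-the-envelope check on the letter's claim that a few thousand tests can be informative, the sketch below computes a Wilson confidence interval for a proportion estimated from a simple random sample. It assumes perfectly accurate tests and a genuinely random, fully responding sample, neither of which is stated in the letter; the function name `wilson_interval` and the counts used (150 positives among 3,000 people) are purely hypothetical.

```python
import math


def wilson_interval(positives, sample_size, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage).

    Assumes simple random sampling and error-free tests.
    """
    p_hat = positives / sample_size
    z2_over_n = z * z / sample_size
    denom = 1.0 + z2_over_n
    centre = (p_hat + z2_over_n / 2.0) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / sample_size
        + z * z / (4.0 * sample_size * sample_size)
    )
    return centre - half_width, centre + half_width


# Hypothetical survey: 150 positive tests among 3,000 randomly drawn individuals.
low, high = wilson_interval(150, 3000)
print(f"estimated prevalence between {low:.3f} and {high:.3f}")
```

Under these idealised assumptions, a sample of this size pins down a prevalence of a few percent to within about one percentage point either way, which is far tighter than what expert guesses can offer for the same parameter.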

poor statistics

Posted in Books, pictures, R, Statistics, Travel, Wines on September 24, 2019 by xi'an

Over the weekend I came across this graph and the associated news that the county of Saint-Nazaire, on the southern border of Brittany, had a significantly higher rate of cancers than the Loire countries. The complete study, written by Solenne Delacour, Anne Cowppli-Bony, and Florence Molinié, is quite cautious about the reasons for this higher rate, even using a Bayesian Poisson-Gamma smoothing (and the R function empbaysmooth), and citing the 1991 paper by Besag, York and Mollié, but the local and national media are quick to blame the local industries for the difference. The graph above is particularly bad in that it accumulates mortality causes that are not mutually exclusive or independent. For instance, the much higher mortality rate due to alcohol is obviously responsible for higher rates of most other entries. And it indicates a sociological pattern that may or may not be due to the type of job in the area, but differs from the more rural other parts of the Loire countries (which, like Brittany, are already significantly above (50%) the national reference for alcohol-related health issues), and may not be strongly connected to exposure to chemicals. For instance, the rates of pulmonary cancers are mostly comparable to the national average, if higher than in the rest of the Loire countries, and connect with a high smoking propensity. Lymphomas are not significantly different from the regional reference. The only type of cancer that can be directly attributed to working conditions is mesothelioma, mostly caused by exposure to asbestos, which was used in ship building, a specialty of the area. Among the many possible reasons for the higher mortality of the county, the study mentions a lower exposure to medical testing (connected with the sociological composition of the area). This would point to the most effective policies for lowering these higher cancer and mortality rates.
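For readers unfamiliar with Poisson-Gamma smoothing, here is a minimal sketch of the general idea, using a simple method-of-moments fit of the Gamma prior. It is not a reproduction of the study's computation nor of what empbaysmooth does internally; the function name `poisson_gamma_smooth` and the counts below are made up for illustration. Each area's observed count is modelled as Poisson with mean equal to its expected count times a relative risk, and the relative risks share a Gamma prior, so that the posterior mean shrinks noisy ratios from small areas towards the overall ratio.

```python
def poisson_gamma_smooth(observed, expected):
    """Empirical Bayes Poisson-Gamma smoothing of observed/expected ratios.

    Model: observed[i] ~ Poisson(expected[i] * theta[i]),
           theta[i] ~ Gamma(shape, rate), fitted here by moments.
    Returns the posterior means of theta[i].
    """
    n_areas = len(observed)
    total_observed = sum(observed)
    total_expected = sum(expected)
    overall = total_observed / total_expected      # prior mean estimate
    mean_expected = total_expected / n_areas

    # Weighted variance of the raw ratios, minus the part due to Poisson noise.
    weighted_var = sum(
        e * (o / e - overall) ** 2 for o, e in zip(observed, expected)
    ) / total_expected
    prior_var = weighted_var - overall / mean_expected

    if prior_var <= 0:
        # No detectable extra-Poisson variation: shrink fully to the overall ratio.
        return [overall] * n_areas

    shape = overall ** 2 / prior_var
    rate = overall / prior_var
    return [(o + shape) / (e + rate) for o, e in zip(observed, expected)]


# Made-up counts: the first area is tiny, the third is large.
observed = [2, 40, 130, 9]
expected = [1.0, 50.0, 100.0, 5.0]
for o, e, s in zip(observed, expected, poisson_gamma_smooth(observed, expected)):
    print(f"raw ratio {o / e:.2f} -> smoothed {s:.2f}")
```

The design choice to notice is that the amount of shrinkage depends on the expected count: an area with few expected cases is pulled strongly towards the overall ratio, while a large area keeps a smoothed ratio close to its raw one.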
