Archive for predictive analytics

[Nature on] simulations driving the world’s response to COVID-19

Posted in Books, pictures, Statistics, Travel, University life on April 30, 2020 by xi'an

Nature of 02 April 2020 has a special section on simulation methods used to assess and predict the pandemic evolution. It calls for caution, as the models used therein, like the standard ODE S(E)IR models, rely on assumptions about the spread of the disease and very rarely on data, especially in the early stages of the pandemic. One epidemiologist is quoted stating “We’re building simplified representations of reality”, but this is not dire enough, as “simplified” evokes “less precise” rather than “possibly grossly misleading”. (The graph above is unrelated to the Nature cover and strikes me as particularly appalling in mixing different types of data, time-scales, populations at risk, and discontinuous updates, essentially returning no information whatsoever.)

“[the model] requires information that can be only loosely estimated at the start of an epidemic, such as the proportion of infected people who die, and the basic reproduction number (…) rough estimates by epidemiologists who tried to piece together the virus’s basic properties from incomplete information in different countries during the pandemic’s early stages. Some parameters, meanwhile, must be entirely assumed.”
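
As an illustration of how much such projections hinge on these loosely estimated inputs, here is a minimal deterministic SIR sketch (not the model discussed in the Nature piece; the recovery rate, initial condition, and candidate values of the basic reproduction number R₀ are all assumptions made for illustration), showing how moving R₀ within a plausible early-epidemic range shifts the projected peak:

```python
# Minimal deterministic SIR model, for illustration only: the parameter
# values below are assumptions, not estimates from the Nature piece.
import numpy as np
from scipy.integrate import solve_ivp

def sir(t, y, beta, gamma):
    """Standard SIR right-hand side, with S, I, R as population fractions."""
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]

gamma = 1 / 10            # assumed 10-day mean infectious period
y0 = [1 - 1e-4, 1e-4, 0]  # one infected per 10,000 at time zero

# R0 is only loosely known early in an epidemic; compare two plausible values.
for R0 in (2.0, 3.0):
    beta = R0 * gamma
    sol = solve_ivp(sir, (0, 300), y0, args=(beta, gamma),
                    dense_output=True, rtol=1e-8)
    t = np.linspace(0, 300, 3001)
    I = sol.sol(t)[1]
    print(f"R0={R0}: peak infected fraction {I.max():.3f} "
          f"on day {t[I.argmax()]:.0f}")
```

A one-unit change in R₀, well within early-epidemic uncertainty, roughly doubles the projected peak, which is the sense in which “simplified” can slide into “grossly misleading”.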

The report mentions that the team at Imperial College, whose predictions impacted the UK Government's decisions, also used an agent-based model, with more variability or stochasticity in individual actions, which requires even more assumptions or much more refined, representative, and trustworthy data.
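
For contrast with the deterministic ODE version, here is an equally hypothetical sketch of the agent-based flavour, in which individual infections and recoveries are random events, so repeated runs with the same parameters give different trajectories; the contact rate and the transmission and recovery probabilities below are invented for illustration:

```python
# Toy stochastic agent-based SIR, illustration only; this is not the
# Imperial College model, and all parameter values are invented.
import numpy as np

rng = np.random.default_rng(42)
N = 10_000
state = np.zeros(N, dtype=int)        # 0 = susceptible, 1 = infectious, 2 = recovered
state[rng.choice(N, 10, replace=False)] = 1

contacts_per_day = 8                  # assumed daily contacts per agent
p_transmit = 0.03                     # assumed per-contact transmission probability
p_recover = 0.1                       # assumed daily recovery probability

for day in range(200):
    infectious = np.flatnonzero(state == 1)
    if infectious.size == 0:
        break
    # each infectious agent meets random others; susceptible contacts may be infected
    met = rng.integers(0, N, size=(infectious.size, contacts_per_day))
    hit = met[rng.random(met.shape) < p_transmit].ravel()
    state[hit[state[hit] == 0]] = 1
    # infectious agents recover independently each day
    state[infectious[rng.random(infectious.size) < p_recover]] = 2

print(f"final attack rate: {(state == 2).mean():.1%} after {day} days")
```

Each run of this sketch ends differently, which is the “variability or stochasticity” at stake, and every extra behavioural knob is one more parameter to be assumed or estimated from scarce data.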

“Unfortunately, during a pandemic it is hard to get data — such as on infection rates — against which to judge a model’s projections.”

Unfortunately, the paper was written in the early days of the rise of cases in the UK, which means the predictions could not yet be confronted with actual numbers of deaths and hospitalisations. The following quote shows how far they can fall from reality:

“the British response, Ferguson said on 25 March, makes him “reasonably confident” that total deaths in the United Kingdom will be held below 20,000.”

since the total number as of April 29 is above 29,750 (updated from earlier counts of 21,000 and 24,000) and shows no sign of quickly slowing down… A quite useful general-public article, nonetheless.

we have never been unable to develop a reliable predictive model

Posted in Statistics on November 10, 2019 by xi'an

An alarming entry in The Guardian about the huge proportion of councils in the UK using machine-learning software to allocate benefits, detect child abuse, or identify claims fraud. And relying blindly on the outcome of such software, despite its well-documented lack of reliability, uncertainty assessments, and warnings. Blindly in the sense that the impact of the (implemented) decisions was not even reviewed, even though a portion of the councils is not considering renewing the contracts. With the appalling statement of the CEO of one software company reported in the title. Said CEO further blames the lack of accessibility [for their company] of the data used by the councils for the impossibility [for the company] of providing risk factors and identifying biases, in an unbelievable newspeak inversion… As pointed out by David Spiegelhalter in the article, the openness should go the other way, namely that the algorithms behind the suggestions (read: decisions) should be made available to understand why these decisions were made. (A whole series of Guardian articles relates to this as well, under the heading “Automating poverty”.)