## simulating the pandemic

Posted in Books, Statistics on November 28, 2020 by xi'an

Nature of 13 November has a general public article on simulating the COVID pandemic as benefiting from the experience gained by climate-modelling methodology.

“…researchers didn’t appreciate how sensitive CovidSim was to small changes in its inputs, their results overestimated the extent to which a lockdown was likely to reduce deaths…”

The argument is essentially Bayesian, namely to account for the uncertainty on the parameters of the model rather than relying on a best guess, esp. given the state of the available data (all the worse in March). When I read

“…epidemiologists should stress-test their simulations by running ‘ensemble’ models, in which thousands of versions of the model are run with a range of assumptions and inputs, to provide a spread of scenarios with different probabilities…”

it sounds completely Bayesian. Even though there is no discussion of the prior modelling or of the degree of wrongness of the epidemic model itself. The researchers at UCL who conducted the multiple simulations and the assessment of sensitivity to the 940 various parameters found that 19 of them had a strong impact, mostly

“…the length of the latent period during which an infected person has no symptoms and can’t pass the virus on; the effectiveness of social distancing; and how long after getting infected a person goes into isolation…”

but this outcome is predictable (and interesting). Mentions of Bayesian methods appear at the end of the paper:

“…the uncertainty in CovidSim inputs [uses] Bayesian statistical tools — already common in some epidemiological models of illnesses such as the livestock disease foot-and-mouth.”

and

“Bayesian tools are an improvement, says Tim Palmer, a climate physicist at the University of Oxford, who pioneered the use of ensemble modelling in weather forecasting.”

along with ensemble modelling, which sounds like a synonym for Bayesian model averaging… (The April issue on the topic also had Bayesian aspects that were explicitly mentioned.)
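As a toy illustration of the ensemble idea quoted above, here is a minimal sketch that contrasts a single best-guess run with a spread of runs whose inputs are drawn at random. Everything in it is an assumption made for the illustration: the model is a bare deterministic SIR recursion (not CovidSim), and the function names `sir_final_size` and `ensemble`, as well as the uniform ranges on the transmission and recovery rates, are hypothetical stand-ins for a prior.

```python
import random


def sir_final_size(beta, gamma, i0=1e-4, dt=0.25, days=365):
    """Fraction of the population ever infected in a deterministic SIR model,
    integrated with plain Euler steps (toy model, hypothetical defaults)."""
    s, i = 1.0 - i0, i0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt   # new infections over one step
        new_rec = gamma * i * dt      # new recoveries over one step
        s -= new_inf
        i += new_inf - new_rec
    return 1.0 - s


def ensemble(n_runs=500, seed=1):
    """Run the model for inputs drawn from (assumed) uniform ranges."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_runs):
        beta = rng.uniform(0.15, 0.5)    # hypothetical transmission rate range
        gamma = rng.uniform(0.1, 0.25)   # hypothetical recovery rate range
        sizes.append(sir_final_size(beta, gamma))
    return sorted(sizes)


sizes = ensemble()
# single run at the midpoints of the two ranges, i.e. the "best guess"
best_guess = sir_final_size(0.325, 0.175)
# empirical 5%, 50% and 95% quantiles of the ensemble
q05, q50, q95 = (sizes[int(q * (len(sizes) - 1))] for q in (0.05, 0.5, 0.95))
print(f"best guess: {best_guess:.2f}")
print(f"ensemble 5%-50%-95%: {q05:.2f} {q50:.2f} {q95:.2f}")
```

The single run returns one number, while the ensemble returns a whole distribution of outcomes, which is what a posterior predictive would deliver if the uniform ranges were replaced by a genuine posterior on the inputs.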

## modelling protocol in Nature

Posted in Books, Kids, Statistics, University life on August 19, 2020 by xi'an

A three-page commentary in a recent issue of Nature is a manifesto for responsible modelling, with among the numerous signatories, Deborah Mayo. (And Philip Stark as the only statistician I spotted.) The main theme is that the model is not the real thing, i.e., the map is not the territory. Which as such is hardly debatable. The point of the tribune is that, in the light of the pandemic crisis, a large portion of the general population has discovered that mathematical models were not the truth and that their predictions were to be taken with a few marshes of salt. Either because they were based on faulty or antiquated data, if any. Or because their approximation level was too high to return any reliable figure. A failure to understand the nature of mathematical models that reminds me of the 2008 financial crisis and of the bemused question of Liz Windsor and of the muddled response of economists:

“Why did nobody notice it?”

“Your Majesty,” eminent economists replied, “the failure to foresee the timing, extent and severity of the crisis and to head it off, while it had many causes, was principally a failure of the collective imagination of many bright people, both in this country and internationally, to understand the risks to the system as a whole.”

“People got a bit lax … perhaps it is difficult to foresee”

The manifesto calls for open assumptions, sensitivity analysis, uncertainty quantification, wariness of overfitting and structural biases (what is the utility function?), and the inclusion of ignorance acknowledgement as an outcome of the model. Which again sounds completely sound if not necessarily helpful when facing interlocutors asking for point estimates. I also regret that the tribune gives hardly any room to statistics and the model checking tools it has developed, except in mentioning p-hacking and the false feeling of certainty produced by a p-value. Plus a bizarre mention of a French movement of statactivistes of which I had not heard and which seems connected to a book published in French by three of the signatories.

## March(es) for Science

Posted in Statistics on April 22, 2017 by xi'an

Today there are around 500 marches for Science organised around the World (incl. one in Kangerlussuaq, Qeqqata, Greenland!). Primarily to protest the unprecedented attacks of trumpism on science, scientific values, and scientists, not only through budget cuts, agency closures, and public data erasures, but also through the denial of scientific expertise and data to advance financial and partisan interests against climate, water preservation, minority rights, women’s equality, and international relations. Being now at a remote retreat in Northern Wales, I will walk virtually at the Cardiff March for Science.

## early rejection MCMC

Posted in Books, Statistics, University life on June 16, 2014 by xi'an

In a (relatively) recent Bayesian Analysis paper on efficient MCMC algorithms for climate models, Antti Solonen, Pirkka Ollinaho, Marko Laine, Heikki Haario, Johanna Tamminen and Heikki Järvinen propose an early rejection scheme to speed up Metropolis-Hastings algorithms. The idea is to consider a posterior distribution (proportional to)

$\pi(\theta|y)= \prod_{i=1}^n L_i(\theta|y)$

such that all terms in the product are less than one and to compare the uniform u in the acceptance step of the Metropolis-Hastings algorithm to

$L_1(\theta'|y)/\pi(\theta|y),$

then, if u is smaller than the ratio, to

$L_1(\theta'|y)L_2(\theta'|y)/\pi(\theta|y),$

and so on, until the new value has been rejected or all terms have been evaluated. Since every term is at most one, the partial products can only decrease, so a rejection at any stage is final. The scheme obviously stops earlier than the regular Metropolis-Hastings algorithm, at no significant extra cost when the product above does not factor through a sufficient statistic. Solonen et al. suggest ordering the terms so that the computationally simpler ones are computed first. The upper bound assumption requires, and is equivalent to, finding the maximum of each term of the product, though, which may be costly on its own for non-standard distributions. With my students Marco Banterle and Clara Grazian, we actually came upon this paper when preparing our delayed acceptance paper as (a) it belongs to the same category of accelerated MCMC methods (delayed acceptance and early rejection are somehow synonymous!) and (b) it mentions the early prefetching papers of Brockwell (2005) and Strid (2009).
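To make the mechanics concrete, here is a minimal sketch of the early rejection scheme described above, written on the log scale. It is not the authors' code: the function name `early_rejection_mh`, the symmetric Gaussian random-walk proposal, the flat prior, and the toy Normal likelihood terms are all assumptions made for the illustration. The `early=False` switch runs the regular Metropolis-Hastings step with the same random numbers, for comparison.

```python
import math
import random


def early_rejection_mh(log_terms, theta0, n_iter, step=0.5, seed=0, early=True):
    """Random-walk Metropolis-Hastings with optional early rejection.

    log_terms: functions theta -> log L_i(theta), each assumed to be <= 0
    (i.e. every term of the product is bounded by one).
    Returns the chain and the number of term evaluations in the main loop.
    """
    rng = random.Random(seed)
    theta = theta0
    log_post = sum(f(theta) for f in log_terms)
    chain, n_evals = [], 0
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)   # symmetric proposal (assumption)
        # the proposal is accepted iff sum_i log L_i(prop) >= log u + log pi(theta)
        threshold = math.log(1.0 - rng.random()) + log_post
        partial = 0.0
        for f in log_terms:
            partial += f(prop)
            n_evals += 1
            # partial sums only decrease, so falling below the threshold is final
            if early and partial < threshold:
                break
        if partial >= threshold:
            theta, log_post = prop, partial
        chain.append(theta)
    return chain, n_evals


# toy example: Normal(theta, 1) observations, flat prior, so that each
# log-term -(y_i - theta)^2 / 2 is non-positive
y = [0.3, -0.1, 0.8, 0.5, 0.2]
log_terms = [lambda t, yi=yi: -0.5 * (yi - t) ** 2 for yi in y]

chain, n_evals = early_rejection_mh(log_terms, theta0=0.0, n_iter=5000)
chain_full, n_full = early_rejection_mh(log_terms, theta0=0.0, n_iter=5000, early=False)
print(n_evals, "term evaluations with early rejection versus", n_full, "without")
```

Both calls return the same chain, since early rejection only skips evaluations whose outcome is already decided; the gain is in the count of evaluated terms. In this toy case all terms cost the same, so the ordering advice of Solonen et al. does not apply.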

“The acceptance probability in ABC is commonly very low, and many proposals are rejected, and ER can potentially help to detect the rejections sooner.”

In the conclusion, Solonen et al. point out a possible link with ABC but, apart from the general idea of rejecting earlier by looking at a subsample or at a proxy simulation of a summary statistic, which is also the idea at the core of Dennis Prangle’s lazy ABC, there is no obvious impact on a likelihood-free method like ABC.

## simulating Nature

Posted in Books, Statistics on July 25, 2012 by xi'an

This book, Simulating Nature: A Philosophical Study of Computer-Simulation Uncertainties and Their Role in Climate Science and Policy Advice, by Arthur C. Petersen, was sent to me twice by the publisher for a review in CHANCE. As I could not find a nearby “victim” to review the book, I took it with me to Australia and read it in bits and pieces along the trip.

“Models are never perfectly reliable, and we are always faced with ontic uncertainty and epistemic uncertainty, including epistemic uncertainty about ontic uncertainty.” (page 53)

The author, Arthur C. Petersen, was a member of the United Nations’ Intergovernmental Panel on Climate Change (IPCC) and works as chief scientist at the PBL Netherlands Environmental Assessment Agency. He mentions that the first edition of this book, Simulating Nature, has achieved some kind of cult status, while being now out of print,  which is why he wrote this second edition. The book centres on the notion of uncertainty connected with computer simulations in the first part (pages 1-94) and on the same analysis applied to the simulation of climate change, based on the experience of the author, in the second part (pages 95-178). I must warn the reader that, as the second part got too focussed and acronym-filled for my own taste, I did not read it in depth, even though the issues of climate change and of the human role in this change are definitely of interest to me. (Readers of CHANCE must also realise that there is very little connection with Statistics in this book or my review of it!) Note that the final chapter is actually more of a neat summary of the book than a true conclusion, so a reader eager to get an idea about the contents of the book can grasp them through the eight pages of the eighth chapter.

“An example of the latter situation is a zero-dimensional (sic) model that aggregates all surface temperatures into a single zero-dimensional (re-sic) variable of globally averaged surface temperature.” (page 41)

The philosophical questions of interest therein are that a computer simulation of reality is not reproducing reality and that the uncertainty(ies) pertaining to this simulation cannot be assessed in its (their) entirety. (This is the inherent meaning of the first quote, epistemic uncertainty relating to our lack of knowledge about the genuine model reproducing Nature or reality…) The author also covers the more practical issue of the interface between scientific reporting and policy making, which reminded me of Christl Donnelly’s talk at the ASC 2012 meeting (about cattle epidemics in England). The book naturally does not bring answers to any of those questions, naturally because a philosophical perspective should consider different sides of the problem, but I find it more interested in typologies and classifications (of types of uncertainties, in crossing those uncertainties with panel attitudes, etc.) than in the fundamentals of simulation. I am obviously incompetent in the matter; however, as a naïve bystander, it does not seem to me that the book makes any significant progress towards setting epistemological and philosophical foundations for simulation. The part connected with the author’s involvement in the IPCC sheds more light on the difficulty of operating in committees and panels made of members with heavy political agendas than on the possible assessments of uncertainties within the models adopted by climate scientists… With the same proviso as above, the philosophical aspects do not seem very deep: the (obligatory?!) reference to Karl Popper does not bring much to the debate, because what is falsification to simulation? Similarly, Lakatos’ prohibition of “direct[ing] the modus tollens at [the] hard core” (page 40) does not turn into a methodological assessment of simulation praxis.

“I argue that the application of statistical methods is not sufficient for adequately dealing with uncertainty.” (page 18)

“I agree (…) that the theory behind the concepts of random and systematic errors is purely statistical and not related to the locations and other dimensions of uncertainty.” (page 55)

Statistics is mostly absent from the book, apart from the remark that statistical uncertainty (understood as the imprecision induced by a finite amount of data) differs from modelling errors (the model is not reality), which the author considers cannot be handled by statistics (stating that Deborah Mayo’s theory of statistical error analysis cannot be extended to simulation, see the footnote on page 55). [In other words, this book has no connection with Monte Carlo Statistical Methods! With or without capitals… Except for a mention of ‘real’ random number generators in one of the many footnotes on page 35.] Mention is made of “subjective probabilities” (page 54), presumably meaning a Bayesian perspective. But the distinction between statistical uncertainty and scenario uncertainty which “cannot be adequately described in terms of chances or probabilities” (page 54) misses the Bayesian perspective altogether, as does the following sentence that “specifying a degree of probability or belief [in such uncertainties] is meaningless since the mechanism that leads to the events are not sufficiently known” (page 54).

“Scientists can also give their subjective probability for a claim, representing their estimated chance that the claim is true. Provided that they indicate that their estimate for the probability is subjective, they are then explicitly allowing for the possibility that their probabilistic claim is dependent on expert judgement and may actually turn out to be false.” (page 57)

In conclusion, I fear the book does not bring enough of a conclusion on the philosophical justifications of using a simulation model instead of the actual reality and on the more pragmatic aspects of validating/invalidating a computer model and of correcting its imperfections with regard to data/reality. I am quite conscious that this is an immensely delicate issue and that, were it to be entirely solved, the current level of fight between climate scientists and climatoskeptics would not persist. As illustrated by the “Sound Science debate” (pages 68-70), politicians and policy-makers are very poorly equipped to deal with uncertainty and even less with decision under uncertainty. I however do not buy the (fuzzy and newspeak) concept of “post-normal science” developed in the last part of Chapter 4, where the scientific analysis of a phenomenon is abandoned for decision-making, “not pretend[ing] to be either value-free or ethically neutral” (page 75).