Archive for SIR

[Nature on] simulations driving the world’s response to COVID-19

Posted in Books, pictures, Statistics, Travel, University life on April 30, 2020 by xi'an

Nature of 02 April 2020 has a special section on the simulation methods used to assess and predict the pandemic evolution. It calls for caution, as the models used therein, like the standard ODE S(E)IR models, rely on assumptions about the spread of the disease and very rarely on data, especially in the early stages of the pandemic. One epidemiologist is quoted as stating "We're building simplified representations of reality", but this is not dire enough, as "simplified" evokes "less precise" rather than "possibly grossly misleading". (The graph above is unrelated to the Nature cover and appears to me as particularly appalling in mixing different types of data, time-scales, populations at risk, and discontinuous updates, essentially returning no information whatsoever.)
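
For readers unfamiliar with the formalism, here is a minimal R sketch of the standard SIR ODE system, using the deSolve package; the parameter values (β=0.3, γ=0.1, hence a basic reproduction number of 3) are purely illustrative and not taken from the Nature article or from any of the models it discusses.

```r
library(deSolve)  # standard ODE solver

# classical SIR dynamics on proportions:
# dS = -beta*S*I,  dI = beta*S*I - gamma*I,  dR = gamma*I
sir <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dS <- -beta * S * I
    dI <-  beta * S * I - gamma * I
    dR <-  gamma * I
    list(c(dS, dI, dR))
  })
}

# illustrative values only: R0 = beta/gamma = 3
out <- ode(y = c(S = 0.999, I = 0.001, R = 0),
           times = seq(0, 180, by = 1),
           func = sir,
           parms = c(beta = 0.3, gamma = 0.1))

matplot(out[, 1], out[, -1], type = "l", lty = 1, col = c(3, 2, 4),
        xlab = "day", ylab = "proportion of population")
legend("right", legend = c("S", "I", "R"), col = c(3, 2, 4), lty = 1)
```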

“[the model] requires information that can be only loosely estimated at the start of an epidemic, such as the proportion of infected people who die, and the basic reproduction number (…) rough estimates by epidemiologists who tried to piece together the virus’s basic properties from incomplete information in different countries during the pandemic’s early stages. Some parameters, meanwhile, must be entirely assumed.”

The report mentions that the team at Imperial College, whose predictions impacted the UK Government's decisions, also used an agent-based model, with more variability or stochasticity in individual actions, which requires even more assumptions or much more refined, representative, and trustworthy data.

“Unfortunately, during a pandemic it is hard to get data — such as on infection rates — against which to judge a model’s projections.”

Unfortunately, the paper was written in the early days of the rise of cases in the UK, which means its predictions could not yet be much confronted with actual numbers of deaths and hospitalisations. The following quote shows how far off they can fall from reality:

“the British response, Ferguson said on 25 March, makes him “reasonably confident” that total deaths in the United Kingdom will be held below 20,000.”

since the total number (as of April 29 and successive updates) has risen above 21,000, then 24,000, then 29,750, showing no sign of quickly slowing down… A quite useful general-public article, nonetheless.

ABC on COVID-19

Posted in Books, pictures, Statistics, Travel on March 20, 2020 by xi'an

The paper "The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak", published in Science on 06 March by Matteo Chinazzi and co-authors, considers the impact of the travel restrictions imposed in Wuhan on the propagation of the virus. (A terrible graph, by the way, since the overall volume of traffic dropped considerably after the ban.)

“The travel quarantine of Wuhan delayed the overall epidemic progression by only 3 to 5 days in Mainland China, but has a more marked effect at the international scale, where case importations were reduced by nearly 80% until mid February.”

They use an SLIR (susceptible-latent-infectious-removed) pattern of transmission, along with a travel-flow network based on 2019 air and ground travel statistics, resorting to ABC to approximate the posterior distribution of the basic reproduction number. It is however unclear to me that the model is particularly accurate at the levels of the transmission pattern (transmission now seeming to occur much earlier than the onset of symptoms) and of the detection rates (which vary greatly from one place to another).
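
For illustration only, here is a toy R sketch of ABC rejection targeting the basic reproduction number in a crude discrete-time stochastic SIR model. This is emphatically not the authors' pipeline (they rely on a metapopulation SLIR model coupled with real travel data); the simulator, the uniform prior, the distance, and the 1% tolerance are all arbitrary choices, and the "observed" series is itself simulated.

```r
set.seed(1)

# toy discrete-time stochastic SIR simulator, returning weekly case counts
simulate_sir <- function(R0, N = 1e5, I0 = 10, gamma = 1/7, weeks = 8) {
  beta <- R0 * gamma
  S <- N - I0; I <- I0; cases <- numeric(weeks)
  for (w in 1:weeks) for (d in 1:7) {
    new_inf <- rbinom(1, S, 1 - exp(-beta * I / N))
    new_rem <- rbinom(1, I, 1 - exp(-gamma))
    S <- S - new_inf; I <- I + new_inf - new_rem
    cases[w] <- cases[w] + new_inf
  }
  cases
}

# hypothetical "observed" weekly counts, standing in for surveillance data
obs <- simulate_sir(R0 = 2.5)

# ABC rejection: draw R0 from the prior, keep draws whose simulated
# counts fall close to the observed ones (distance on log counts)
abc <- replicate(1e4, {
  r0 <- runif(1, 1, 6)
  d  <- sum((log1p(simulate_sir(r0)) - log1p(obs))^2)
  c(r0, d)
})
post <- abc[1, abc[2, ] < quantile(abc[2, ], 0.01)]  # keep the closest 1%
hist(post, main = "ABC posterior sample", xlab = expression(R[0]))
```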

sampling-importance-resampling is not equivalent to exact sampling [triste SIR]

Posted in Books, Kids, Statistics, University life on December 16, 2019 by xi'an

Following an X validated question on the topic, I reassessed a previous impression I had that sampling-importance-resampling (SIR) is equivalent to direct sampling from the target for a given sample size. (As suggested by the above fit between a N(2,½) target and a N(0,1) proposal.) Indeed, when one produces a sample

x_1,\ldots,x_n \stackrel{\text{i.i.d.}}{\sim} g(x)

and resamples with replacement from this sample using the importance weights

f(x_1)g(x_1)^{-1},\ldots,f(x_n)g(x_n)^{-1}

the resulting sample

y_1,\ldots,y_n

is neither "i." nor "i.d.", since the resampling step involves a self-normalisation of the weights and hence induces a global bias in the evaluation of expectations. In particular, if the importance function g is a poor choice for the target f, meaning that the exploration of the whole support is imperfect, even when both supports are equal, a given sample may well fail to reproduce the properties of an iid sample, as shown in the graph below, where a Normal density is used for g while f is a Student t⁵ density:
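
A minimal R rendering of the experiment behind that graph: SIR with a N(0,1) proposal for a Student t⁵ target cannot reproduce the target's tails, since the resampled values are confined to the range of the original proposal draws, however large the sample.

```r
set.seed(42)
n <- 1e5

# SIR: simulate from the proposal g = N(0,1), weight by f/g with f = Student t, 5 df
x <- rnorm(n)
w <- dt(x, df = 5) / dnorm(x)

# resample with replacement; sample() self-normalises the weights internally
y <- sample(x, n, replace = TRUE, prob = w)

# the resampled values cannot leave the range of the proposal draws, hence
# the heavier t5 tails remain unreachable and tail quantiles are underestimated
quantile(y, .9999)   # stays near the maximum of the Normal sample
qt(.9999, df = 5)    # the exact t5 quantile, far beyond that range
```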

MCMskv #3 [town with a view]

Posted in Statistics on January 8, 2016 by xi'an

Third day at MCMskv, where I took advantage of the gap left by the elimination of the Tweedie Race [second time in a row!] to complete and submit our mixture paper. Despite the nice weather. The rest of the day was quite busy, with David Dunson giving a plenary talk on various approaches to approximate MCMC solutions, with a broad overview of the potential methods and of the need for better solutions. (On a personal basis, great line from David: "five minutes or four minutes?". It almost beat David's question of the previous day, about the weight of a finch, which sounded suspiciously close to the question about the air-speed velocity of an unladen swallow. I was quite surprised the speaker did not reply with the Arthurian "An African or a European finch?") In particular, I appreciated the notion that some problems were calling for a reduction in the number of parameters, rather than in the number of observations. At which point I wrote down "multiscale approximations required" in my black pad, a requirement David made a few minutes later. (The talk conditions were also much better than during Michael's talk, in that the man standing between the screen and myself was David rather than the cameraman! Jokes apart, it did not really prevent me from reading the slides, except for most of the jokes in small print!)

The first session of the morning involved a talk by Marc Suchard, who used continued fractions to find a closed-form likelihood for the SIR epidemiology model (I love continued fractions!), and a talk by Donatello Telesca, who studied non-local priors to build a regression tree. While I am somewhat skeptical about non-local testing priors, I found this approach to the construction of a tree quite interesting! In the afternoon, I obviously went to the intractable-likelihood session, with talks by Chris Oates on a control-variate method for doubly intractable models, Brenda Vo on mixing sequential ABC with the Bayesian bootstrap, and Gael Martin on our consistency paper. I was not aware of the Bayesian bootstrap proposal and need to read through the paper, as I fail to see the appeal of the bootstrap part! I later attended a session on exact Monte Carlo methods that was pleasantly homogeneous. With talks by Paul Jenkins (Warwick) on the exact simulation of the Wright-Fisher diffusion, Anthony Lee (Warwick) on designing perfect samplers for chains with atoms, and Chang-han Rhee and Sebastian Vollmer on extensions of the Glynn-Rhee debiasing technique I previously discussed on the blog. (Once again, I regretted having to make a choice between the parallel sessions!)

The poster session (after a quick home-made pasta dish with an exceptional Valpolicella!) was almost universally great, with just the right number of posters to go around all of them in the allotted time. With in particular the Breaking News! posters of Giacomo Zanella (Warwick), Beka Steorts, and Alexander Terenin. A high-quality session that made me regret not touring the previous one, due to my own poster presentation.

IMIS & AMIS

Posted in R, Statistics, University life on July 30, 2010 by xi'an

A most interesting paper by Adrian Raftery and Le Bao appeared in the Early View section of Biometrics. It aims at better predictions of HIV prevalence (in the original UNAIDS implementation, a naïve SIR procedure was used, based on the prior as importance function, which sometimes resulted in terrible degeneracy), but its methodological input is about incremental mixture importance sampling (IMIS), and thus relates to the general topic of adaptive Monte Carlo methods I am interested in, and to some extent to our recent AMIS paper. Actually, a less elaborate (and less related) version of the IMIS algorithm first appeared in a 2006 JCGS paper by Steele, Raftery and Emond, in the setting of finite mixture likelihoods, and I somehow managed to miss it…

Raftery and Bao propose to replace SIR with an iterative importance sampling technique developed in 2006 by Steele et al. that has some similarities with population Monte Carlo (PMC). (A negligible misrepresentation of PMC in the current paper is that our method does not use "the prior as importance function".) In its current format, the IMIS algorithm starts from a first guess (e.g., the prior distribution) and builds a sequence of Gaussian (or Gaussian mixture) approximations whose parameters are estimated from the current population, while all simulations are merged together at each step, using the mixture-stabilised weight

\pi(\theta_i^s|x) \Big/ \left\{ \omega_0\, p_0(\theta_i^s) + \sum_r \omega_r\, \hat q_r(\theta_i^s) \right\}

where the weights \omega_r depend on the number of simulations at step r. This pattern also appears in our adaptive multiple importance sampling (AMIS) algorithm, developed in this arXiv paper with Jean-Marie Cornuet, Jean-Michel Marin and Antonietta Mira, and in the original paper by Owen and Zhou (2000, JASA) that inspired us. Raftery and Bao extend the methodology to an IMIS with optimisation at the initial stage, while AMIS incorporates the natural population Monte Carlo stepwise optimisation developed in Douc et al. (2008, Annals of Statistics), which brings the proposal kernel closer to the target after each iteration. The application of the simulations to conduct model choice, found in the current paper and in Steele et al., can also be paralleled with the one using population Monte Carlo we conducted for cosmological data in MNRAS.
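
To make the weighting mechanism concrete, here is a schematic R sketch of an IMIS-flavoured loop on a toy bimodal target. It only illustrates the mixture-stabilised weights above, not Raftery and Bao's actual algorithm: in particular the spread of each new Gaussian component is a crude placeholder for the local covariance estimation of Steele et al.

```r
set.seed(0)

# toy unnormalised target (bimodal), standing in for an actual posterior
log_pi <- function(th) log(0.5 * dnorm(th, -2, 0.5) + 0.5 * dnorm(th, 3, 1))

n0 <- 500; nr <- 200; niter <- 5
theta <- rnorm(n0, 0, 5)   # initial sample from p0 = N(0, 5²)
comps <- list()            # Gaussian components q_r, added along iterations

# mixture-stabilised weights:
#   pi(theta|x) / { w0 p0(theta) + sum_r w_r q_r(theta) }
# with w_r proportional to the number of simulations from each component
imis_weights <- function(theta, comps) {
  ns   <- c(n0, rep(nr, length(comps)))
  mixd <- ns[1] * dnorm(theta, 0, 5)
  for (k in seq_along(comps))
    mixd <- mixd + ns[k + 1] * dnorm(theta, comps[[k]]$m, comps[[k]]$s)
  exp(log_pi(theta)) / (mixd / sum(ns))
}

for (r in 1:niter) {
  w  <- imis_weights(theta, comps)
  i0 <- which.max(w)       # centre a new component at the highest-weight point
  comps[[r]] <- list(m = theta[i0], s = sd(theta) / 2)  # crude spread choice
  theta <- c(theta, rnorm(nr, comps[[r]]$m, comps[[r]]$s))
}

w <- imis_weights(theta, comps)                        # final weights
post <- sample(theta, 1e3, replace = TRUE, prob = w)   # optional resampling step
```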

Interestingly, Raftery and Bao (and also Steele et al.) refer to the defensive mixture paper of Hesterberg (1995, Technometrics), which has been very influential in my research on importance sampling, and (less directly) to Owen and Zhou (2000, JASA), who proposed the deterministic mixture scheme that inspired AMIS. Besides the foundational papers of Oh and Berger (1991, JASA) and West (1993, J. Royal Statistical Society Series B), they also mention a paper by Raghavan and Cox (1998, J. Statistical Computation & Simulation) I was not aware of, which also introduces a mixture of importance proposals as a variance-stabilising technique.