Archive for wikipedia

Posted in Books, pictures, Statistics, Travel with tags , , , , , , , , , on December 27, 2021 by xi'an

A recent Nature paper by Jessica Davis et al. (with an assessment by Simon Cauchemez and X from INSERM) reassessed the appearance of COVID in European and American States. Accounting for the massive under-reporting in the early days since there was no testing. The approach is based on a complex dynamic model whose parameters are estimated by an ABC algorithm (the reference being the PLoS article that initiated the ABC Wikipedia page). Results are quite interesting in that the distribution of the entry dates covers a calendar as early as December 2019 in most cases. And a proportion of missed cases as high as 99%.

“As evidence, E, we considered the cumulative number of SARS-CoV-2 cases internationally imported from China up to January 21, 2020″

The model behind remain a classical SLIR model but with a discrete and stochastic dynamical and a geographical compartmentalization based on a Voronoi tessellation centred at airports, commuting intensity and population density. Interventions by local and State authorities are also accounted for. The ABC version is a standard rejection algorithm with distance based on the evidence as quoted above. Which is a form of cdf distance (as in our Wasserstein ABC paper). For the posterior distribution of the IFR,  a second ABC algorithm uses the relative distance between observed and generated deaths (per country). The paper further investigates different introduction sources (countries) before local transmission was established. For instance, China is shown to be the dominant source for the first EU countries impacted by the pandemics such as Italy, UK, Germany, France and Spain. Using a “counterfactual scenario where the surveillance systems of the US states and European countries are imagined to operate at levels able to identify 50% of all imported and locally generated infections”, the authors conclude that

“broadening testing specifications could have considerably slowed the pandemic progression, buying considerable time to prepare mitigation responses.”

The [errors in the] error of truth [book review]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , , , , , , , , , , , , , , , , on August 10, 2021 by xi'an

OUP sent me this book, The error of truth by Steven Osterling, for review. It is a story about the “astonishing” development of quantitative thinking in the past two centuries. Unfortunately, I found it to be one of the worst books I have read on the history of sciences…

To start with the rather obvious part, I find the scholarship behind the book quite shoddy as the author continuously brings in items of historical tidbits to support his overall narrative and sometimes fills gaps on his own. It often feels like the material comes from Wikipedia, despite expressing a critical view of the on-line encyclopedia. The [long] quote below is presumably the most shocking historical blunder, as the terror era marks the climax of the French Revolution, rather than the last fight of the French monarchy. Robespierre was the head of the Jacobins, the most radical revolutionaries at the time, and one of the Assembly members who voted for the execution of Louis XIV, which took place before the Terror. And later started to eliminate his political opponents, until he found himself on the guillotine!

“The monarchy fought back with almost unimaginable savagery. They ordered French troops to carry out a bloody campaign in which many thousands of protesters were killed. Any peasant even remotely suspected of not supporting the government was brutally killed by the soldiers; many were shot at point-blank range. The crackdown’s most intense period was a horrific ten-month Reign of Terror (“la Terreur”) during which the government guillotined untold masses (some estimates are as high as 5,000) of its own citizens as a means to control them. One of the architects of the Reign of Terror was Maximilien Robespierre, a French nobleman and lifelong politician. He explained the government’s slaughter in unbelievable terms, as “justified terror . . . [and] an emanation of virtue” (quoted in Linton 2006). Slowly, however, over the next few years, the people gained control. In the end, many nobles, including King Louis XVI and his wife Marie-Antoinette, were themselves executed by guillotining”

Obviously, this absolute misinterpretation does not matter (very) much for the (hi)story of quantification (and uncertainty assessment), but it demonstrates a lack of expertise of the author. And sap whatever trust one could have in new details he brings to light (life?). As for instance when stating

“Bayes did a lot of his developmental work while tutoring students in local pubs. He was a respected teacher. Taking advantage of his immediate resources (in his circumstance, a billiard table), he taught his theorem to many.”

which does not sound very plausible. I never heard that Bayes had students  or went to pubs or exposed his result to many before its posthumous publication… Or when Voltaire (who died in 1778) is considered as seventeenth-century precursor of the Enlightenment. Or when John Graunt, true member of the Royal Society, is given as a member of the Académie des Sciences. Or when Quetelet is presented as French and as a student of Laplace.

The maths explanations are also puzzling, from the law of large numbers illustrated by six observations, and wrongly expressed (p.54) as

$\bar{X}_n+\mu\qquad\text{when}\qquad n\longrightarrow\infty$

to  the Saint-Petersbourg paradox being seen as inverse probability, to a botched description of the central limit theorem  (p.59), including the meaningless equation (p.60)

$\gamma_n=\frac{2^{2n}}{\pi}\int_0^\pi~\cos^{2n} t\,\text dt$

to de Moivre‘s theorem being given as Taylor’s expansion

$f(z)=\sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!}(z-a)^2$

and as his derivation of the concept of variance, to another botched depiction of the difference between Bayesian and frequentist statistics, incl. the usual horror

$P(68.5<70<71.5)=95%$

to independence being presented as a non-linear relation (p.111), to the conspicuous absence of Pythagoras in the regression chapter, to attributing to Gauss the concept of a probability density (when Simpson, Bayes, Laplace used it as well), to another highly confusing verbal explanation of densities, including a potential confusion between different representations of a distribution (Fig. 9.6) and the existence of distributions other than the Gaussian distribution, to another error in writing the Gaussian pdf (p.157),

$f(x)=\dfrac{e^{-(z-\mu)^2}\big/2\sigma^2}{\sigma\sqrt{2\pi}}$

to yet another error in the item response probability (p.301), and.. to completely missing the distinction between the map and the territory, i.e., the probabilistic model and the real world (“Truth”), which may be the most important shortcoming of the book.

The style is somewhat heavy, with many repetitions about the greatness of the characters involved in the story, and some degree of license in bringing them within the narrative of the book. The historical determinism of this narrative is indeed strong, with a tendency to link characters more than they were, and to make them greater than life. Which is a usual drawback of such books, along with the profuse apologies for presenting a few mathematical formulas!

The overall presentation further has a Victorian and conservative flavour in its adoration of great names, an almost exclusive centering on Western Europe, a patriarchal tone (“It was common for them to assist their husbands in some way or another”, p.44; Marie Curie “agreed to the marriage, believing it would help her keep her laboratory position”, p.283), a defense of the empowerment allowed by the Industrial Revolution and of the positive sides of colonialism and of the Western expansion of the USA, including the invention of Coca Cola as a landmark in the march to Progress!, to the fall of the (communist) Eastern Block being attributed to Ronald Reagan, Karol Wojtyła, and Margaret Thatcher, to the Bell Curve being written by respected professors with solid scholarship, if controversial, to missing the Ottoman Enlightenment and being particularly disparaging about the Middle East, to dismissing Galton’s eugenism as a later year misguided enthusiasm (and side-stepping the issue of Pearson’s and Fisher’s eugenic views),

Another recurrent if minor problem is the poor recording of dates and years when introducing an event or a new character. And the quotes referring to the current edition or translation instead of the original year as, e.g., Bernoulli (1954). Or even better!, Bayes and Price (1963).

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Book Review section in CHANCE.]

empirically Bayesian [wISBApedia]

Posted in Statistics with tags , , , , , , , on August 9, 2021 by xi'an

Last week I was pointed out a puzzling entry in the “empirical Bayes” Wikipedia page. The introduction section indeed contains a description of an iterative simulation method that involves an hyperprior $$p(η)$$even though the empirical Bayes perspective does not involve an hyperprior.

While the entry is vague and lacks formulae

These suggest an iterative scheme, qualitatively similar in structure to a Gibbs sampler, to evolve successively improved approximations to $$p(θ$$$$∣$$$$y)$$ and $$p(η∣y)$$. First, calculate an initial approximation to $$p(θ∣y)$$ ignoring the $$η$$ dependence completely; then calculate an approximation to $$p(η$$$$|$$$$y)$$ based upon the initial approximate distribution of $$p(θ$$$$∣$$$$y)$$; then use this $$p(η$$$$∣$$$$y)$$ to update the approximation for $$p(θ$$$$∣$$$$y)$$; then update $$p(η$$$$∣$$$$y)$$; and so on.

it sounds essentially equivalent to a Gibbs sampler, possibly a multiple try Gibbs sampler (unless the author had another notion in mind, alas impossible to guess since no reference is included).

Beyond this specific case, where I think the entire paragraph should be erased from the “empirical Bayes” Wikipedia page, I discussed the general problem of some poor Bayesian entries in Wikipedia with Robin Ryder, who came with the neat idea of running (collective) Wikipedia editing labs at ISBA conferences. If we could further give an ISBA label to these entries, as a certificate of “Bayesian orthodoxy” (!), it would be terrific!

folded Normals

Posted in Books, Kids, pictures, R, Running, Statistics with tags , , , , , , , , , , , , on February 25, 2021 by xi'an

While having breakfast (after an early morn swim at the vintage La Butte aux Cailles pool, which let me in free!), I noticed a letter to the Editor in the Annals of Applied Statistics, which I was unaware existed. (The concept, not this specific letter!) The point of the letter was to indicate that finding the MLE for the mean and variance of a folded normal distribution was feasible without resorting to the EM algorithm. Since the folded normal distribution is a special case of mixture (with fixed weights), using EM is indeed quite natural, but the author, Iain MacDonald, remarked that an optimiser such as R nlm() could be called instead. The few lines of relevant R code were even included. While this is a correct if minor remark, I am a wee bit surprised at seeing it included in the journal, the more because the authors of the original paper using the EM approach were given the opportunity to respond, noticing EM is much faster than nlm in the cases they tested, and Iain MacDonald had a further rejoinder! The more because the Wikipedia page mentioned the use of optimisers much earlier (and pointed out at the R package Rfast as producing MLEs for the distribution).

your GAN is secretly an energy-based model

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on January 5, 2021 by xi'an

As I was reading this NeurIPS 2020 paper by Che et al., and trying to make sense of it, I came across a citation to our paper Casella, Robert and Wells (2004) on a generalized accept-reject sampling scheme where the proposal changes at each simulation that sounds surprising if appreciated! But after checking this paper also appears as the first reference on the Wikipedia page for rejection sampling, which makes me wonder if many actually read it. (On the side, we mostly wrote this paper on a drive from Baltimore to Ithaca, after JSM 1999.)

“We provide more evidence that it is beneficial to sample from the energy-based model defined both by the generator and the discriminator instead of from the generator only.”

The paper seems to propose a post-processing of the generator output by a GAN, generating from the mixture of both generator and discriminator, via a (unscented) Langevin algorithm. The core idea is that, if p(.) is the true data generating process, g(.) the estimated generator and d(.) the discriminator, then

p(x) ≈ p⁰(x)∝g(x) exp(d(x))

(The approximation would be exact the discriminator optimal.) The authors work with the latent z’s, in the GAN meaning that generating pseudo-data x from g means taking a deterministic transform of z, x=G(z). When considering the above p⁰, a generation from p⁰ can be seen as accept-reject with acceptance probability proportional to exp[d{G(z)}]. (On the side, Lemma 1 is the standard validation for accept-reject sampling schemes.)

Reading this paper made me realise how much the field had evolved since my previous GAN related read. With directions like Metropolis-Hastings GANs and Wasserstein GANs. (And I noticed a “broader impact” section past the conclusion section about possible misuses with societal consequences, which is a new requirement for NeurIPS publications.)