Archive for Thomas Bayes


Posted in Books, pictures, Statistics, University life on May 16, 2021 by xi'an

Just heard via ISBA that the Cass Business School of City University London is changing its name to the Bayes Business School “after it was found (!) that some of Sir John Cass’s wealth was obtained through his links to the slave trade.” One of the school buildings is located on Bunhill Row, which leads to Bunhill Fields cemetery where Thomas Bayes (and Richard Price) were buried. And which stands near the Royal Statistical Society building.

“Bayes’ theorem suggests that we get closer to the truth by constantly updating our beliefs in proportion to the weight of new evidence. It is this idea – not only the person – that is the motivation behind adopting this name.”

While it is a notable recognition of the field that Thomas Bayes was selected by [some members of] the City community, I hope the name will not become a registered trademark! And ponder the relevance of naming schools, buildings, prizes, whatever after individuals who should remain mere mortals rather than carrying the larger-than-life burden of representing ideals and values. And the irony of having a business school named after someone who never worked, being financially wealthy by inheritance (from his Sheffield cutler ancestors).  Or of promoting diversity through a religious zealot leaning towards Arianism.

“In Bayes Business School, we believe we now have a name that reflects who we are and the values we hold. Even though Bayes lived a long time ago, his ideas and his name are very much connected to the future rather than the past.”

Fisher, Bayes, and predictive Bayesian inference [seminar]

Posted in Statistics on April 4, 2021 by xi'an

An interesting Foundations of Probability seminar at Rutgers University this Monday, at 4:30ET, 8:30GMT, by Sandy Zabell (the password is Angelina’s birthdate):

R. A. Fisher is usually perceived to have been a staunch critic of the Bayesian approach to statistics, yet his last book (Statistical Methods and Scientific Inference, 1956) is much closer in spirit to the Bayesian approach than the frequentist theories of Neyman and Pearson.  This mismatch between perception and reality is best understood as an evolution in Fisher’s views over the course of his life.  In my talk I will discuss Fisher’s initial and harsh criticism of “inverse probability”, his subsequent advocacy of fiducial inference starting in 1930, and his admiration for Bayes expressed in his 1956 book.  Several of the examples Fisher discusses there are best understood when viewed against the backdrop of earlier controversies and antagonisms.

probability that a vaccinated person is shielded from COVID-19?

Posted in Books, Statistics, Travel, University life on March 10, 2021 by xi'an

On my flight to Montpellier last week, I read an arXival on a Bayesian analysis of vaccine efficacy, whose full title is “What is the probability that a vaccinated person is shielded from Covid-19? A Bayesian MCMC based reanalysis of published data with emphasis on what should be reported as ‘efficacy’”, by Giulio D’Agostini and Alfredo Esposito. In short, I was not particularly impressed.

“But the real point we wish to highlight, given the spread of distributions, is that we do not have enough data for drawing sound conclusion.”

The reason for this lack of enthusiasm on my side is that, while the authors’ criticism of an excessive precision in the Pfizer, Moderna, or AstraZeneca press releases is appropriate, given that the published confidence intervals do not claim the same precision, a Bayesian reanalysis of the published outcomes of the respective vaccine trials does not show much, simply because there is awfully little data, essentially two to four Binomial-like outcomes. Without further data, the modelling is one of a simple graph of Binomial observations, with two or three probability parameters, which results in a very standard Bayesian analysis that does depend on the modelling choices being made, from a highly unrealistic assumption of homogeneity throughout the population(s) tested for the vaccine(s), to a lack of hyperparameters that could have been shared between vaccinated populations. Parts of the arXival are unrelated and unnecessary, from the highly detailed MCMC algorithm for simulating the posterior (incl. JAGS code) to the reminiscence of Bayes’ and Laplace’s early rendering of inverse probability. (I find it both interesting and revealing that arXiv, just like medRxiv, posts a warning on top of COVID related preprints.)
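To illustrate how standard such an analysis is, here is a minimal sketch of a two-arm Binomial model with conjugate Beta priors, which needs no MCMC at all. This is not the authors’ model or code: the Beta(1, 1) priors, the function names, and above all the counts are assumptions made up for the illustration, not figures from any trial.

```python
import random

def efficacy_posterior(cases_vaccine, n_vaccine, cases_placebo, n_placebo,
                       draws=20000, seed=1):
    """Sorted Monte Carlo draws of efficacy = 1 - p_v / p_p, under
    independent Beta(1, 1) priors on both attack rates (an assumption)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        # Conjugate update: Beta(1 + cases, 1 + non-cases) in each arm.
        p_v = rng.betavariate(1 + cases_vaccine, 1 + n_vaccine - cases_vaccine)
        p_p = rng.betavariate(1 + cases_placebo, 1 + n_placebo - cases_placebo)
        samples.append(1.0 - p_v / p_p)
    samples.sort()
    return samples

def summary(samples, level=0.95):
    """Lower bound, median, and upper bound of an equal-tailed interval."""
    lo = samples[int((1 - level) / 2 * len(samples))]
    hi = samples[int((1 + level) / 2 * len(samples)) - 1]
    return lo, samples[len(samples) // 2], hi

# Hypothetical counts, NOT taken from any published trial.
samples = efficacy_posterior(10, 20000, 100, 20000)
lo, med, hi = summary(samples)
print(f"efficacy: median {med:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

With so few Binomial outcomes, everything beyond this (homogeneity, shared hyperparameters, the choice of prior) is modelling choice rather than data.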

Bayes @ NYT

Posted in Books, Kids, Statistics, University life on August 8, 2020 by xi'an

A tribune in the NYT of yesterday on the importance of being Bayesian. When an epidemiologist. Tribune that was forwarded to me by a few friends (and which I missed in my addictive monitoring of the journal!). It is written by a Canadian journalist writing about mathematics (and obviously statistics). And it brings to the general public the main motivation for adopting a Bayesian approach, namely its coherent handling of uncertainty and its ability to update in the face of new information. (Although it might be noted that other flavours of statistical analysis are also able to update their conclusions when given more data.) The COVID situation is a perfect case study in Bayesianism, in that there are so many levels of uncertainty and imprecision, from the models themselves, to the data, to the outcome of the tests, &c. The article is journalisty, of course, but it quotes from a range of statisticians and epidemiologists, including Susan Holmes, who I learned was quarantined 105 days in rural Portugal!, developing a hierarchical Bayes modelling of the prevalent SEIR model, and David Spiegelhalter, discussing Cromwell’s Law (or better, humility law, for avoiding the reference to a fanatic and tyrannical Puritan who put Ireland to fire and the sword!, and had in fact very little humility for himself). Reading the comments is both hilarious (it does not take long to reach the point when Trump is mentioned, and Taleb’s stance on models and tails makes an appearance) and revealing, as many readers do not understand the meaning of Bayes’ inversion between causes and effects, or even the meaning of Jeffreys’ bar, |, as conditioning.
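For the record, the inversion in question can be written in one line. The notation is mine, with C standing for a cause (say, being infected), C^c for its complement, and E for an observed effect (say, a positive test); the bar reads “given”:

```latex
P(C \mid E) = \frac{P(E \mid C)\,P(C)}{P(E \mid C)\,P(C) + P(E \mid C^c)\,P(C^c)}
```

The left-hand side conditions on the effect, the right-hand side only involves probabilities conditional on the causes, weighted by their prior probabilities.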

Monte Carlo Markov chains

Posted in Books, Statistics, University life on May 12, 2020 by xi'an

Darren Wraith pointed out this (currently free access) Springer book by Massimiliano Bonamente [whose family name means good spirit in Italian] to me for its use of the unusual Monte Carlo Markov chain rendering of MCMC. (Google Trends seems to restrict its use to California!) This is a graduate text for physicists, but one could nonetheless expect more rigour in the processing of the topics. Particularly of the Bayesian topics. Here is a pot-pourri of memorable quotes:

“Two major avenues are available for the assignment of probabilities. One is based on the repetition of the experiments a large number of times under the same conditions, and goes under the name of the frequentist or classical method. The other is based on a more theoretical knowledge of the experiment, but without the experimental requirement, and is referred to as the Bayesian approach.”

“The Bayesian probability is assigned based on a quantitative understanding of the nature of the experiment, and in accord with the Kolmogorov axioms. It is sometimes referred to as empirical probability, in recognition of the fact that sometimes the probability of an event is assigned based upon a practical knowledge of the experiment, although without the classical requirement of repeating the experiment for a large number of times. This method is named after the Rev. Thomas Bayes, who pioneered the development of the theory of probability.”

“The likelihood P(B/A) represents the probability of making the measurement B given that the model A is a correct description of the experiment.”

“…a uniform distribution is normally the logical assumption in the absence of other information.”

“The Gaussian distribution can be considered as a special case of the binomial, when the number of tries is sufficiently large.”

“This clearly does not mean that the Poisson distribution has no variance—in that case, it would not be a random variable!”

“The method of moments therefore returns unbiased estimates for the mean and variance of every distribution in the case of a large number of measurements.”

“The great advantage of the Gibbs sampler is the fact that the acceptance is 100 %, since there is no rejection of candidates for the Markov chain, unlike the case of the Metropolis–Hastings algorithm.”

Let me then point out (or just whine about!) the book using “statistical independence” for plain independence, the use of / rather than Jeffreys’ | for conditioning (and sometimes forgetting \ in some LaTeX formulas), the confusion between events and random variables, esp. when computing the posterior distribution, between models and parameter values, the reliance on discrete probability for continuous settings, as in the Markov chain chapter, confusing density and probability, using Mendel’s pea data without mentioning the unlikely fit to the expected values (or, as put more subtly by Fisher (1936), “the data of most, if not all, of the experiments have been falsified so as to agree closely with Mendel’s expectations”), presenting Fisher’s and Anderson’s Iris data [a motive for rejection when George was JASA editor!] as “a new classic experiment”, mentioning Pearson but not Lee for the data in the 1903 Biometrika paper “On the laws of inheritance in man” (and woman!), and not accounting for the discrete nature of this data in the linear regression chapter, the three page derivation of the Gaussian distribution from a Taylor expansion of the Binomial pmf obtained by differentiating in the integer argument, spending endless pages on deriving standard properties of classical distributions, this appalling mess of adding over the conditioning atoms with no normalisation in a Poisson experiment

P(X=4|\mu=0,1,2) = \sum_{\mu=0}^2 \frac{\mu^4}{4!}\exp\{-\mu\},

botching the proof of the CLT, which is treated before the Law of Large Numbers, restricting maximum likelihood estimation to the Gaussian and Poisson cases and muddling its meaning by discussing unbiasedness, confusing a drifted Poisson random variable with a drift on its parameter, as well as using the pmf of the Poisson to define an area under the curve (Fig. 5.2), sweeping the improperness of a constant prior under the carpet, defining a null hypothesis as a range of values for a summary statistic, no mention of Bayesian perspectives in the hypothesis testing, model comparison, and regression chapters, having one-dimensional case chapters followed by two-dimensional case chapters, reducing model comparison to the use of the Kolmogorov-Smirnov test, processing bootstrap and jackknife in the Monte Carlo chapter without a mention of importance sampling, stating recurrence results without assuming irreducibility, motivating MCMC by the intractability of the evidence, resorting to the term link to designate the current value of a Markov chain, incorporating the need for a prior distribution in a terrible description of the Metropolis-Hastings algorithm, including a discrete proof for its stationarity, spending many pages on early 1990s MCMC convergence tests rather than discussing the adaptive scaling of proposal distributions, the inclusion of numerical tables [in a 2017 book] and turning Bayes (1763) into Bayes and Price (1763), or Student (1908) into Gosset (1908).
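To make the complaint about the Poisson sum quoted above concrete, here is a small sketch comparing the bare sum over μ = 0, 1, 2 with a properly weighted version. The uniform 1/3 prior over the three values is my assumption for the sake of the illustration (the book’s formula specifies no weights), and the function name is mine as well.

```python
import math

def poisson_pmf(x, mu):
    """P(X = x | mu) for a Poisson variable; mu = 0 puts all mass on x = 0."""
    if mu == 0:
        return 1.0 if x == 0 else 0.0
    return mu ** x * math.exp(-mu) / math.factorial(x)

mus = [0, 1, 2]

# The quoted expression: likelihoods added over the conditioning values.
bare_sum = sum(poisson_pmf(4, mu) for mu in mus)

# A marginal probability needs prior weights (uniform here, by assumption).
prior = {mu: 1 / 3 for mu in mus}
marginal = sum(prior[mu] * poisson_pmf(4, mu) for mu in mus)

# Posterior over mu given X = 4, which sums to one by construction.
posterior = {mu: prior[mu] * poisson_pmf(4, mu) / marginal for mu in mus}

# Summed over x, the bare sum totals 3 rather than 1: not a distribution.
total_mass = sum(poisson_pmf(x, mu) for x in range(60) for mu in mus)

print(bare_sum, marginal, posterior, total_mass)
```

The bare sum is three times the uniform-prior marginal, and as a function of x it integrates to the number of atoms rather than to one.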

[Usual disclaimer about potential self-plagiarism: this post or an edited version of it could possibly appear later in my Books Review section in CHANCE. Unlikely, though!]