Archive for history of statistics

an hypothetical chain of transmissions

Posted in Books, Statistics, University life on August 6, 2021 by xi'an


Posted in Books, Statistics on July 13, 2021 by xi'an

Bayesian sufficiency

Posted in Books, Kids, Statistics on February 12, 2021 by xi'an

“During the past seven decades, an astonishingly large amount of effort and ingenuity has gone into the search for reasonable answers to this question.” D. Basu

Induced by a vaguely related question on X validated, I re-read Basu’s great 1977 JASA paper on the elimination of nuisance parameters. Besides the limitations of competing definitions of conditional, partial, and marginal sufficiency for the parameter of interest, Basu discusses various notions of Bayesian (partial) sufficiency.

“After a long journey through a forest of confusing ideas and examples, we seem to have lost our way.” D. Basu

The starting point is Kolmogorov’s idea (published during WW II) of requiring all marginal posteriors on the parameter of interest θ to depend only on a statistic S(x). But requiring this to hold for all priors voids the notion, as the statistic then needs to be sufficient jointly for θ and σ, as shown by Hájek in the early 1960’s. Following this attempt, Raiffa and Schlaifer introduced a more restricted class of priors, namely those where the nuisance and interest parameters are a priori independent. In this case a conditional factorisation theorem is a sufficient (!) condition for this Q-sufficiency, but not a necessary one, as shown by the N(θ·σ, 1) counter-example (when σ=±1 and θ>0). [When the prior on σ is uniform, the absolute average is Q-sufficient, but is this a positive feature?] This choice of prior separation is somewhat perplexing in that it does not hold under reparameterisation.
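As a quick numerical illustration of the bracketed remark (a sketch of my own, not taken from Basu’s paper), the code below marginalises the N(θ·σ, 1) likelihood over a uniform prior on σ=±1 and checks that two samples with opposite averages lead to marginal likelihoods in θ that differ only by a constant factor. The sample values, the θ grid, and the function name are all made up for the illustration, and the (2π)^(-n/2) constant is dropped since it does not involve θ.

```python
import math

def marginal_likelihood(theta, xs):
    """Likelihood of theta for an iid N(theta*sigma, 1) sample xs,
    averaged over a uniform prior on sigma in {-1, +1}.
    The (2*pi)^(-n/2) constant is omitted (it is free of theta)."""
    def lik(mean):
        return math.exp(-0.5 * sum((x - mean) ** 2 for x in xs))
    return 0.5 * (lik(theta) + lik(-theta))

# Hypothetical samples: same size, averages +1 and -1, different spreads.
sample_a = [0.3, 1.9, 1.1, 0.7]
sample_b = [-2.5, 0.4, -1.3, -0.6]

thetas = [0.1, 0.5, 1.0, 2.0, 3.0]
ratios = [marginal_likelihood(t, sample_a) / marginal_likelihood(t, sample_b)
          for t in thetas]

# If the marginal likelihood depends on the data only through |average|
# (up to a factor free of theta), these ratios are all identical.
print(ratios)
```

The constant ratio means both samples return the same marginal posterior on θ whatever the prior on θ, which is what Q-sufficiency of the absolute average amounts to in this example.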

Basu ends up with three challenges, including the multinomial M(θ·σ,½(1-θ)·(1+σ),½(1+θ)·(1-σ)), with (n¹,n²,n³) as a minimal sufficient statistic. And the joint observation of an Exponential Exp(θ) translated by σ and of an Exponential Exp(σ) translated by -θ, where the prior on σ gets eliminated in the marginal on θ.

why is the likelihood not a pdf?

Posted in Books, pictures, Statistics, University life on January 4, 2021 by xi'an

The return of an old debate on X validated. Can the likelihood be a pdf?! Even though there exist cases where a [version of the] likelihood function shows such a symmetry between the sufficient statistic and the parameter, as e.g. in the Normal mean model, that they are somewhat exchangeable w.r.t. the same measure, the question is somewhat meaningless for a number of reasons that we can all link to Ronald Fisher:

  1. when defining the likelihood function, Fisher (in his 1912 undergraduate memoir!) warns against integrating it w.r.t. the parameter: “the integration with respect to m is illegitimate and has no definite meaning with respect to inverse probability”. The likelihood “is a relative probability only, suitable to compare point with point, but incapable of being interpreted as a probability distribution over a region, or of giving any estimate of absolute probability.” And again in 1922: “[the likelihood] is not a differential element, and is incapable of being integrated: it is assigned to a particular point of the range of variation, not to a particular element of it”.
  2. He introduced the term “likelihood” especially to avoid the confusion: “I perceive that the word probability is wrongly used in such a connection: probability is a ratio of frequencies, and about the frequencies of such values we can know nothing whatever (…) I suggest that we may speak without confusion of the likelihood of one value of p being thrice the likelihood of another (…) likelihood is not here used loosely as a synonym of probability, but simply to express the relative frequencies with which such values of the hypothetical quantity p would in fact yield the observed sample”.
  3. Another point he makes repeatedly (both in 1912 and 1922) is the lack of invariance of the probability measure obtained by attaching a dθ to the likelihood function L(θ) and normalising it into a density: while the likelihood “is entirely unchanged by any [one-to-one] transformation”, this definition of a probability distribution is not. Fisher actually distanced himself from a Bayesian “uniform prior” throughout the 1920’s.

which sums up as the urge to never neglect the dominating measure!
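The lack of invariance in the third point can be checked numerically. The sketch below is an illustration of mine rather than anything found in Fisher: it takes a Binomial likelihood with made-up values x=3 and n=10, normalises it once against dp and once against dφ for φ=logit(p), and compares the mass each version gives to the same region p>½ (equivalently φ>0). The likelihood values agree point by point in both parameterisations, only the dominating measure changes.

```python
import math

def likelihood(p, x=3, n=10):
    """Binomial likelihood kernel p^x (1-p)^(n-x); x and n are illustrative."""
    return p ** x * (1.0 - p) ** (n - x)

def midpoint_integral(f, a, b, steps=100_000):
    """Plain midpoint rule, enough for these smooth integrands."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def sigmoid(phi):
    return 1.0 / (1.0 + math.exp(-phi))

# Normalising L against dp on (0, 1): mass of the region p > 1/2.
mass_p = (midpoint_integral(likelihood, 0.5, 1.0)
          / midpoint_integral(likelihood, 0.0, 1.0))

# Same likelihood, same region, but normalised against dphi, phi = logit(p).
# The range is truncated to [-30, 30], where the tails are negligible.
def lik_phi(phi):
    return likelihood(sigmoid(phi))

mass_phi = (midpoint_integral(lik_phi, 0.0, 30.0)
            / midpoint_integral(lik_phi, -30.0, 30.0))

print(round(mass_p, 4), round(mass_phi, 4))  # about 0.1133 versus 0.0898
```

The two numbers correspond to Beta(4,8) and Beta(3,7) tail probabilities, respectively, so the “probability” of one and the same region moves with the parameterisation while the likelihood itself does not.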

racism, discrimination and statistics – examining the history [at the RSS]

Posted in Books, Statistics, University life on October 23, 2020 by xi'an

The Royal Statistical Society is holding an on-line round table on “Racism, discrimination and statistics – examining the history” on 30 October, at 4pm UK time. The chair is RSS President Deborah Ashby and the speakers are

  • John Aldrich – chair of the RSS History Section
  • Angela Saini – science journalist
  • Stephen Senn – Fisher Memorial Trust