Archive for history of statistics

down with Galton (and Pearson and Fisher…)

Posted in Books, Statistics, University life on July 22, 2019 by xi'an

In the last issue of Significance, which I read in Warwick prior to the conference, there is a most interesting article on Galton’s eugenics, his heritage at University College London (UCL), and the overall trouble with honouring prominent figures of the past with memorials like named buildings or lectures… The starting point of this debate is a protest from some UCL students and faculty about UCL having a lecture room named after the late Francis Galton, who was a professor there and who further donated most of his fortune to the university at his death, towards creating a professorship in eugenics. The protests are about Galton’s involvement in the eugenics movement of the late 19th and early 20th century, as well as his professing racist opinions.

My first reaction after reading about these protests was why not?! Named places or lectures, as well as statues and other memorials, have a limited utility, especially when the named person is long dead, and they certainly do not contribute to making a scientific theory [associated with the said individual] more appealing or more valid. And since “humans are [only] humans”, to quote Stephen Stigler speaking in this article, it is unrealistic to expect great scientists to be perfect, all the more so if one multiplies the codes for ethical or acceptable behaviours across ages and cultures. It is also more rational to use amphitheater MS.02 and lecture room AC.18 rather than associate them with one name chosen out of many alumni’s or former professors’.

Predictably, another reaction of mine was why bother?!, as removing Galton’s name from the items it is attached to is highly unlikely to change current views on eugenics or racism. On the contrary, it seems to detract from opposing the present versions of these ideologies, such as some recent proposals linking genes and some form of academic success. Another of my (multiple) reactions was that, as stated in the article, these views of Galton’s reflected the views and prejudices of the time, when the notions of races and inequalities between races (as well as genders and social classes) were almost universally accepted, including in scientific publications like the proceedings of the Royal Society and Nature. This was a time when Karl Pearson launched the Annals of Eugenics in 1925 (after he started Biometrika) with the very purpose of establishing a scientific basis for eugenics. (An editorship that Ronald Fisher would later take over, along with his views on the differences between races, believing that “human groups differ profoundly in their innate capacity for intellectual and emotional development”.) Starting from these prejudiced views, Galton set up a scientific and statistical approach to support them, by accumulating data and possibly modifying some of these views, but without much empathy for the consequences, as shown in this terrible quote I found when looking for more material:

“I should feel but little compassion if I saw all the Damaras in the hand of a slave-owner, for they could hardly become more wretched than they are now…”

As it happens, my first exposure to Galton was in my first probability course at ENSAE, when a terrific professor was peppering his lectures with historical anecdotes and used to mention Galton’s data-gathering trip to Namibia, literally measuring local inhabitants in support of his physiognomical views, also reflected in his attempt to superpose photographs to achieve the “ideal” thief…


Posted in Books, pictures, Statistics, University life on December 18, 2013 by xi'an

My fourth Bayes-250 and presumably the last one, as it starts sounding like Groundhog Day!

Stephen Stigler started the day with three facts or items of inference on Thomas Bayes: the first one was about The Essay and its true title, a recent piece of research I made use of in Budapest. As reported in his Statistical Science paper, Stigler found an off-print of Bayes’ Essay with an altogether different title: “A Method of Calculating the Exact Probability of All Conclusions founded on Induction”, which sounds much better than the title of the version published in the Philosophical Transactions of the Royal Society, “An Essay towards solving a Problem in the Doctrine of Chances”, and appears as part of a larger mathematical construct in answering Hume’s dismissal of miracles… (Dennis Lindley in a personal communication to Stephen acknowledged the importance of the title and regretted “as an atheist” that the theorem was intended for religious usage!)

Stephen then discussed Bayes’s portrait, which (first?) appeared in June 1933 in The American Conservationist, where it is acknowledged as taken from the Wing collection of the Newberry library in Chicago (where Stephen has not yet unearthed the said volume!). My suggestion would be to use a genealogy algorithm to check whether or not paternity can be significantly rejected by comparing the two portraits. The more portraits from Bayes’ family, the better.

Steven Fienberg took over for another enjoyable historical talk about the neo-Bayesian revival of the 50s, in connection with his BA paper on the appearance of the term Bayesian, appropriately giving a large place to Alan Turing and to Jimmy Savage (whose book does not use the term Bayesian). He also played great videos of Howard Raiffa explaining how he became a (closet) Bayesian and of Jack Good being interviewed by Persi Diaconis. (On a highly personal level, I wonder who in my hotel has named his or her network “Neo Bayesian Revival”!)

In a very unusual format, Adrian Smith and Alan Gelfand ran an exchange around a bottle of Scotch (and a whole amphitheatre), where Adrian recollected his youth at Cambridge and the slow growth of Bayesian statistics in the UK (“a very unorthodox form of inference” in Dennis’ words). I liked very much the way he explained how Dennis Lindley tried to build for statistics the equivalent of the system of axioms Kolmogorov had produced for probability. And even more how Dennis came to the Bayesian side for decision-theoretic reasons. (The end of the exchange was more predictable as being centred on the MCMC revolution.)

Michael Jordan completed the day with a talk oriented much more towards the future. About the growing statistical perspective on document analysis. Document as data indeed. Starting with the bag of words representation. (A side remark was that his paper Latent Dirichlet allocation got more citations than classics like Jim Berger’s 1985 book or Efron’s 1984 book.) The central theme of the talk was that there is much work left to be done to address real problems. Really real problems with computational issues orders of magnitude away from what we can propose today. Michael took linguistics as a final example. Linking with Adrian’s conclusion in that respect.

Bayesian introductions at IXXI

Posted in Mountains, Statistics, Travel, University life on October 28, 2013 by xi'an

Ten days ago I did a lightning-fast visit to Grenoble for a quick introduction to Bayesian notions during a Bayesian day organised by Michael Blum. It was supported by IXXI, Rhône Alpes Complex Systems Institute, a light structure that favors interdisciplinary research to model complex systems such as biological or social systems, technological networks… This was an opportunity to recycle my Budapest overview from Bayes 250th to Bayes 2.5.0. (As I have changed my email signature initial from X to IX, I further enjoyed the name IXXI!) More seriously, I appreciated (despite the too short time spent there!) the mix of perspectives and disciplines represented in this introduction, from Bayesian networks and causality in computer science and medical expert systems, to neurosciences and the Bayesian theory of mind, to Bayesian population genetics. And hence the mix of audiences. The part about neurosciences and social representations of others’ minds reminded me of the discussion with Pierre Bessières we had a year ago on France Culture. Again, I am quite sorry and apologetic for having missed part of the day and opportunities for discussions, simply because of a tight schedule this week…

Bayes’ notebook

Posted in Books, pictures, Statistics, University life on July 22, 2013 by xi'an

PLoS topic page on ABC

Posted in Books, pictures, R, Statistics, University life on June 7, 2012 by xi'an

A few more comments on the specific entry on ABC written by Mikael Sunnåker et al… The entry starts with the representation of the posterior probability of a hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of a hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the “outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution”, I think they are misleading the readers as they forget the “approximate” aspect of this distribution. Further below, I would have used the title “Insufficient summary statistics” rather than “Sufficient summary statistics”, as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on “Choice and sufficiency of summary statistics” should bother with the sufficiency aspects… It seems to me much more relevant to assess the impact on predictive performances.)

Although this is most minor, I would not have made mention of the (rather artificial) “table for interpretation of the strength in values of the Bayes factor (…) originally published by Harold Jeffreys[6]”. I obviously appreciate very much that the authors advertise our warning about the potential lack of validity of an ABC based Bayes factor! I also like the notion of “quality control”, even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section “Pitfalls and remedies” is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about “Prior distribution and parameter ranges”, in that this is not a problem inherent to ABC… (Granted, the authors present this as a “general risks in statistical inference exacerbated in ABC”, which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle when envisioning ABC as a non-parametric method of inference.
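To make the remarks on the approximate nature of the ABC output and on the role of the tolerance more concrete, here is a minimal sketch of an ABC rejection sampler. It is not the pseudo-example of the PLoS entry: the toy model (normal observations with unit variance and a normal prior on the mean), the function names, the prior scale, the sample sizes and the two tolerance values are all assumptions of mine, chosen only because the true posterior is then available in closed form for comparison.

```python
import random
import statistics


def abc_rejection(y_obs, n_draws, eps, prior_sd=3.0, seed=1):
    """ABC rejection for the mean of N(theta, 1) data under a N(0, prior_sd^2) prior.

    Toy setting assumed for illustration; the summary statistic is the sample mean.
    """
    rng = random.Random(seed)
    n = len(y_obs)
    s_obs = statistics.fmean(y_obs)
    accepted = []
    for _ in range(n_draws):
        theta = rng.gauss(0.0, prior_sd)               # draw a parameter from the prior
        z = [rng.gauss(theta, 1.0) for _ in range(n)]  # simulate pseudo-data given theta
        if abs(statistics.fmean(z) - s_obs) <= eps:    # keep theta if summaries are eps-close
            accepted.append(theta)
    return accepted


def exact_posterior(y_obs, prior_sd=3.0):
    """Mean and standard deviation of the exact conjugate normal posterior."""
    precision = len(y_obs) + 1.0 / prior_sd**2
    return sum(y_obs) / precision, precision ** -0.5


# Hypothetical observed sample, simulated with a true mean of 1.
data_rng = random.Random(0)
y_obs = [data_rng.gauss(1.0, 1.0) for _ in range(10)]

post_mean, post_sd = exact_posterior(y_obs)
tight = abc_rejection(y_obs, 50_000, eps=0.1)  # small tolerance
loose = abc_rejection(y_obs, 50_000, eps=1.0)  # large tolerance

print("exact posterior:", post_mean, post_sd)
print("ABC, eps=0.1   :", statistics.fmean(tight), statistics.stdev(tight), len(tight))
print("ABC, eps=1.0   :", statistics.fmean(loose), statistics.stdev(loose), len(loose))
```

In this toy case the sample mean is sufficient, so the only source of approximation is the tolerance: the accepted values are distributed from the posterior given that the simulated summary falls within ε of the observed one, which is more dispersed than the true posterior and only approaches it as ε decreases, at the cost of fewer accepted draws.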

Lastly, it is always possible to criticise the coverage of the historical part, since ABC is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. Now, I would suggest adding in this section links to the relevant software like our own DIY-ABC.

(Those comments have also been posted on the PLoS Computational Biology wiki.)