Archive for foundations

distributions for parameters [seminar]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on January 22, 2018 by xi'an
Next Thursday, January 25, Nancy Reid will give a seminar in Paris-Dauphine on distributions for parameters that covers different statistical paradigms and bring a new light on the foundations of statistics. (Coffee is at 10am in the Maths department common room and the talk is at 10:15 in room A, second floor.)

Nancy Reid is University Professor of Statistical Sciences and the Canada Research Chair in Statistical Theory and Applications at the University of Toronto and internationally acclaimed statistician, as well as a 2014 Fellow of the Royal Society of Canada. In 2015, she received the Order of Canada, was elected a foreign associate of the National Academy of Sciences in 2016 and has been awarded many other prestigious statistical and science honours, including the Committee of Presidents of Statistical Societies (COPSS) Award in 1992.

Nancy Reid’s research focuses on finding more accurate and efficient methods to deduce and conclude facts from complex data sets to ultimately help scientists find specific solutions to specific problems.

There is currently some renewed interest in developing distributions for parameters, often without relying on prior probability measures. Several approaches have been proposed and discussed in the literature and in a series of “Bayes, fiducial, and frequentist” workshops and meeting sessions. Confidence distributions, generalized fiducial inference, inferential models, belief functions, are some of the terms associated with these approaches.  I will survey some of this work, with particular emphasis on common elements and calibration properties. I will try to situate the discussion in the context of the current explosion of interest in big data and data science. 

The Seven Pillars of Statistical Wisdom [book review]

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , , , , , , , , on June 10, 2017 by xi'an

I remember quite well attending the ASA Presidential address of Stephen Stigler at JSM 2014, Boston, on the seven pillars of statistical wisdom. In connection with T.E. Lawrence’s 1926 book. Itself in connection with Proverbs IX:1. Unfortunately wrongly translated as seven pillars rather than seven sages.

As pointed out in the Acknowledgements section, the book came prior to the address by several years. I found it immensely enjoyable, first for putting the field in a (historical and) coherent perspective through those seven pillars, second for exposing new facts and curios about the history of statistics, third because of a literary style one would wish to see more often in scholarly texts and of a most pleasant design (and the list of reasons could go on for quite a while, one being the several references to Jorge Luis Borges!). But the main reason is to highlight the unified nature of Statistics and the reasons why it does not constitute a subfield of either Mathematics or Computer Science. In these days where centrifugal forces threaten to split the field into seven or more disciplines, the message is welcome and urgent.

Here are Stephen’s pillars (some comments being already there in the post I wrote after the address):

  1. aggregation, which leads to gain information by throwing away information, aka the sufficiency principle. One (of several) remarkable story in this section is the attempt by Francis Galton, never lacking in imagination, to visualise the average man or woman by superimposing the pictures of several people of a given group. In 1870!
  2. information accumulating at the √n rate, aka precision of statistical estimates, aka CLT confidence [quoting  de Moivre at the core of this discovery]. Another nice story is Newton’s wardenship of the English Mint, with musing about [his] potential exploiting this concentration to cheat the Mint and remain undetected!
  3. likelihood as the right calibration of the amount of information brought by a dataset [including Bayes’ essay as an answer to Hume and Laplace’s tests] and by Fisher in possible the most impressive single-handed advance in our field;
  4. intercomparison [i.e. scaling procedures from variability within the data, sample variation], from Student’s [a.k.a., Gosset‘s] t-test, better understood and advertised by Fisher than by the author, and eventually leading to the bootstrap;
  5. regression [linked with Darwin’s evolution of species, albeit paradoxically, as Darwin claimed to have faith in nothing but the irrelevant Rule of Three, a challenging consequence of this theory being an unobserved increase in trait variability across generations] exposed by Darwin’s cousin Galton [with a detailed and exhilarating entry on the quincunx!] as conditional expectation, hence as a true Bayesian tool, the Bayesian approach being more specifically addressed in (on?) this pillar;
  6. design of experiments [re-enters Fisher, with his revolutionary vision of changing all factors in Latin square designs], with an fascinating insert on the 18th Century French Loterie,  which by 1811, i.e., during the Napoleonic wars, provided 4% of the national budget!;
  7. residuals which again relate to Darwin, Laplace, but also Yule’s first multiple regression (in 1899), Fisher’s introduction of parametric models, and Pearson’s χ² test. Plus Nightingale’s diagrams that never cease to impress me.

The conclusion of the book revisits the seven pillars to ascertain the nature and potential need for an eight pillar.  It is somewhat pessimistic, at least my reading of it was, as it cannot (and presumably does not want to) produce any direction about this new pillar and hence about the capacity of the field of statistics to handle in-coming challenges and competition. With some amount of exaggeration (!) I do hope the analogy of the seven pillars that raises in me the image of the beautiful ruins of a Greek temple atop a Sicilian hill, in the setting sun, with little known about its original purpose, remains a mere analogy and does not extend to predict the future of the field! By its very nature, this wonderful book is about foundations of Statistics and therefore much more set in the past and on past advances than on the present, but those foundations need to move, grow, and be nurtured if the field is not to become a field of ruins, a methodology of the past!

La déraisonnable efficacité des mathématiques

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , , , , on May 11, 2017 by xi'an

Although it went completely out of my mind, thanks to a rather heavy travel schedule, I gave last week a short interview about the notion of mathematical models, which got broadcast this week on France Culture, one of the French public radio channels. Within the daily La Méthode Scientifique show, which is a one-hour emission on scientific issues, always a [rare] pleasure to listen to. (Including the day they invited Claire Voisin.) The theme of the show that day was about the unreasonable effectiveness of mathematics, with the [classical] questioning of whether it is an efficient tool towards solving scientific (and inference?) problems because the mathematical objects pre-existed their use or we are (pre-)conditioned to use mathematics to solve problems. I somewhat sounded like a dog in a game of skittles, but it was interesting to listen to the philosopher discussing my relativistic perspective [provided you understand French!]. And I appreciated very much the way Céline Loozen the journalist who interviewed me sorted the chaff from the wheat in the original interview to make me sound mostly coherent! (A coincidence: Jean-Michel Marin got interviewed this morning on France Inter, the major public radio, about the Grothendieck papers.)

machine learning and the future of realism

Posted in Books, Kids, Statistics, University life with tags , , , , , , , , on May 4, 2017 by xi'an

Giles and Cliff Hooker arXived a paper last week with this intriguing title. (Giles Hooker is an associate professor of statistics and biology at Cornell U, with an interesting blog on the notion of models, while Cliff Hooker is a professor of philosophy at Newcastle U, Australia.)

“Our conclusion is that simplicity is too complex”

The debate in this short paper is whether or not machine learning relates to a model. Or is it concerned with sheer (“naked”) prediction? And then does it pertain to science any longer?! While it sounds obvious at first, defining why science is more than prediction of effects given causes is much less obvious, although prediction sounds more pragmatic and engineer-like than scientific. (Furthermore, prediction has a somewhat negative flavour in French, being used as a synonym to divination and opposed to prévision.) In more philosophical terms, prediction offers no ontological feature. As for a machine learning structure like a neural network being scientific or a-scientific, its black box nature makes it much more the later than the former, in that it brings no explanation for the connection between input and output, between regressed and regressors. It further lacks the potential for universality of scientific models. For instance, as mentioned in the paper, Newton’s law of gravitation applies to any pair of weighted bodies, while a neural network built on a series of observations could not be assessed or guaranteed outside the domain where those observations are taken. Plus, would miss the simple square law established by Newton. Most fascinating questions, undoubtedly! Putting the stress on models from a totally different perspective from last week at the RSS.

As for machine learning being a challenge to realism, I am none the wiser after reading the paper. Utilising machine learning tools to produce predictions of causes given effects does not seem to modify the structure of the World and very little our understanding of it, since they do not bring explanation per se. What would lead to anti-realism is the adoption of those tools as substitutes for scientific theories and models.

principles or unprincipled?!

Posted in Books, Kids, pictures, Statistics, Travel with tags , , , , , , , on May 2, 2017 by xi'an

A lively and wide-ranging discussion during the Bayes, Fiducial, Frequentist conference was about whether or not we should look for principles. Someone mentioned Terry Speed (2016) claim that it does not help statistics in being principled. Against being efficient. Which gets quite close in my opinion to arguing in favour of a no-U-turn move to machine learning—which requires a significant amount of data to reach this efficiency, as Xiao-Li Meng mentioned—. The debate brought me back to my current running or droning argument on the need to accommodate [more] the difference between models and reality. Not throwing away statistics and models altogether, but developing assessments that are not fully chained to those models. While keeping probabilistic models to handle uncertainty. One pessimistic conclusion I drew from the discussion is that while we [as academic statisticians] may set principles and even teach our students how to run principled and ethical statistical analyses, there is not much we can do about the daily practice of users of statistics…

Fourth Bayesian, Fiducial, and Frequentist Conference

Posted in Books, pictures, Statistics, Travel, University life, Wines with tags , , , , , , , on March 29, 2017 by xi'an

Next May 1-3, I will attend the 4th Bayesian, Fiducial and Frequentist Conference at Harvard University (hopefully not under snow at that time of year), which is a meeting between philosophers and statisticians about foundational thinking in statistics and inference under uncertainty. This should be fun! (Registration is now open.)

Le bayésianisme aujourd’hui

Posted in Books, Statistics with tags , , , , , , , on September 19, 2016 by xi'an

A few years ago, I was asked by Isabelle Drouet to contribute a chapter to a multi-disciplinary book on the Bayesian paradigm, book that is now soon to appear. In French. It has this rather ugly title of Bayesianism today. Not that I had hear of Bayesianism or bayésianime previously. There are chapters on the Bayesian notion(s) of probability, game theory, statistics, on applications, and on the (potentially) Bayesian structure of human intelligence. Most of it is thus outside statistics, but I will certainly read through it when I receive my copy.