Archive for history of statistics

Bayes-250

Posted in Books, pictures, Statistics, University life on December 18, 2013 by xi'an

My fourth Bayes-250 and presumably the last one, as it starts sounding like groundhog day!

Stephen Stigler started the day with three facts or items of inference on Thomas Bayes: the first one was about The Essay and its true title, a recent finding I made use of in Budapest. As reported in his Statistical Science paper, Stigler found an off-print of Bayes’ Essay with an altogether different title: “A Method of Calculating the Exact Probability of All Conclusions founded on Induction”, which sounds much better than the title of the version published in the Philosophical Transactions of the Royal Society, “An Essay towards solving a Problem in the Doctrine of Chances”, and appears as part of a larger mathematical construct in answering Hume’s dismissal of miracles… (Dennis Lindley in a personal communication to Stephen acknowledged the importance of the title and regretted “as an atheist” that the theorem was intended for religious usage!)

Stephen then discussed Bayes’s portrait, which (first?) appeared in June 1933 in The American Conservationist. It is acknowledged there as taken from the Wing collection of the Newberry Library in Chicago (where Stephen has not yet unearthed the said volume!). My suggestion would be to use a genealogy algorithm to check whether or not paternity can be significantly rejected by comparing the two portraits. The more portraits from Bayes’ family, the better.

Steven Fienberg took over for another enjoyable historical talk about the neo-Bayesian revival of the 50s. In connection with his BA paper on the appearance of the term Bayesian. Giving appropriately a large place to Alan Turing. And Jimmy Savage (whose book does not use the term Bayesian). He also played great videos of Howard Raiffa explaining how he became a (closet) Bayesian. And of Jack Good being interviewed by Persi Diaconis. (On a highly personal level, I wonder who in my hotel has named his or her network “Neo Bayesian Revival”!)

In a very unusual format, Adrian Smith and Alan Gelfand ran an exchange around a bottle of Scotch (and a whole amphitheatre), where Adrian recollected his youth at Cambridge and the slow growth of Bayesian statistics in the UK (“a very unorthodox form of inference” in Dennis’ words). I liked very much the way he explained how Dennis Lindley tried to build for statistics the equivalent of the system of axioms Kolmogorov had produced for probability. And even more how Dennis came to the Bayesian side for decision-theoretic reasons. (The end of the exchange was more predictable as being centred on the MCMC revolution.)

Michael Jordan completed the day with a talk oriented much more towards the future. About the growing statistical perspective on document analysis. Document as data indeed. Starting with the bag of words representation. (A side remark was that his paper Latent Dirichlet allocation got more citations than classics like Jim Berger’s 1985 book or Efron’s 1984 book.) The central theme of the talk was that there is much work left to be done to address real problems. Really real problems with computational issues orders of magnitude away from what we can propose today. Michael took linguistics as a final example. Linking with Adrian’s conclusion in that respect.

Bayesian introductions at IXXI

Posted in Mountains, Statistics, Travel, University life on October 28, 2013 by xi'an

Ten days ago I did a lightning-fast visit to Grenoble for a quick introduction to Bayesian notions during a Bayesian day organised by Michael Blum. It was supported by IXXI, Rhône Alpes Complex Systems Institute, a light structure that favors interdisciplinary research to model complex systems such as biological or social systems, technological networks… This was an opportunity to recycle my Budapest overview from Bayes 250th to Bayes 2.5.0. (As I have changed my email signature initial from X to IX, I further enjoyed the name IXXI!) More seriously, I appreciated (despite the too short time spent there!) the mix of perspectives and disciplines represented in this introduction, from Bayesian networks and causality in computer science and medical expert systems, to neurosciences and the Bayesian theory of mind, to Bayesian population genetics. And hence the mix of audiences. The part about neurosciences and social representations of others’ minds reminded me of the discussion with Pierre Bessières we had a year ago on France Culture. Again, I am quite sorry and apologetic for having missed part of the day and opportunities for discussions, simply because of a tight schedule this week…

Bayes’ notebook

Posted in Books, pictures, Statistics, University life on July 22, 2013 by xi'an

PLoS topic page on ABC

Posted in Books, pictures, R, Statistics, University life on June 7, 2012 by xi'an

A few more comments on the specific entry on ABC written by Mikael Sunnåker et al…. The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the “outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution”, I think they are misleading the readers as they forget the “approximative” aspect of this distribution. Further below, I would have used the title “Insufficient summary statistics” rather than “Sufficient summary statistics”, as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on “Choice and sufficiency of summary statistics” should bother with the sufficiency aspects… It seems to me much more relevant to assess the impact on predictive performances.)

Although this is most minor, I would not have made mention of the (rather artificial) “table for interpretation of the strength in values of the Bayes factor (…) originally published by Harold Jeffreys[6]”. I obviously appreciate very much that the authors advertise our warning about the potential lack of validity of an ABC based Bayes factor! I also like the notion of “quality control”, even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section “Pitfalls and remedies” is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about “Prior distribution and parameter ranges”, in that this is not a problem inherent to ABC… (Granted, the authors present this as a “general risks in statistical inference exacerbated in ABC”, which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle when envisioning ABC as a non-parametric method of inference.
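To make the remarks on the “approximative” nature of the ABC output and on the tolerance more concrete, here is a minimal sketch of ABC rejection on a toy problem. Everything in it is an assumption chosen for illustration and is not taken from the PLoS entry: a normal sample with known standard deviation, a normal prior on the mean, the sample mean as (here sufficient) summary statistic, and the hypothetical names `abc_rejection` and `exact_posterior`. Because the model is conjugate, the true posterior is available and the ABC sample can be compared with it for two arbitrary tolerances.

```python
import random
import statistics


def abc_rejection(observed, prior_sd, noise_sd, eps, n_draws, rng):
    """Rejection ABC for the mean of a normal sample (toy model, illustrative only)."""
    n = len(observed)
    s_obs = statistics.fmean(observed)  # summary statistic of the observed data
    accepted = []
    for _ in range(n_draws):
        theta = rng.gauss(0.0, prior_sd)  # draw the parameter from the N(0, prior_sd^2) prior
        simulated = [rng.gauss(theta, noise_sd) for _ in range(n)]  # simulate a pseudo-dataset
        # keep theta only if the simulated summary falls within eps of the observed one
        if abs(statistics.fmean(simulated) - s_obs) <= eps:
            accepted.append(theta)
    return accepted


def exact_posterior(observed, prior_sd, noise_sd):
    """Mean and standard deviation of the conjugate normal posterior."""
    n = len(observed)
    precision = 1.0 / prior_sd ** 2 + n / noise_sd ** 2
    mean = (sum(observed) / noise_sd ** 2) / precision
    return mean, precision ** -0.5


rng = random.Random(1)
observed = [rng.gauss(1.0, 1.0) for _ in range(20)]  # synthetic data, true mean 1
post_mean, post_sd = exact_posterior(observed, prior_sd=3.0, noise_sd=1.0)

results = {}
for eps in (1.0, 0.1):  # arbitrary tolerances, one loose and one tight
    sample = abc_rejection(observed, 3.0, 1.0, eps, 20000, rng)
    results[eps] = (len(sample), statistics.fmean(sample), statistics.stdev(sample))
    print(f"eps={eps}: kept {results[eps][0]}, mean {results[eps][1]:.3f}, "
          f"sd {results[eps][2]:.3f} (exact: mean {post_mean:.3f}, sd {post_sd:.3f})")
```

In this sketch the accepted values follow the posterior given the event that the simulated summary lies within ε of the observed one, not the posterior given the data: the looser tolerance yields a visibly more spread-out sample than the true posterior, while the tighter one gets closer at the price of far fewer accepted draws.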

Finally, it is always possible to criticise the coverage of the historical part, since ABC is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. Now, I would suggest adding in this section links to the relevant software like our own DIY-ABC.

(Those comments have also been posted on the PLoS Computational Biology wiki.)

the biggest change

Posted in Statistics, University life on September 29, 2011 by xi'an

The current question for the ISBA Bulletin is “What is the biggest and most surprising change in the field of Statistics that you have witnessed, and what do you think will be the next one?” The answer to the second part is easy: I do not know and even if I knew I would be writing papers about it rather than spilling the beans… The answer to the first part is anything but easy. At the most literal level, taking “witnessed” at face value, I have witnessed the “birth” of Markov chain Monte Carlo methods at the conference organised in Sherbrooke by Jean-Francois Angers in June 1989… (This was already reported in our Short history of MCMC with George Casella.) I clearly remember Adrian showing the audience a slide with about ten lines of Fortran code that corresponded to the Gibbs sampler for a Bayesian analysis of a mixed effect linear model (later to be analysed in JASA). This was so shockingly simple… It certainly was the talk that had the most impact on my whole career, even though (a) I would have certainly learned about MCMC quickly enough had I missed the Sherbrooke conference and (b) there were other talks in my academic life that also induced that “wow” moment, for sure. At a less literal level, the biggest change, if not the most surprising, is that the field has become huge, multifaceted, and ubiquitous. When I started studying statistics, it was certainly far from being the sexiest possible field! (At least in the general public.) And the job offers were not as numerous and diverse as they are today. (The same is true for Bayesian statistics, of course. Even though it has sounded sexy from the start!)
