Archive for history of statistics

Bayes-250

Posted in Books, pictures, Statistics, University life with tags , , , on December 18, 2013 by xi'an

BayesMy fourth Bayes-250 and presumably the last one, as it starts sounding like groundhog day!

Stephen Stigler started the day with three facts or items of inference on Thomas Bayes: the first one was about The Essay and its true title, a recent research I made use of in Budapest. As reported in his Statistical Science paper, Stigler found an off-print of Bayes’ Essay with an altogether different title: “A Method of Calculating the Exact Probability of All Conclusions founded on Induction”, which sounds much better than the title of the version published in the Proceedings of the Royal Society, “An Essay toward solving a Problem in the Doctrine of Chances”, and appears as part of a larger mathematical construct in answering Hume’s dismissal of miracles… (Dennis Lindley in a personal communication to Stephen acknowledged the importance of the title and regretted “as an atheist” that the theorem was intended for religious usage!)

Stephen then discussed Bayes’s portrait, which (first?) appeared in June 1933 in The American Conservationist. Herein acknowledged as taken from the Wing collection of the Newberry library in Chicago (where Stephen has not yet unearthed the said volume!) My suggestion would be to use a genealogy algorithm to check whether or not paternity cannot be significantly rejected by comparing the two portraits. The more portraits from Bayes’ family, the better.

Steven Fienberg took over for another enjoyable historical talk about the neo-Bayesian revival of the 50s. In connection with his BA paper on the appearance of the term Bayesian. Giving appropriately a large place to Alan Turing. And Jimmy Savage (whose book does not use the term Bayesian). He also played great videos of Howard Raiffa explaining how he became a (closet) Bayesian. And of Jack Good being interviewed by Persi Diaconis. (On a highly personal level, I wonder who in my hotel has named his or her network “Neo Bayesian Revival”!)

In a very unusual format, Adrian Smith and Alan Gelfand ran an exchange around a bottle of Scotch (and a whole amphitheatre), where Adrian recollected his youth at Cambridge and the slow growth of Bayesian statistics in the UK (“a very unorthodox form of inference” in Dennis’ words). I liked very much the way he explained how Dennis Lindley tried to build for statistics the equivalent of the system of axioms Kolmogorov had produced for probability. And even more how Dennis came to the Bayesian side for decision-theoretic reasons. (The end of the exchange was more predictable as being centred on the MCMC revolution.)

Michael Jordan completed the day with a talk oriented much more towards the future. About the growing statistical perspective on document analysis. Document as data indeed. Starting with the bag of words representation. (A side remark was that his paper Latent Dirichlet allocation got more citations than classics like Jim Berger’s 1985 book or Efron’s 1984 book.) The central theme of the talk was that there is much work left to be done to address real problems. Really real problems with computational issues orders of magnitude away from what we can propose today. Michael took linguistics as a final example. Linking with Adrian’s conclusion in that respect.

Bayesian introductions at IXXI

Posted in Mountains, Statistics, Travel, University life with tags , , , , , , on October 28, 2013 by xi'an

Ten days ago I did a lighting-fast visit to Grenoble for a quick introduction to Bayesian notions during a Bayesian day organised by Michael Blum. It was supported by IXXI, Rhône Alpes Complex Systems Institute, a light structure that favors interdisciplinary research to model complex sytems such as biological or social systems, technological networks… This was an opportunity to recycle my Budapest overview from Bayes 250th to Bayes 2.5.0. (As I have changed my email signature initial from X to IX, I further enjoyed the name IXXI!) More seriously, I appreciated (despite the too short time spent there!) the mix of perspectives and disciplines represented in this introduction, from Bayesian networks and causality in computer science and medical expert systems, to neurosciences and the Bayesian theory of mind, to Bayesian population genetics. And hence the mix of audiences. The part about neurosciences and social representations on others’ mind reminded me of the discussion with Pierre Bessières we had a year ago on France Culture. Again, I am quite sorry and apologetic for having missed part of the day and opportunities for discussions, simply because of a tight schedule this week…

Bayes’ notebook

Posted in Books, pictures, Statistics, University life with tags , , , on July 22, 2013 by xi'an

PLoS topic page on ABC

Posted in Books, pictures, R, Statistics, University life with tags , , , , , , , , , on June 7, 2012 by xi'an

A few more comments on the specific entry on ABC written by Mikael Sunnåker et al…. The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the “outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution”, I think they are misleading the readers as they forget the “approximative” aspect of this distribution. Further below, I would have used the title “Insufficient summary statistics” rather than “Sufficient summary statistics”, as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on “Choice and sufficiency of summary statistics” should bother with the sufficiency aspects… It seems to me much more relevant to assess the impact on predictive performances.)

Although this is most minor, I would not have made mention of the (rather artificial) “table for interpretation of the strength in values of the Bayes factor (…) originally published by Harold Jeffreys[6] “. I obviously appreciate very much that the authors advertise our warning about the potential lack of validity of an ABC based Bayes factor! I also like the notion of “quality control”, even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section “Pitfalls and remedies” is remarkable in that it details the necessary steps for validating a ABC implementation: the only entry I would remove is the one about “Prior distribution and parameter ranges”, in that this is not a problem inherent to ABC… (Granted, the authors present this as a “general risks in statistical inference exacerbated in ABC”, which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero. As discussed in the recent Read Paper by Fearnhead and Prangle when envisioning ABC as a non-parametric method of inference.

At last, it is always possible to criticise the coverage of the historical part, since ABC is such a recent field that it is constantly evolving. But the authors correctly point out to (Don) Rubin on the one hand and to Diggle and Graton on the other. Now, I would suggest adding in this section links to the relevant softwares like our own DIY-ABC

(Those comments have also been posted on the PLoS Computational Biology wiki.)

the biggest change

Posted in Statistics, University life with tags , , , , , , on September 29, 2011 by xi'an

The current question for the ISBA Bulletin is “What is the biggest and most surprising change in the field of Statistics that you have witnessed, and what do you think will be the next one?” The answer to the second part is easy: I do not know and even if I knew I would be writing papers about it rather than spilling the beans… The answer to the first part is anything but easy. At the most literal level, taking “witnessed” at face value, I have witnessed the “birth” of Markov chain Monte Carlo methods at the conference organised in Sherbrooke by Jean-Francois Angers in June 1989… (This was already reported in our Short history of MCMC with George Casella.) I clearly remember Adrian showing the audience a slide with about ten lines of Fortran code that corresponded to the Gibbs sampler for a Bayesian analysis of a mixed effect linear model (later to be analysed in JASA). This was so shockingly simple… It certainly was the talk that had the most impact on my whole career, even though (a) I would have certainly learned about MCMC quickly enough had I missed the Sherbrooke conference and (b) there were other talks in my academic life that also induced that “wow” moment, for sure. At a less literal level, the biggest chance if not the most surprising is that the field has become huge, multifaceted, and ubiquitous. When I started studying statistics, it was certainly far from being the sexiest possible field! (At least in the general public) And the job offers were not as numerous and diverse as they are today. (The same is true for Bayesian statistics, of course. Even though it has sounded sexy from the start!)

Handbook of Markov chain Monte Carlo

Posted in Books, R, Statistics, University life with tags , , , , , , , , , , , , , , on September 22, 2011 by xi'an

At JSM, John Kimmel gave me a copy of the Handbook of Markov chain Monte Carlo, as I had not (yet?!) received it. This handbook is edited by Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng, all first-class jedis of the MCMC galaxy. I had not had a chance to get a look at the book until now as Jean-Michel Marin took it home for me from Miami, but, as he remarked in giving it back to me last week, the outcome truly is excellent! Of course, authors and editors being friends of mine, the reader may worry about the objectivity of this assessment; however the quality of the contents is clearly there and the book appears as a worthy successor to the tremendous Markov chain Monte Carlo in Practice by Wally Gilks, Sylvia Richardson and David Spiegelhalter. (I can attest to the involvement of the editors from the many rounds of reviews we exchanged about our MCMC history chapter!) The style of the chapters is rather homogeneous and there are a few R codes here and there. So, while I will still stick to our Monte Carlo Statistical Methods book for teaching MCMC to my graduate students next month, I think the book can well be used at a teaching level as well as a reference on the state-of-the-art MCMC technology. Continue reading

the theory that would not die…

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on September 19, 2011 by xi'an

A few days ago, I had lunch with Sharon McGrayne in a Parisian café and we had a wonderful chat about the people she had met during the preparation of her book, the theory that would not die. Among others, she mentioned the considerable support provided by Dennis Lindley, Persi Diaconis, and Bernard Bru. She also told me about a few unsavoury characters who simply refused to talk to her about the struggles and rise of Bayesian statistics. Then, once I had biked home, her book had at last arrived in my mailbox! How timely! (Actually, getting the book before would have been better, as I would have been able to ask more specific questions. But it seems the publisher, Yale University Press, had not forecasted the phenomenal success of the book and thus failed to scale the reprints accordingly!)

 Here is thus my enthusiastic (and obviously biased) reaction to the theory that would not die. It tells the story and the stories of Bayesian statistics and of Bayesians in a most genial and entertaining manner. There may be some who will object to such a personification of science, which should be (much) more than the sum of the characters who contributed to it. However, I will defend the perspective that (Bayesian) statistical science is as much philosophy as it is mathematics and computer-science, thus that the components that led to its current state were contributed by individuals, for whom the path to those components mattered. While the book inevitably starts with the (patchy) story of Thomas Bayes’s life, incl. his passage in Edinburgh, and a nice non-mathematical description of his ball experiment, the next chapter is about “the man who did everything”, …, yes indeed, Pierre-Simon (de) Laplace himself! (An additional nice touch is the use of lower case everywhere, instead of an inflation of upper case letters!) How Laplace attacked the issue of astronomical errors is brilliantly depicted, rooting the man within statistics and explaining  why he would soon move to the “probability of causes”. And rediscover plus generalise Bayes’ theorem. That his (rather unpleasant!) thirst for honours and official positions would cause later disrepute on his scientific worth is difficult to fathom, esp. when coming from knowledgeable statisticians like Florence Nightingale David. The next chapter is about the dark ages of [not yet] Bayesian statistics and I particularly liked the links with the French army, discovering there that the great Henri Poincaré testified at Dreyfus’ trial using a Bayesian argument, that Bertillon had completely missed the probabilistic point, and that the military judges were then all aware of Bayes’ theorem, thanks to Bertrand’s probability book being used at École Polytechnique! (The last point actually was less of a surprise, given that I had collected some documents about the involvement of late 19th/early 20th century artillery officers in the development of Bayesian techniques, Edmond Lhostes and Maurice Dumas, in connection with Lyle Broemeling’s Biometrika study.) The description of the fights between Fisher and Bayesians and non-Bayesians alike is as always both entertaining and sad. Sad also is the fact that Jeffreys’ masterpiece got so little recognition at the time. (While I knew about Fisher’s unreasonable stand on smoking, going as far as defending the assumption that “lung cancer might cause smoking”(!), the Bayesian analysis of Jerome Cornfield was unknown to me. And quite fascinating.) The figure of Fisher actually permeates the whole book, as a negative bullying figure preventing further developments of early Bayesian statistics, but also as an ambivalent anti-Bayesian who eventually tried to create his own brand of Bayesian statistics in the format of fiducial statistics…

…and then there was the ghastly de Gaulle.” D. Lindley

The following part of the theory that would not die is about Bayes’ contributions to the war (WWII), at least from the Allied side. Again, I knew most of the facts about Alan Turing and Bletchley Park’s Enigma, however the story is well-told and, as in previous occasions, I cannot but be moved by the waste of such a superb intellect, thanks to the stupidity of governments. The role of Albert Madansky in the assessment of the [lack of] safety of nuclear weapons is also well-described, stressing the inevitability of a Bayesian assessment of a one-time event that had [thankfully] not yet happened. The above quote from Dennis Lindley is the conclusion of his argument on why Bayesian statistics were not called Laplacean; I would think instead that the French post-war attraction for abstract statistics in the wake of Bourbaki did more against this recognition than de Gaulle’s isolationism and ghastliness. The involvement of John Tukey into military research was also a novelty for me, but not so much as his use of Bayesian [small area] methods for NBC election night previsions. (They could not hire José nor Andrew at the time.) The conclusion of Chapter 14 on why Tukey felt the need to distance himself from Bayesianism is quite compelling. Maybe paradoxically, I ended up appreciating Chapter 15 even more for the part about the search for a missing H-bomb near Palomares, Spain, as it exposes the plusses a Bayesian analysis would have brought.

There are many classes of problems where Bayesian analyses are reasonable, mainly classes with which I have little acquaintance.” J. Tukey

When approaching near recent times and to contemporaries,  Sharon McGrayne gives a very detailed coverage of the coming-of-age of Bayesians like Jimmy Savage and Dennis Lindley, as well as the impact of Stein’s paradox (a personal epiphany!), along with the important impact of Howard Raiffa and Robert Schlaifer, both on business schools and on modelling prior beliefs [via conjugate priors]. I did not know anything about their scientific careers, but Applied Statistical Decision Theory is a beautiful book that prefigured both DeGroot‘s and Berger‘s. (As an aside, I was amused by Raiffa using Bayesian techniques for horse betting based on race bettors, as I had vaguely played with the idea during my spare if compulsory time in the French Navy!) Similarly, while I’d read detailed scientific accounts of Frederick Mosteller’s and David Wallace’s superb Federalist Papers study, they were only names to me. Chapter 12 mostly remedied this lack of mine’s.

We are just starting” P. Diaconis

The final part, entitled Eureka!, is about the computer revolution we witnessed in the 1980’s, culminating with the (re)discovery of MCMC methods we covered in our own “history”. Because it contains stories that are closer and closer to today’s time, it inevitably crumbles into shorter and shorter accounts. However, the theory that would not die conveys the essential message that Bayes’ rule had become operational, with its own computer language and objects like graphical models and Bayesian networks that could tackle huge amounts of data and real-time constraints. And used by companies like Microsoft and Google. The final pages mention neurological experiments on how the brain operates in a Bayesian-like way (a direction much followed by neurosciences, as illustrated by Peggy Series’ talk at Bayes-250).

In conclusion, I highly enjoyed reading through the theory that would not die. And I am sure most of my Bayesian colleagues will as well. Being Bayesians, they will compare the contents with their subjective priors about Bayesian history, but will in the end update those profitably. (The most obvious missing part is in my opinion the absence of E.T Jaynes and the MaxEnt community, which would deserve a chapter on its own.) Maybe ISBA could consider supporting a paperback or electronic copy to distribute to all its members! As an insider, I have little idea on how the book would be perceived by the layman: it does not contain any formula apart from [the discrete] Bayes’ rule at some point, so everyone can read through.  The current success of the theory that would not die shows that it reaches much further than academic circles. It may be that the general public does not necessarily grasp the ultimate difference between frequentist and Bayesians, or between Fisherians and Neyman-Pearsonians. However the theory that would not die goes over all the elements that explain these differences. In particular, the parts about single events are quite illuminating on the specificities of the Bayesian approach. I will certainly [more than] recommend it to all of my graduate students (and buy the French version for my mother once it is translated, so that she finally understands why I once gave a talk “Don’t tell my mom I am Bayesian” at ENSAE…!) If there is any doubt from the above, I obviously recommend the book to all Og’s readers!

Follow

Get every new post delivered to your Inbox.

Join 680 other followers