Archive for Harold Jeffreys

PLoS topic page on ABC

Posted in Books, pictures, R, Statistics, University life on June 7, 2012 by xi'an

A few more comments on the specific entry on ABC written by Mikael Sunnåker et al… The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the “outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution”, I think they are misleading the readers, as they forget the approximate nature of this distribution. Further below, I would have used the title “Insufficient summary statistics” rather than “Sufficient summary statistics”, as it spells out more clearly the fundamental difficulty in using ABC. (And I am not sure the subsequent paragraph on “Choice and sufficiency of summary statistics” should bother with the sufficiency aspects… It seems to me much more relevant to assess the impact on predictive performance.)
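(To fix ideas, here is a minimal R sketch of the ABC rejection algorithm on a toy problem of my own devising, a normal mean with known variance; the names and settings are illustrative, not taken from the PLoS entry. The accepted draws only approximately follow the posterior, with the quality of the approximation depending on the tolerance and on the summary statistic.)

# ABC rejection sketch: data y ~ N(theta, 1), prior theta ~ N(0, 10),
# summary statistic = sample mean
set.seed(42)
n     <- 50
y     <- rnorm(n, mean = 2)             # pseudo-observed data, true theta = 2
s_obs <- mean(y)                        # observed summary
N     <- 1e5                            # number of prior simulations
eps   <- 0.05                           # tolerance on the summary discrepancy
theta <- rnorm(N, 0, sqrt(10))          # simulate from the prior
s_sim <- rnorm(N, theta, 1/sqrt(n))     # sample mean of n N(theta, 1) draws
abc   <- theta[abs(s_sim - s_obs) < eps]  # accepted draws: approximate posterior
length(abc)                             # how many draws survived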

Although this is most minor, I would not have mentioned the (rather artificial) “table for interpretation of the strength in values of the Bayes factor (…) originally published by Harold Jeffreys[6]“. I obviously appreciate very much that the authors advertise our warning about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of “quality control”, even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution, as in the sketch below. The section “Pitfalls and remedies” is remarkable in that it details the necessary steps for validating an ABC implementation; the only entry I would remove is the one about “Prior distribution and parameter ranges”, in that this is not a problem inherent to ABC… (Granted, the authors present this as a “general risks in statistical inference exacerbated in ABC”, which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle, which envisions ABC as a non-parametric method of inference.
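(Along those lines, here is what such a supplement could look like: an R sketch of my own, not the entry's pseudo-example, comparing the ABC output against the exact conjugate posterior for a normal mean at a large n, for contrasting tolerance values.)

# Exact posterior: y_i ~ N(theta, 1), theta ~ N(m0, v0), conjugate update
set.seed(101)
n <- 500; y <- rnorm(n, mean = 2); s_obs <- mean(y)
m0 <- 0; v0 <- 10
post_v <- 1 / (1/v0 + n)                  # exact posterior variance
post_m <- post_v * (m0/v0 + n * s_obs)    # exact posterior mean
abc <- function(eps, N = 1e6) {           # ABC rejection, as in the sketch above
  theta <- rnorm(N, m0, sqrt(v0))
  s_sim <- rnorm(N, theta, 1/sqrt(n))
  theta[abs(s_sim - s_obs) < eps]
}
hist(abc(0.30), freq = FALSE, breaks = 50, xlab = expression(theta),
     main = "ABC output vs exact posterior")
curve(dnorm(x, post_m, sqrt(post_v)), add = TRUE, lwd = 2)
# a loose tolerance (0.30) visibly flattens the histogram relative to the
# exact curve; a tight one (say 0.01) matches it but accepts few draws,
# which is why epsilon must stay positive yet small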

At last, it is always possible to criticise the coverage of the historical part, since ABC is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. Now, I would suggest adding to this section links to the relevant software packages, like our own DIY-ABC.

(Those comments have also been posted on the PLoS Computational Biology wiki.)

May I believe I am a Bayesian?!

Posted in Books, Statistics, University life on January 21, 2012 by xi'an

“…the argument is false that because some ideal form of this approach to reasoning seems excellent in theory it therefore follows that in practice using this and only this approach to reasoning is the right thing to do.” Stephen Senn, 2011

Deborah Mayo, Aris Spanos, and Kent Staley have edited a special issue of Rationality, Markets and Morals (RMM) (a rather weird combination, esp. for a journal name!) on “Statistical Science and Philosophy of Science: Where Do (Should) They Meet in 2011 and Beyond?” for which comments are open. Stephen Senn has a paper therein entitled You May Believe You Are a Bayesian But You Are Probably Wrong in his usual witty, entertaining, and… Bayesian-bashing style! I find it very kind of him to allow us to remain in the wrong, very kind indeed…


Now, the paper somehow intersects with the comments Stephen made on our review of Harold Jeffreys’ Theory of Probability a while ago. It contains a nice introduction to the four great systems of statistical inference, embodied by de Finetti, Fisher, Jeffreys, and Neyman plus Pearson. The main criticism of Bayesianism à la de Finetti is that it is so perfect as to be otherworldly. And, since this perfection is lost in the practical implementation, there is no compelling reason to be a Bayesian. Worse, all practical Bayesian implementations conflict with Bayesian principles; hence a Bayesian author “in practice is wrong”. Stephen concludes with a call for eclecticism, quite in line with his usual style, since this is likely to antagonise everyone. (I wonder whether the absence of a final full stop in the paper has a philosophical meaning. Since I have been caught over-interpreting book covers, I will not say more!) As I will try to explain below, I believe Stephen has paradoxically fallen victim to over-theorising/philosophising himself! (I refer the interested reader to the above post, as well as to my comments on Don Fraser’s “Is Bayes posterior quick and dirty confidence?”, for more related points, esp. about Senn’s criticisms of objective Bayes on page 52 that are not so central to this discussion… Same thing for the different notions of probability [p.49] and the relative difficulties of the terms in (2) [p.50]. Deborah Mayo has a “deconstructed” version of Stephen’s paper on her blog, with a much deeper if deBayesian philosophical discussion. And Andrew Jaffe wrote a post in reply to Stephen’s paper, whose points I cannot discuss for lack of time, but which includes an interesting mention of Jaynes as missing from Senn’s pantheon.)


“The Bayesian theory is a theory on how to remain perfect but it does not explain how to become good.” Stephen Senn, 2011

While associating theories with characters is a reasonable rhetorical device, especially with larger-than-life characters such as the ones above!, I think it deters the reader from a philosophical questioning of the theory behind the (big) man. (In fact, it is a form of bullying or, more politely (?), of having big names shoved down your throat as a form of argument.) In particular, Stephen freezes the (Bayesian reasoning about the) Bayesian paradigm in its de Finetti phase-state, arguing about what de Finetti thought and believed. While this is historically interesting, I do not see why we should care at the praxis level. (I have made similar comments on this blog about the unpleasant aspects of being associated with one character, esp. the mysterious Reverend Bayes!) But this is not my main point.

“…in practice things are not so simple.” Stephen Senn, 2011

The core argument in Senn’s diatribe is that reality is always more complex than the theory allows for, and thus that a Bayesian has to compromise on her/his perfect theory with reality/practice in order to reach decisions. A kind of philosophical equivalent to Achilles and the tortoise. However, it seems to me that the very fact that the Bayesian paradigm is a learning principle implies that imprecisions and imperfections are naturally built into the decision process, thus avoiding the apparent infinite regress (Regress ins Unendliche) of having to run a Bayesian analysis to derive the prior for the Bayesian analysis at the level below (which is how I interpret Stephen’s first paragraph in Section 3). By refusing the transformation of a perfect albeit ideal Bayesian into a practical if imperfect Bayesian (or coherent learner, or whatever name that does not sound like being a member of a sect!), Stephen falls short of incorporating the reality constraint (contrainte de réalité) into his own paradigm. The further criticisms about prior justification, construction, and evaluation (pp.59-60) are of the same kind, namely preventing the statistician from incorporating a degree of (probabilistic) uncertainty into her/his analysis.

In conclusion, reading Stephen’s piece was a pleasant and thought-provoking moment. I am glad to be allowed to believe I am a Bayesian, even though I do not believe it is a belief! The praxis of thousands of scientists using Bayesian tools with their personal degree of subjective involvement is an evolving organism that reaches much further than the highly stylised construct of de Finetti (or of de Finetti as restaged by Stephen!), one that appropriately moves away from claims to being perfect, or right, or even more philosophical.

Bayesian ideas and data analysis

Posted in Books, R, Statistics, Travel, University life on October 31, 2011 by xi'an

Here is [yet!] another Bayesian textbook that appeared recently. I read it in the past few days and, despite my obvious biases and prejudices, I liked it very much! It has a lot in common (at least in spirit) with our Bayesian Core, which may explain why I feel so benevolent towards Bayesian ideas and data analysis. Just like ours, the book by Ron Christensen, Wes Johnson, Adam Branscum, and Timothy Hanson is indeed focused on explaining Bayesian ideas through (real) examples, and it covers a lot of regression models, all the way to non-parametrics. It contains a good proportion of WinBUGS and R code. It intermingles methodology and computational chapters in the first part, before moving to the serious business of analysing more and more complex regression models. Exercises appear throughout the text rather than at the end of the chapters. As their book is longer (over 500 pages), the authors spend more time analysing various datasets in each chapter and, more importantly, provide a rather unique entry on prior assessment and construction, especially in the regression chapters. The author index is rather original in that it links the authors with more than one entry to the topics they are connected with (Ron Christensen winning the game with the highest number of entries).

Error and Inference [#4]

Posted in Books, Statistics on September 21, 2011 by xi'an

(This is the fourth post on Error and Inference, once again a raw and naïve reaction following a linear and slow reading of the book, rather than a deeper and more informed criticism.)

“The defining feature of an inductive inference is that the premises (evidence statements) can be true while the conclusion inferred may be false without a logical contradiction: the conclusion is ‘evidence transcending’.”—D. Mayo and D. Cox, p.249, Error and Inference, 2010

The seventh chapter of Error and Inference, entitled “New perspectives on (some old) problems of frequentist statistics“, is divided into four parts, written by David Cox, Deborah Mayo, and Aris Spanos, in different orders and groups of authors. This is certainly the most statistical of all chapters, no surprise when considering that David Cox is involved, and I thus find it difficult to explain why it took me so long to read through it… Overall, this chapter is quite important for its contribution to the debate on the nature of statistical testing.

“The advantage in the modern statistical framework is that the probabilities arise from defining a probability model to represent the phenomenon of interest. Had Popper made use of the statistical testing ideas being developed at around the same time, he might have been able to substantiate his account of falsification.”—D. Mayo and D. Cox, p.251, Error and Inference, 2010

The first part of the chapter is Mayo’s and Cox’s “Frequentist statistics as a theory of inductive inference“. It was first published in the 2006 Erich Lehmann symposium and is available online as an arXiv paper. There is absolutely no attempt there to link with or clash against the Bayesian approach; this paper is only looking at frequentist statistical theory as the basis for inductive inference. The debate therein about deducing that H is correct from a dataset successfully facing a statistical test is classical (in both senses), but I [unsurprisingly] remain unconvinced by the arguments. The null hypothesis remains the calibrating distribution throughout the chapter, with very little (or at least not enough) consideration of what happens when the null hypothesis does not hold. Section 3.6, about confidence intervals being another facet of testing hypotheses, is representative of this perspective (a quick simulation of this duality appears below). The p-value is defended as the central tool for conducting hypothesis assessment. (In this version of the paper, some p’s are written in roman characters and others in italics, which is a wee bit confusing until one realises that this is a mere typo!) The fundamental imbalance problem, namely that, for contiguous hypotheses, a test cannot be expected both to most often reject the null when it is [very moderately] false and to most often accept the null when it is right, is not discussed there. The argument about substantive nulls in Section 3.5 considers a stylised case of well-separated scientific theories; however, the real world of models is more similar to a greyish (and more Popperian?) continuum of possibles. In connection with this, I would have expected the book to address, on philosophical grounds, Box’s aphorism that “all models are wrong”. Indeed, one (philosophical?) difficulty with the p-values and the frequentist evidence principle (FEV) is that they rely on the strong belief that one given model can be exact or true (while criticising the subjectivity of the prior modelling in the Bayesian approach). Even in the typology of types of null hypotheses drawn by the authors in Section 3, the “possibility of model misspecification” is addressed in terms of the low power of an omnibus test, while agreeing that “an incomplete probability specification” is unavoidable (an argument found at several places in the book, that the alternative cannot be completely specified).
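As an aside, the test/confidence-interval duality mentioned above is easy to check numerically. Here is a small R sketch (my own toy illustration, not taken from the chapter) verifying that the 95% t-interval collects exactly those null values that the corresponding two-sided 5% t-test fails to reject:

set.seed(7)
x  <- rnorm(30, mean = 1)               # toy sample
ci <- t.test(x)$conf.int                # standard 95% t-interval
kept <- function(mu0)                   # is mu0 retained by the 5% test?
  t.test(x, mu = mu0)$p.value >= 0.05
grid <- seq(0, 2, by = 0.001)
range(grid[sapply(grid, kept)])         # recovers ci up to the grid step
ci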

“Sometimes we can find evidence for H0, understood as an assertion that a particular discrepancy, flaw, or error is absent, and we can do this by means of tests that, with high probability, would have reported a discrepancy had one been present.”—D. Mayo and D. Cox, p.255, Error and Inference, 2010

The above quote relates to the Failure and Confirmation section, where the authors try to push the argument in favour of frequentist tests one step further, namely that “moderate p-values” may sometimes be used as confirmation of the null. (I may have misunderstood, as the end of the section defends a purely frequentist, as in repeated experiments, interpretation. This reproduces an earlier argument about the nature of probability in Section 1.2, characterising it as the “stability of relative frequencies of results of repeated trials”.) In fact, this chapter and other recent readings made me think afresh about the nature of probability, a debate that put me off so much in Keynes (1921) and even in Jeffreys (1939). From a mathematical perspective, there is only one “kind” of probability, the one defined via a reference measure and a probability distribution, whether it applies to observations or to parameters. From a philosophical perspective, there is a natural issue about the “truth” or “realism” of the probability quantities and of the probabilistic statements. The book, and in particular this chapter, considers that a truthful probability statement is the one agreeing with “a hypothetical long-run of repeated sampling, an error probability”, while the statistical inference school of Keynes (1921), Jeffreys (1939), and Carnap (1962) “involves quantifying a degree of support or confirmation in claims or hypotheses”, which makes this (Bayesian) approach sound less realistic… Obviously, I have no ambition to solve this long-running debate; however, I see no reason for the first approach to be more realistic by being grounded in stable relative frequencies à la von Mises. If nothing else, the notion that a test should be evaluated on its long-run performance is very idealistic, as the concept relies on an ever-repeating, infinite sequence of identical trials. Relying on probability measures as self-coherent mathematical measures of uncertainty carries (for me) as much (or as little) reality as the above infinite experiment. Now, the paper is not completely entrenched in this interpretation, as it concludes that “what makes the kind of hypothetical reasoning relevant to the case at hand is not the long-run low error rates associated with using the tool (or test) in this manner; it is rather what those error rates reveal about the data generating source or phenomenon” (p.273).

“If the data are so extensive that accordance with the null hypothesis implies the absence of an effect of practical importance, and a reasonably high p-value is achieved, then it may be taken as evidence of the absence of an effect of practical importance.”—D. Mayo and D. Cox, p.263, Error and Inference, 2010

The paper mentions several times conclusions to be drawn from a p-value near one, as in the above quote. This is an interpretation that does not sit well with my understanding of p-values being uniformly distributed under the null: very high p-values should be as suspicious as very low p-values; the simulation below illustrates the point. (This criticism is not new, of course.) Unless one does not strictly adhere to the null model, which brings back the above issue of the approximate nature of any model… I also found it fascinating to read the criticism that “power appertains to a prespecified rejection region, not to the specific data under analysis”, as I thought this applied equally to the p-values, turning “the specific data under analysis” into a departure event of a prespecified kind.
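Here is a quick R simulation of that uniformity claim (my own toy check, not from the chapter): repeating a t-test on data genuinely generated under the null produces p-values that are uniform on (0,1), so values near one occur exactly as often as values near zero.

set.seed(123)
pvals <- replicate(1e4, t.test(rnorm(20))$p.value)  # H0 true: mean is zero
hist(pvals, freq = FALSE, breaks = 20,
     main = "p-values under the null", xlab = "p")
abline(h = 1, lwd = 2)   # the U(0,1) density they should follow
mean(pvals > 0.95)       # about 5%, just like mean(pvals < 0.05)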

(Given the unreasonable length of the above, I fear I will continue my snail-paced reading in yet another post!)

the theory that would not die…

Posted in Books, Statistics, University life on September 19, 2011 by xi'an

A few days ago, I had lunch with Sharon McGrayne in a Parisian café and we had a wonderful chat about the people she had met during the preparation of her book, the theory that would not die. Among others, she mentioned the considerable support provided by Dennis Lindley, Persi Diaconis, and Bernard Bru. She also told me about a few unsavoury characters who simply refused to talk to her about the struggles and rise of Bayesian statistics. Then, once I had biked home, her book had at last arrived in my mailbox! How timely! (Actually, getting the book earlier would have been better, as I would have been able to ask more specific questions. But it seems the publisher, Yale University Press, had not forecast the phenomenal success of the book and thus failed to scale the reprints accordingly!)

Here is thus my enthusiastic (and obviously biased) reaction to the theory that would not die. It tells the story and the stories of Bayesian statistics and of Bayesians in a most genial and entertaining manner. There may be some who will object to such a personification of science, which should be (much) more than the sum of the characters who contributed to it. However, I will defend the perspective that (Bayesian) statistical science is as much philosophy as it is mathematics and computer science, and thus that the components that led to its current state were contributed by individuals, for whom the path to those components mattered. While the book inevitably starts with the (patchy) story of Thomas Bayes’s life, incl. his time in Edinburgh, and a nice non-mathematical description of his ball experiment, the next chapter is about “the man who did everything”, …, yes indeed, Pierre-Simon (de) Laplace himself! (An additional nice touch is the use of lower case everywhere, instead of an inflation of upper-case letters!) How Laplace attacked the issue of astronomical errors is brilliantly depicted, rooting the man within statistics and explaining why he would soon move to the “probability of causes”. And how he would rediscover and generalise Bayes’ theorem. That his (rather unpleasant!) thirst for honours and official positions would later cast disrepute on his scientific worth is difficult to fathom, esp. when coming from knowledgeable statisticians like Florence Nightingale David. The next chapter is about the dark ages of [not yet] Bayesian statistics, and I particularly liked the links with the French army: discovering there that the great Henri Poincaré testified at Dreyfus’ trial using a Bayesian argument, that Bertillon had completely missed the probabilistic point, and that the military judges were then all aware of Bayes’ theorem, thanks to Bertrand’s probability book being used at École Polytechnique! (The last point actually was less of a surprise, given that I had collected some documents about the involvement of late 19th/early 20th century artillery officers, Edmond Lhoste and Maurice Dumas, in the development of Bayesian techniques, in connection with Lyle Broemeling’s Biometrika study.) The description of the fights between Fisher and Bayesians and non-Bayesians alike is, as always, both entertaining and sad. Sad also is the fact that Jeffreys’ masterpiece got so little recognition at the time. (While I knew about Fisher’s unreasonable stand on smoking, going as far as defending the assumption that “lung cancer might cause smoking”(!), the Bayesian analysis of Jerome Cornfield was unknown to me. And quite fascinating.) The figure of Fisher actually permeates the whole book, as a negative bullying figure preventing further developments of early Bayesian statistics, but also as an ambivalent anti-Bayesian who eventually tried to create his own brand of Bayesian statistics in the form of fiducial statistics…

“…and then there was the ghastly de Gaulle.” D. Lindley

The following part of the theory that would not die is about Bayes’ contributions to the war (WWII), at least from the Allied side. Again, I knew most of the facts about Alan Turing and Bletchley Park’s Enigma, but the story is well told and, as on previous occasions, I cannot but be moved by the waste of such a superb intellect, thanks to the stupidity of governments. The role of Albert Madansky in the assessment of the [lack of] safety of nuclear weapons is also well described, stressing the inevitability of a Bayesian assessment of a one-time event that had [thankfully] not yet happened. The above quote from Dennis Lindley is the conclusion of his argument on why Bayesian statistics were not called Laplacean; I would think instead that the French post-war attraction for abstract statistics in the wake of Bourbaki did more against this recognition than de Gaulle’s isolationism and ghastliness. The involvement of John Tukey in military research was also a novelty for me, but not so much as his use of Bayesian [small area] methods for NBC election-night predictions. (They could not hire José nor Andrew at the time.) The conclusion of Chapter 14 on why Tukey felt the need to distance himself from Bayesianism is quite compelling. Maybe paradoxically, I ended up appreciating Chapter 15 even more for the part about the search for a missing H-bomb near Palomares, Spain, as it exposes the pluses a Bayesian analysis would have brought.

“There are many classes of problems where Bayesian analyses are reasonable, mainly classes with which I have little acquaintance.” J. Tukey

When approaching recent times and contemporaries, Sharon McGrayne gives a very detailed coverage of the coming of age of Bayesians like Jimmy Savage and Dennis Lindley, as well as the impact of Stein’s paradox (a personal epiphany!), along with the important influence of Howard Raiffa and Robert Schlaifer, both on business schools and on the modelling of prior beliefs [via conjugate priors]. I did not know anything about their scientific careers, but Applied Statistical Decision Theory is a beautiful book that prefigured both DeGroot‘s and Berger‘s. (As an aside, I was amused by Raiffa using Bayesian techniques for horse betting based on race bettors, as I had vaguely played with the idea during my spare if compulsory time in the French Navy!) Similarly, while I had read detailed scientific accounts of Frederick Mosteller’s and David Wallace’s superb Federalist Papers study, they were only names to me. Chapter 12 mostly remedied this lack of mine.

“We are just starting.” P. Diaconis

The final part, entitled Eureka!, is about the computer revolution we witnessed in the 1980s, culminating with the (re)discovery of MCMC methods we covered in our own “history”. Because it contains stories that are closer and closer to the present day, it inevitably crumbles into shorter and shorter accounts. However, the theory that would not die conveys the essential message that Bayes’ rule has become operational, with its own computer language and objects, like graphical models and Bayesian networks, that can tackle huge amounts of data and real-time constraints, and that is used by companies like Microsoft and Google. The final pages mention neurological experiments on how the brain operates in a Bayesian-like way (a direction much followed by the neurosciences, as illustrated by Peggy Series’ talk at Bayes-250).

In conclusion, I highly enjoyed reading through the theory that would not die, and I am sure most of my Bayesian colleagues will as well. Being Bayesians, they will compare the contents with their subjective priors about Bayesian history, but will in the end update those profitably. (The most obvious missing part is, in my opinion, the absence of E. T. Jaynes and the MaxEnt community, which would deserve a chapter of its own.) Maybe ISBA could consider supporting a paperback or electronic copy to distribute to all its members! As an insider, I have little idea of how the book would be perceived by the layman: it does not contain any formula apart from [the discrete] Bayes’ rule at some point, so everyone can read through it. The current success of the theory that would not die shows that it reaches much further than academic circles. It may be that the general public does not necessarily grasp the ultimate difference between frequentists and Bayesians, or between Fisherians and Neyman-Pearsonians. However, the theory that would not die goes over all the elements that explain these differences. In particular, the parts about single events are quite illuminating on the specificities of the Bayesian approach. I will certainly [more than] recommend it to all of my graduate students (and buy the French version for my mother once it is translated, so that she finally understands why I once gave a talk entitled “Don’t tell my mom I am Bayesian” at ENSAE…!) If there is any doubt left from the above, I obviously recommend the book to all Og’s readers!
