## the anti-Bayesian moment and its passing

**T**oday, our reply to the discussion of our American Statistician paper “*Not only defended but also applied*” by Stephen Fienberg, Wes Johnson, Deborah Mayo, and Stephen Stigler was posted on arXiv. It is kind of funny that this happens the day I am visiting the Iowa State University Statistics Department, a department that was formerly a Fisherian and thus anti-Bayesian stronghold. (Not any longer, to be sure! I was also surprised to discover that, before the creation of the department, Henry Wallace came to lecture on machine calculations for statistical methods…in 1924!)

**T**he reply to the discussion was rewritten and much broadened by Andrew after I drafted a more classical point-by-point reply to our four discussants, much to its improvement. For one thing, it reads well on its own, as the discussions are not yet available on-line. For another, it gives a broader impact to the discussion, which suits well the readership of *The American Statistician*. (Some of my draft reply is recycled in this post.)

**H**ere are two interesting quotes made by Stephen Stigler from a 1928 book by Thornton C. Fry called *Probability and its Engineering Uses*:

“We would be glad, if we could, to get a measure of our certainty in [scientific problems]; but there is nothing in the Theory of Probability to aid us except Bayes’ Theorem, which we can seldom use for the purpose because … we cannot measure the unconditional [i.e. prior] probabilities. … To such problems, then, no exact answer can be expected from the use of Bayes’ Theorem; not because of any logical uncertainty as to the theorem itself, but because we do not possess the data necessary for its use. On the other hand it is often of service in dealing with problems to which qualitative answers are acceptable [i.e. answers giving at least an order of magnitude of the uncertainty].”

and

“Due to numerous inexact statements which have been made of it, Bayes’ Theorem has been the subject of much adverse criticism, and some authorities have even gone so far as to reject it entirely. At present, however, this criticism seems to be dying out, the commonly accepted view being much the same as that stated above: that it is just as sound logically as any other part of the Theory of Probability, and may be trusted to give reliable results when we can get a grip on it. The trouble is that we so seldom can.”

**I**ndeed, I find it interesting to witness the intellectual and not so unusual position of Fry about prior distributions, a position that survived to this day, namely that the prior distribution cannot be determined exactly nor measured to a satisfying precision. This attitude towards the prior (in which we have no more reason to believe than in the randomness of the parameter) is often found in applied fields where scientists spend an unusual amount of space and time searching for “the” prior or justifying their use of an “approximation”, seemingly convinced of the existence of a *deus ex machina* that would have chosen a prior distribution once and forever. That it primarily occurs in the physical sciences may reflect the attitude of those sciences towards imprecise or subjective modelling, even though models themselves are generally seen as temporary approximations to the “truth”. This kind of mental block in the perception of Bayesian analysis certainly did as much harm as the admittedly embarrassing complexity of setting and testing a prior distribution, at least in the days before cheap computing became available.

“…contemporary Bayesianism is in need of new foundations; whether they are to be found in non-Bayesian testing, or elsewhere.” D. Mayo

**A**s in most of her writings, Deborah Mayo gives the impression in her thorough and wide-ranging discussion on “the foundational defence of Bayesianism” that most of Statistics is under threat of being overcome (or “inundated”) by the Bayesian perspective. (I am quite grateful for the dedication to George’s memory of her comments.) We can reassure her that this is definitely not the case, even in applied fields. For instance, the fact that younger statisticians may be unaware of the passionate battles of the past century can be interpreted in a completely different light, namely as a lack of adverse reaction to the use of Bayesian or non-Bayesian perspectives. Similarly, it does not seem true (as perceived from our daily practice) that “few readers are unaware of the (…) criticisms” (p.3) launched at Neyman-Pearson (testing) statistics, given that the large majority of tests are indeed conducted following their principles.

“Bayesian testing seems to be in a state of flux. The authors’ invitation to test Bayesian models, including priors, is welcome; but the results of testing are clearly going to depend on explicating the intended interpretation of whatever is being tested.” D. Mayo

**A**s a minor comment, “employ[ing] frequentist error probabilities to appraise and ensure the probative capacity, or severity, of tests, being sensitive to the actual data and claim to be inferred” does not seem (to me) to support the Neyman-Pearson methodology any more than the recourse to Bayesian predictives. I (we?) am certainly more agnostic than fundamentalist in this debate on foundations, as I think that, like most statisticians, philosophers should acknowledge the elusive nature of “truth” in statistical problems and accept the possibility of multiverses of answers, in contrast with other sciences. Having multiple responses when faced with a given dataset is neither a sin nor an epistemological paradox, because the map is not the territory and the data is not the model.

**A**s a final unifying theme for this discussion and reply, I stress my earlier point that there is no gold standard for the prior distribution in a given problem and hence that “the interpretation and justification for the prior probability distribution” is not meaningful to the extent debated by Deborah Mayo: the prior is both a probabilistic object, standard from this perspective, and a subjective construct, translating qualitative personal assessments into a probability distribution. The extension of this dual nature to the so-called “conventional” priors (a very good semantic finding!) is to set a reference framework (hence the “reference priors” of Bernardo and Berger) against which to test the impact of one’s prior choices and the variability of the resulting inference. I do not find any particular difficulty in the variety of choices for those conventional priors as they simply set a standard against which to gauge our answers. *In fine*, assuming enough computing power is available, we can always run (frequentist, nothing else!) simulations to check for the long-term or average properties of different Bayesian procedures, provided those properties are truly the reason for the statistical analysis and not a textbook choice of an artificial loss function.
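To make the last point concrete, here is a minimal sketch of such a simulation check. It is not taken from the paper or the reply: the toy model (normal observations with unit variance, a N(0, τ²) prior on the mean), the function names `credible_interval` and `coverage`, and all numerical settings are assumptions chosen purely for illustration. It estimates how often a 95% posterior interval covers a fixed true parameter under repeated sampling, for several prior scales.

```python
import random
from statistics import NormalDist


def credible_interval(xbar, n, prior_sd, level=0.95):
    """Equal-tailed posterior interval for theta.

    Assumed toy model: x_1..x_n ~ N(theta, 1), summarised by their mean xbar,
    with a conjugate prior theta ~ N(0, prior_sd^2).
    """
    precision = n + 1.0 / prior_sd ** 2      # posterior precision
    post_mean = n * xbar / precision         # prior mean is 0, so pure shrinkage
    post_sd = precision ** -0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return post_mean - z * post_sd, post_mean + z * post_sd


def coverage(theta, n, prior_sd, reps=20000, seed=1):
    """Monte Carlo estimate of the frequentist coverage at a fixed true theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # The sample mean is sufficient here, so simulate it directly.
        xbar = rng.gauss(theta, n ** -0.5)
        lo, hi = credible_interval(xbar, n, prior_sd)
        hits += lo <= theta <= hi
    return hits / reps


if __name__ == "__main__":
    n = 10
    for prior_sd in (0.5, 1.0, 10.0):
        for theta in (0.0, 2.0):
            c = coverage(theta, n, prior_sd)
            print(f"prior sd {prior_sd:>4}, true theta {theta}: coverage {c:.3f}")
```

In this toy setting the diffuse prior behaves almost like a reference: its intervals cover at close to the nominal rate whatever the true value, while a tight prior centred away from the truth covers far less often. That is one way of gauging an informative choice against a conventional one, and the comparison is only as relevant as the long-run property being checked.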

March 10, 2013 at 5:21 am

You assume that I am interested in long-term average properties of procedures, even though I have so often argued that they are at most necessary (as consequences of good procedures), but scarcely sufficient for a severity assessment. The error statistical account I have developed is a statistical philosophy. It is not one to be found in Neyman and Pearson, jointly or separately, except in occasional glimpses here and there (unfortunately). It is certainly not about well-defined accept-reject rules. If N-P had only been clearer, and Fisher better behaved, we would not have had decades of wrangling. However, I have argued, the error statistical philosophy explicates, and directs the interpretation of, frequentist sampling theory methods in scientific, as opposed to behavioral, contexts. It is not a complete philosophy…but I think Gelmanian Bayesians could find in it a source of “standard setting”.

You say “the prior is both a probabilistic object, standard from this perspective, and a subjective construct, translating qualitative personal assessments into a probability distribution. The extension of this dual nature to the so-called “conventional” priors (a very good semantic finding!) is to set a reference … against which to test the impact of one’s prior choices and the variability of the resulting inference. …they simply set a standard against which to gauge our answers.”

I think there are standards for even an approximate meaning of “standard-setting” in science, and I still do not see how an object whose meaning and rationale may fluctuate wildly, even in a given example, can serve as a standard or reference. For what?

Perhaps the idea is that one can gauge how different priors change the posteriors, because, after all, the likelihood is well defined. That is why the prior and not the likelihood is the camel. But it isn’t obvious why I should want the camel. (camel/gnat references in the paper and response)