You assume that I am interested in long-term average properties of procedures, even though I have so often argued that they are at most necessary (as consequences of good procedures), but scarcely sufficient for a severity assessment. The error statistical account I have developed is a statistical philosophy. It is not one to be found in Neyman and Pearson, jointly or separately, except in occasional glimpses here and there (unfortunately). It is certainly not about well-defined accept-reject rules. If N-P had only been clearer, and Fisher better behaved, we would not have had decades of wrangling. However, I have argued, the error statistical philosophy explicates, and directs the interpretation of, frequentist sampling theory methods in scientific, as opposed to behavioural, contexts. It is not a complete philosophy…but I think Gelmanian Bayesians could find in it a source of “standard setting”.
You say “the prior is both a probabilistic object, standard from this perspective, and a subjective construct, translating qualitative personal assessments into a probability distribution. The extension of this dual nature to the so-called “conventional” priors (a very good semantic finding!) is to set a reference … against which to test the impact of one’s prior choices and the variability of the resulting inference. …they simply set a standard against which to gauge our answers.”
I think there are standards for even an approximate meaning of “standard-setting” in science, and I still do not see how an object whose meaning and rationale may fluctuate wildly, even in a given example, can serve as a standard or reference. For what?
Perhaps the idea is that one can gauge how different priors change the posteriors because, after all, the likelihood is well-defined. That is why the prior, and not the likelihood, is the camel. But it is not obvious why I should want the camel in the first place. (The camel/gnat references are found in the paper and in the response.)
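To make the "gauging" concrete, here is a quick numerical sketch of my own (not from the exchange), using the simplest conjugate setting: the binomial likelihood is fixed, so one can vary the Beta prior and watch how far the posterior moves.

```python
# A minimal sketch (my own illustration): with a Beta(a, b) prior and
# x successes in n binomial trials, the posterior is Beta(a + x, b + n - x),
# so prior sensitivity reduces to comparing a few posterior summaries.

def posterior_mean(a, b, x, n):
    """Posterior mean of the success probability under a Beta(a, b) prior."""
    return (a + x) / (a + b + n)

x, n = 7, 10  # observed data: 7 successes in 10 trials
priors = {
    "uniform Beta(1,1)":     (1.0, 1.0),
    "Jeffreys Beta(.5,.5)":  (0.5, 0.5),
    "sceptical Beta(10,10)": (10.0, 10.0),
}
for name, (a, b) in priors.items():
    print(f"{name:>22s}: posterior mean = {posterior_mean(a, b, x, n):.3f}")
```

The likelihood being the same in every row, any disagreement among the rows is entirely due to the prior, which is the only sense of "reference" I can extract from the argument.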
Another focussed spam in the mail:
Dear Dr. Christian P. Robert,
How are you?
I read your interesting article of “Error and inference: an outsider stand on a frequentist philosophy“, and I know you are an active professional in this field.
Now, I am writing you to call for new papers, on behalf of Review of Economics & Finance, which is an English quarterly journal in Canada.
This journal is currently indexed by EconLit of American Economic Association (AEA), EBSCO, RePEc, National Bibliography of Canada, Library and Archives Canada, DOAJ, Ulrich, and so on.
The publication fee is CAD$450, if your paper is qualified for publication after refereeing. The submission fee of $50 is NOT applied to you by March 6th, 2013.
Thank you for your consideration. Have a rewarding month!
At least they are quite honest about the cost of publishing there. But they should check which papers they pick, rather than rely on a robot that grabs a paper with no connection whatsoever to economics or finance… (And why on Earth the line about the rewarding month?!)
As in previous years, let me warn unwary readers that the links to Amazon.com and Amazon.fr found on this blog may actually earn me a monetary gain [from 4% to 7%] if a purchase is made within 24 hours of entering Amazon through one of these links, thanks to the “Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to amazon.com/fr“. As with last year, most of the items purchased through the links and contributing to my bookoholic addiction are rather unrelated to the purpose of the ‘Og but, as already mentioned, anything can happen within 24 hours! Here are the weirdest ones:
- Toy Vault 12″ Cthulhu Plush Toy and Pokemon Pillow
- Stewart’s Pro-Treat 21 ounce Tub Freeze Dried Dog Treats, Beef Liver
- ten copies of Hp Compaq Evo N610v Ac Adapter 0mAh
- Philosophy Purity Made Simple One-Step Facial Cleanser
- Nile Spice Lentil Soup
- ten copies of Donald in Mathmagic Land
plus of course the books I actually reviewed over the past months, positively or negatively… Like seven copies of Error and Inference. And a dozen copies of R for Dummies. And many other books on Bayesian statistics and R programming. Thanks!
This CRC Press book was sent to me for review in CHANCE: Paradoxes in Scientific Inference is written by Mark Chang, vice-president of AMAG Pharmaceuticals. The topic of scientific paradoxes is one of my primary interests and I have learned a lot by looking at the Lindley-Jeffreys and Savage-Dickey paradoxes. However, I did not find a renewed sense of excitement when reading the book. The very first (and maybe the best!) paradox with Paradoxes in Scientific Inference is that it is a book from the future! Indeed, its copyright year is 2013 (!), although I got it a few months ago. (Not to mention the cover mimicking Escher’s “paradoxical” pictures with dice, a sculpture due to Shigeo Fukuda that is apparently not credited in the book. As I do not want to get into another dice-cover polemic, I will abstain from further comments!)
Now, getting into a deeper level of criticism (!), I find the book very uneven and overall quite disappointing, even shaky in its statistical foundations, especially given my initial level of excitement about the topic!
First, there is a tendency to turn everything into a paradox: obviously, when writing a book about paradoxes, everything looks like a paradox! This means bringing into the picture every paradox known to man and then some, i.e., things that are either un-paradoxical (e.g., Gödel’s incompleteness result) or uninteresting in a scientific book (e.g., the birthday paradox, which may be surprising but is far from a paradox!). Fermat’s theorem is also quoted as a paradox, even though nothing in the text indicates in which sense it is one. (Or is it because it is simple to state but hard to prove?!) Similarly, Brownian motion is considered a paradox, as “reconcil[ing] the paradox between two of the greatest theories of physics (…): thermodynamics and the kinetic theory of gases” (p.51). Further, the author considers the MLE being biased to be a paradox (p.117), while omitting the much more substantial “paradox” of the non-existence of unbiased estimators of most parameters, which simply means that unbiasedness is irrelevant. Or the other, even more puzzling, “paradox” that the secondary MLE derived from the likelihood associated with the distribution of a primary MLE may differ from the primary. (My favourite!)
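To show how un-paradoxical the bias of the MLE is, here is a quick simulation of my own (not from the book), using the textbook case: the MLE of a normal variance divides by n rather than n−1, so its expectation is (n−1)/n times the true variance.

```python
# A minimal sketch (my own, not the book's): Monte Carlo check that the
# normal-variance MLE is biased downwards by the factor (n-1)/n.
import random

random.seed(1)

def mle_var(xs):
    """MLE of the variance for a normal sample: divide by n, not n-1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

n, reps = 5, 100_000
estimates = [mle_var([random.gauss(0.0, 1.0) for _ in range(n)])
             for _ in range(reps)]
mean_est = sum(estimates) / reps
print(f"E[MLE variance] ≈ {mean_est:.3f}; true sigma^2 = 1, (n-1)/n = {(n - 1) / n}")
```

With n = 5, the average of the MLE settles near 0.8 rather than 1: a known and fully quantified property, not a paradox.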
“When the null hypothesis is rejected, the p-value is the probability of the type I error.” Paradoxes in Scientific Inference (p.105)
“The p-value is the conditional probability given H0.” Paradoxes in Scientific Inference (p.106)
Second, the depth of the statistical analysis in the book is often lacking. For instance, Simpson’s paradox is not analysed from a statistical perspective, only reported as a fact. Sticking to statistics, take the discussion of Lindley’s paradox. The author seems to think that the problem lies with the different conclusions produced by the frequentist, likelihood, and Bayesian analyses (p.122). This is completely wrong: Lindley’s (or Lindley-Jeffreys‘s) paradox is about the lack of significance of Bayes factors based on improper priors. Similarly, when the likelihood ratio test is introduced, the reference threshold is given as equal to 1, and no mention is later made of compensating for different degrees of freedom or guarding against over-fitting. The discussion of p-values is equally garbled, witness the above quotes, which (a) condition upon the rejection and (b) ignore that the p-value depends on a realised random variable.
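For readers wondering what the paradox actually is, here is a small numerical sketch of my own (not taken from the book), in one standard rendering of the Jeffreys-Lindley setting: hold the p-value fixed at 0.05 (z = 1.96) while the sample size grows, and the Bayes factor nonetheless swings ever more strongly in favour of the point null.

```python
# A sketch (my own illustration, standard normal-mean setting assumed):
# testing H0: theta = 0 against theta ~ N(0, tau2), with xbar ~ N(theta, sigma2/n).
# The Bayes factor B01 compares the null density of xbar with its marginal
# density under the alternative.
import math

def bf01(z, n, sigma2=1.0, tau2=1.0):
    """Bayes factor B01 for H0: theta = 0, given xbar = z * sqrt(sigma2 / n)."""
    s2 = sigma2 / n                    # sampling variance of xbar
    xbar2 = z * z * s2                 # squared observed mean, fixed p-value
    # ratio of N(xbar; 0, s2) to N(xbar; 0, tau2 + s2)
    return math.sqrt((tau2 + s2) / s2) * \
        math.exp(-0.5 * xbar2 * (1.0 / s2 - 1.0 / (tau2 + s2)))

for n in (10, 100, 10_000, 1_000_000):
    print(f"n = {n:>9d}:  B01 = {bf01(1.96, n):10.1f}")
```

The frequentist verdict ("reject at 5%") is constant by construction, while B01 grows like the square root of n: that divergence, and its dependence on the spread tau2 of the (possibly improper) prior, is the actual content of the paradox.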
A few days ago, while in Roma, I got the good news that my review of Error and Inference had been accepted by Theory and Decision. Great! Then today I got a second email asking me to connect to a Springer site entitled “Services for Authors” with the following message:
Dear Christian Robert!
Thank you for publishing your paper in one of Springer’s journals.
Article Title: Error and Inference: an outsider stand on a frequentist philosophy
Journal: Theory and Decision
Make your Choice
In order to facilitate the production and publication of your article we need further information from you relating to:
- Please indicate if you would like to publish your article as open access with Springer’s Open Choice option (by paying a publication fee or as a result of an agreement between your funder/institution and Springer). I acknowledge that publishing my article with open access costs € 2000 / US $3000 and that this choice is final and cannot be cancelled later.
- Please transfer the copyright, if you do not publish your articles as open access.
- Please indicate if you would like to have your figures printed in color.
- Please indicate if you would like to order offprints. You have the opportunity to order a poster of your article against a fee of €50 per poster. The poster features the cover page of the issue your article is published in together with the article title and the names of all contributing authors.
Now I feel rather uncomfortable with the above options, since I do not see why I should pay the huge amount of €2000 to make my work/review available once again, when it is already freely accessible on arXiv. And it is only a book review, for Gutenberg’s sake! Last year, we made our PNAS paper available as Open Access, but that was (a) cheaper and (b) an important result, or so we thought! The nice part of the message was that, for once, I did not have to sign and send back a paper copy of the copyright agreement, as with so many journals, as if we were still in the 19th Century… (I do not see the point of the poster, though!)
As I was writing my next column for CHANCE, I decided to include a methodology box about “using the data twice”. Here is the draft. (The second part is reproduced verbatim from an earlier post on Error and Inference.)
Several aspects of the books covered in this CHANCE review [i.e., Bayesian ideas and data analysis, and Bayesian modeling using WinBUGS] face the problem of “using the data twice”. What does that mean? Nothing really precise, actually. The accusation of “using the data twice” found in the Bayesian literature can be thrown at most procedures exploiting the Bayesian machinery without actually being Bayesian, i.e., procedures that cannot be derived from the posterior distribution. For instance, the integrated likelihood approach in Murray Aitkin’s Statistical Inference avoids the difficulties related to improper priors πi by first using the data x to construct (proper) posteriors πi(θi|x) and then using the data a second time in a Bayes factor

$$B_{12} = \frac{\int L_1(\theta_1|x)\,\pi_1(\theta_1|x)\,\text{d}\theta_1}{\int L_2(\theta_2|x)\,\pi_2(\theta_2|x)\,\text{d}\theta_2}$$
as if the posteriors were priors. This obviously solves the impropriety difficulty (see, e.g., The Bayesian Choice), but it creates a statistical procedure outside the Bayesian domain, hence requiring a separate validation since the usual properties of Bayesian procedures do not apply. Similarly, the whole empirical Bayes approach falls under this category, even though some empirical Bayes procedures are asymptotically convergent. The pseudo-marginal likelihood of Geisser and Eddy (1979), used in Bayesian ideas and data analysis, is defined by

$$\hat{m}(x) = \prod_{i=1}^n f(x_i|x_{-i}) = \prod_{i=1}^n \int f(x_i|\theta)\,\pi(\theta|x_{-i})\,\text{d}\theta$$
through the marginal posterior likelihoods. While it also allows for improper priors, it does use the same data in each term of the product and, again, it is not a Bayesian procedure.
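To see how each term reuses the remaining data, here is a small sketch of my own (a conjugate normal model is assumed purely for simplicity, so every leave-one-out predictive is available in closed form):

```python
# A sketch (my own illustration): Geisser-Eddy pseudo-marginal likelihood
# prod_i f(x_i | x_{-i}) for the model x ~ N(theta, 1), theta ~ N(0, tau2).
# Each factor is the predictive density of x_i under the posterior built
# from the other n-1 observations, so every x_i appears in n terms in total.
import math
import random

random.seed(2)

def log_pseudo_marginal(xs, tau2=1.0):
    """Log of prod_i f(x_i | x_{-i}) under the conjugate normal model."""
    s, n = sum(xs), len(xs)
    log_pml = 0.0
    for xi in xs:
        m = n - 1
        s_i = s - xi                            # sum of x_{-i}
        post_var = tau2 / (m * tau2 + 1.0)      # Var(theta | x_{-i})
        post_mean = post_var * s_i              # E(theta | x_{-i})
        pred_var = 1.0 + post_var               # predictive variance of x_i
        log_pml += -0.5 * math.log(2 * math.pi * pred_var) \
                   - 0.5 * (xi - post_mean) ** 2 / pred_var
    return log_pml

sample = [random.gauss(0.5, 1.0) for _ in range(20)]
print("log pseudo-marginal likelihood:", log_pseudo_marginal(sample))
```

The prior can be made as diffuse as one likes (large tau2) without the product degenerating, which is exactly why the pseudo-marginal "allows for improper priors"; but the repeated appearance of each observation across the factors is the multiple use of the data discussed above.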
Once again, from first principles, a Bayesian approach should use the data only once, namely when constructing the posterior distribution on every unknown component of the model(s). Based on this all-encompassing posterior, all inferential aspects should be the consequences of a sequence of decision-theoretic steps selecting optimal procedures. This is the ideal setting while, in practice, relying on a sequence of posterior distributions is often necessary, each posterior being a consequence of earlier decisions, which makes it the result of a multiple (improper) use of the data… For instance, the process of Bayesian variable selection is in principle clean from the sin of “using the data twice”: one simply computes the posterior probability of each of the variable subsets and that is all. However, in a case involving many (many) variables, there are two difficulties: one is about building the prior distributions for all possible models, a task that needs to be automated to some extent; another is about exploring the set of potential models. First, resorting to projection priors as in the intrinsic solution of Pérez and Berger (2002, Biometrika, a very valuable article!), while unavoidable and a “least worst” solution, means switching priors/posteriors based on earlier acceptances/rejections, i.e., on the data. Second, the path of models truly explored by a computational algorithm [which will be a minuscule subset of the set of all models] will depend on the models rejected so far, whether one relies on a stepwise exploration or on a random walk MCMC algorithm. Although this is not crystal clear (there is actually plenty of room for supporting the opposite view!), it could be argued that the data is thus used several times in this process…
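As a toy rendering of the "clean" version of variable selection (my own sketch: BIC stands in for a proper marginal likelihood, and the simulated data are made up), one can enumerate every subset of predictors once, score each, and convert the scores into posterior model probabilities in a single pass over the data:

```python
# A sketch (my own illustration): exhaustive Bayesian-style variable selection
# on p = 3 predictors, scoring each of the 2^p subsets once with a BIC-based
# approximation to the marginal likelihood, under a uniform prior over models.
import itertools
import math
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + rng.normal(size=n)          # only variable 0 matters

scores = {}
for k in range(p + 1):
    for subset in itertools.combinations(range(p), k):
        Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
        beta = np.linalg.lstsq(Z, y, rcond=None)[0]
        resid = y - Z @ beta
        sigma2 = resid @ resid / n
        loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1.0)
        scores[subset] = loglik - 0.5 * Z.shape[1] * math.log(n)   # BIC / 2

shift = max(scores.values())                     # for numerical stability
weights = {s: math.exp(v - shift) for s, v in scores.items()}
total = sum(weights.values())
for s, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"model {s}: posterior prob ≈ {w / total:.3f}")
```

Every subset is scored exactly once, so no posterior is recycled as a prior and no exploration path depends on earlier rejections; the two difficulties in the text only bite when, with many variables, this exhaustive enumeration becomes infeasible.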
In connection with my series of posts on the book Error and Inference, and my recent collation of those into an arXiv document, Deborah Mayo has started a series of informal seminars at the LSE on the philosophy of errors in statistics and the likelihood principle, and has also posted a long comment on my argument about only using wrong models. (The title is inspired by the Rolling Stones’ “You can’t always get what you want“, very cool!) The discussion about the need (or not) to take into account all possible models (which is the meaning of the “catchall hypothesis” I had missed while reading the book) shows my point was not clear. I obviously do not claim in the review that all possible models should be accounted for at once; this was, on the contrary, my understanding of Mayo’s criticism of the Bayesian approach (I thought the following sentence was clear enough: “According to Mayo, this alternative hypothesis should ‘include all possible rivals, including those not even thought of’ (p.37)”)! So I see the Bayesian approach as a way to put on the table a collection of reasonable (if all wrong) models and to give those models a posterior probability, with the purpose that improbable ones get eliminated. Therefore, I am in agreement with most of the comments in the post, especially because this has little to do with Bayesian versus frequentist testing! Even rejecting the less likely models from a collection seems compatible with a Bayesian approach, and model averaging is not always an appropriate solution, depending on the loss function!