## two, three, five, …, a million standard deviations!

I first spotted Peter Coles’ great post title “Frequentism: the art of probably answering the wrong question” (a very sensible piece, by the way, which mentions a physicist’s view on the Jeffreys-Lindley paradox that I had intended to comment on) and from there the following site jumping occurred:

“I confess that in my early in my career as a physicist I was rather cynical about sophisticated statistical tools, being of the opinion that “if any of this makes a difference, just get more data”. That is, if you do enough experiments, the confidence level will be so high that the exact statistical treatment you use to evaluate it is irrelevant.” John Butterworth, Sept. 15, 2014

**A**fter Val Johnson’s suggestion to move the significance level from .05 down to .005, hence roughly from 2σ up to 3σ, John Butterworth, a physicist whose book Smashing Physics just came out, discusses in The Guardian the practice of using 5σ in Physics. His discussion is actually prompted by Louis Lyons’ arXival of a recent talk with the following points (discussed below):

- Should we insist on the 5 sigma criterion for discovery claims?
- The probability of A, given B, is not the same as the probability of B, given A.
- The meaning of p-values.
- What is Wilks Theorem and when does it not apply?
- How should we deal with the ‘Look Elsewhere Effect’?
- Dealing with systematics such as background parametrisation.
- Coverage: What is it and does my method have the correct coverage?
- The use of p0 versus p1 plots.

**B**utterworth’s conclusion is worth reproducing:

“…there’s a need to be clear-eyed about the limitations and advantages of the statistical treatment, wonder what is the “elsewhere” you are looking at, and accept that your level of certainty may never feasibly be 5σ. In fact, if the claims being made aren’t extraordinary, a one-in-2million chance of a mistake may indeed be overkill, as well being unobtainable. And you have to factor in the consequences of acting, or failing to act, based on the best evidence available – evidence that should include a good statistical treatment of the data.” John Butterworth, Sept. 15, 2014

I particularly like the part about the “consequences of acting”, which I interpret as incorporating a loss function in the picture.
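As a sanity check on the σ-to-probability conversions quoted above, here is a minimal Python sketch, assuming a standard Normal reference distribution. The function names `sigma_to_p`, `p_to_sigma` and `global_p` are mine, not taken from any of the linked pieces, and `global_p` treats the “elsewhere” as a number of independent looks, which is a crude assumption rather than the way the look-elsewhere effect is actually handled in Physics analyses.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, standard deviation 1


def sigma_to_p(z, two_sided=True):
    """Tail probability of a deviation of z standard deviations."""
    # erfc keeps precision in the far tail, where 1 - cdf(z) would lose digits
    one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))
    return 2.0 * one_sided if two_sided else one_sided


def p_to_sigma(p, two_sided=True):
    """Number of standard deviations matching a tail probability p."""
    tail = p / 2.0 if two_sided else p
    return STD_NORMAL.inv_cdf(1.0 - tail)


def global_p(local_p, n_looks):
    """Chance of at least one false alarm among n_looks independent looks.

    Assumption (illustrative only): the looks are independent and each has
    the same local p-value threshold.
    """
    # 1 - (1 - p)^n, computed stably for very small p
    return -math.expm1(n_looks * math.log1p(-local_p))


if __name__ == "__main__":
    for p in (0.05, 0.005):
        print(f"p = {p}: {p_to_sigma(p):.2f} sigma (two-sided)")
    for z in (2, 3, 5):
        print(f"{z} sigma: one-sided {sigma_to_p(z, False):.3g}, "
              f"two-sided {sigma_to_p(z):.3g}")
    local = sigma_to_p(5, two_sided=False)
    for n in (1, 100, 10_000):
        print(f"5 sigma local, {n} looks: global p = {global_p(local, n):.3g}")
```

Under these assumptions, .05 and .005 correspond to about 1.96σ and 2.81σ (two-sided), while 5σ gives a tail of about 2.9 × 10⁻⁷ one-sided (one in 3.5 million) or 5.7 × 10⁻⁷ two-sided (one in 1.7 million).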

**L**ouis’s paper-ised talk 1. [somewhat] argues in favour of the 5σ because 2σ and 3σ are not necessarily significant on larger datasets. I figure the same could be said of 5σ, no?! He also mentions (a) “systematics”, which I do not understand, even though this is not the first time I encounter the notion in Physics, and (b) “subconscious Bayes factors”, which means that the likelihood ratio [considered here as a transform of the p-value] is moderated by the ratio of the prior probabilities, even when people do not follow a Bayesian procedure. But this does not explain why a fixed deviation from the mean should be adopted. 2. and 3. The following two points are about the common confusion in the use of the p-value, found in most statistics textbooks, even though the defence of the p-value against the remark that it is wrong half the time (as in Val’s PNAS paper) misses the point. 4. *Wilks’ theorem* is a warning that the χ² approximation only operates under some assumptions. 5. *Looking elsewhere* is the translation of multiple testing or cherry-picking. 6. *Systematics* is explained here as a form of model misspecification. One suggestion is to use a Bayesian modelling of this misspecification, another is non-parametrics (why not both together?!). 7. *Coverage* is somewhat disjoint from the other points as it explains the [frequentist] meaning of the coverage of a confidence interval, which hence does not apply to the actual data. 8. *p0 versus p1* plots is a sketchy part referring to a recent proposal by the author. So in the end a rather anticlimactic coverage of standard remarks, surprisingly giving birth to a sequence of posts (incl. this one!)…
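The remark that a fixed number of σ’s carries less and less evidence on larger datasets can be illustrated in a textbook conjugate setting, which is my own choice and not taken from Louis’s talk: a sample mean x̄ ~ N(θ, σ²/n), with H0: θ = 0 tested against H1: θ ~ N(0, τ²). For an observed z = √n x̄/σ, the Bayes factor in favour of H0 is then B01 = √(1 + nτ²/σ²) exp{−(z²/2) nτ²/(σ² + nτ²)}, which grows like √n when z is held fixed. The sketch below, where the prior scale `tau`, the prior weight `prior_h0` and the function names are all illustrative assumptions, also shows how the prior odds moderate this ratio.

```python
import math


def bayes_factor_01(z, n, tau=1.0, sigma=1.0):
    """Bayes factor of H0: theta = 0 against H1: theta ~ N(0, tau^2).

    Assumes xbar ~ N(theta, sigma^2 / n) and z = sqrt(n) * xbar / sigma.
    """
    r = n * tau**2 / sigma**2  # prior variance relative to sampling variance
    return math.sqrt(1.0 + r) * math.exp(-0.5 * z**2 * r / (1.0 + r))


def posterior_prob_h0(z, n, prior_h0=0.5, tau=1.0, sigma=1.0):
    """Posterior probability of H0: posterior odds = B01 * prior odds."""
    prior_odds = prior_h0 / (1.0 - prior_h0)
    posterior_odds = bayes_factor_01(z, n, tau, sigma) * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hold the number of sigmas fixed and let the sample size grow
    for z in (2, 3, 5):
        for n in (10**2, 10**6, 10**12):
            print(f"z = {z}, n = {n:.0e}: "
                  f"B01 = {bayes_factor_01(z, n):.3g}, "
                  f"P(H0 | data) = {posterior_prob_h0(z, n):.3g}")
```

With τ = σ = 1, a 2σ deviation already favours the null for n around 10⁴, while a 5σ deviation requires n of the order of 10¹¹ before the Bayes factor exceeds one: the same phenomenon, only pushed much further out.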

September 29, 2014 at 5:03 pm

John Butterworth wrote a sequel in the Sunday edition of The Guardian, entitled Belief, Bias and Bayes, with the stinging “Statistics gets some people excited¹, especially, it seems, when Bayes is mentioned.”

September 27, 2014 at 5:03 am

For a link to some earlier discussions that give my take on the frequentist’s analysis of the Higgs statistics: http://errorstatistics.com/2014/07/10/higgs-discovery-two-years-on-2-higgs-analysis-and-statistical-flukes/

A link to Robert Cousins’ paper can also be found there.

I will study your links, as some of us are preparing for a symposium on the topic at the Philosophy of Science Association very soon.