## two, three, five, …, a million standard deviations!

Posted in Books, Statistics, University life on September 26, 2014 by xi'an

I first spotted Peter Coles’ great post title “Frequentism: the art of probably answering the wrong question” (a very sensible piece, by the way, which mentions a physicist’s view on the Jeffreys-Lindley paradox I had intended to comment on), and from there the following site-jumping occurred:

“I confess that in my early in my career as a physicist I was rather cynical about sophisticated statistical tools, being of the opinion that “if any of this makes a difference, just get more data”. That is, if you do enough experiments, the confidence level will be so high that the exact statistical treatment you use to evaluate it is irrelevant.” John Butterworth, Sept. 15, 2014

After Val Johnson’s suggestion to move the significance level from .05 down to .005, hence roughly from 2σ up to 3σ, John Butterworth, a physicist whose book Smashing Physics just came out, discusses in The Guardian the practice of using 5σ in Physics. His piece was actually prompted by Louis Lyons’ arXival of a recent talk covering the following points (discussed below):

1. Should we insist on the 5 sigma criterion for discovery claims?
2. The probability of A, given B, is not the same as the probability of B, given A.
3. The meaning of p-values.
4. What is Wilks Theorem and when does it not apply?
5. How should we deal with the ‘Look Elsewhere Effect’?
6. Dealing with systematics such as background parametrisation.
7. Coverage: What is it and does my method have the correct coverage?
8. The use of p0 versus p1 plots.
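
The correspondence quoted above between significance levels and numbers of standard deviations can be checked with a few lines of R. This is only a sketch under a standard normal reference distribution; the function names `sigma_to_p` and `p_to_sigma` are mine, not taken from any of the cited pieces, and the two-sided versus one-sided convention is an assumption that varies between fields (particle physics usually quotes one-sided tails).

```r
# Tail probability of a standard normal beyond z standard deviations.
# two_sided = TRUE counts both tails; FALSE counts the upper tail only.
sigma_to_p <- function(z, two_sided = TRUE) {
  p <- pnorm(-abs(z))
  if (two_sided) 2 * p else p
}

# Inverse map: number of standard deviations matching a tail probability p.
p_to_sigma <- function(p, two_sided = TRUE) {
  if (two_sided) p <- p / 2
  -qnorm(p)
}

p_to_sigma(c(0.05, 0.005))         # about 1.96 and 2.81
sigma_to_p(5, two_sided = FALSE)   # about 2.9e-07
```

With the two-sided convention, .05 and .005 thus sit slightly below 2σ and 3σ respectively, which is why the text says "roughly".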

## Le Monde puzzle [#849]

Posted in Books, Kids, R, Statistics on January 19, 2014 by xi'an

A straightforward Le Monde mathematical puzzle:

Find a pair (a,b) of integers such that a has an odd number d of digits larger than 2 and ab is written as 10^(d+1)+10a+1. Find the smallest possible values of a and of b.

I ran the following R code

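```r
d <- 3
# scan all d-digit integers a
for (a in 10^(d - 1):(10^d - 1)) {
  n <- 10^(d + 1) + 10 * a + 1  # value that the product a*b must equal
  if (n %% a == 0)              # a divides n, so b = n/a is an integer
    print(c(a, n / a))          # print the pair (a, b)
}
```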

which produced a=137 (and b=83) as the unique case. For d=4, I obtained a=9091 and b=21, for d=6, a=909091 and b=21, for d=7, a=5882353 and b=27, while for d=5, my code did not return any solution. While d=8 took too long to run, rewriting the condition as a(b-10)=10^(d+1)+1 shows that a must divide 10^(d+1)+1, that is, 10⁹+1 when d=8, and a prime factor decomposition (with the schoolmath R library) leads to

```
> for (d in 3:10) print(c(d,prime.factor(10^(d+1)+1)))
 3  73 137
 4  11 9091
 5 101 9901
 6  11 909091
 7  17 5882353
 8   7 11 13 19 52579
 9 101 3541 27961
10  11 11 23 4093 8779
```

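which gives a=52631579 and b=29 as a solution for d=8 (dividing 10⁹+1 by 11, 13, 77 or 91 instead of 19 gives other eight-digit values of a) and also explains why there is no solution for d=5: no divisor of 10⁶+1 = 101×9901 has exactly five digits!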
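
As a cross-check that does not rely on the schoolmath library, the divisor argument can be sketched in base R. The function name `solutions` is my own, and the sketch assumes the puzzle condition is equivalent to a(b-10)=10^(d+1)+1, so that every d-digit divisor a of 10^(d+1)+1 yields a pair with b=(10^(d+1)+1)/a+10. Trial division up to the square root is enough for the sizes considered here, and all values stay well within exact double-precision integers.

```r
# All pairs (a, b) with a having exactly d digits and
# a * b = 10^(d+1) + 10 * a + 1, i.e. a * (b - 10) = 10^(d+1) + 1.
solutions <- function(d) {
  n <- 10^(d + 1) + 1
  k <- seq_len(floor(sqrt(n)))
  small <- k[n %% k == 0]                     # divisors of n up to sqrt(n)
  divs <- sort(unique(c(small, n / small)))   # all divisors of n
  a <- divs[divs >= 10^(d - 1) & divs < 10^d] # keep those with d digits
  data.frame(a = a, b = n / a + 10)
}

solutions(3)   # a = 137, b = 83
solutions(5)   # no rows: no five-digit divisor
solutions(8)   # several pairs, including a = 52631579, b = 29
```

Calling `solutions(d)` for d from 3 to 10 runs in well under a second, in contrast with the exhaustive loop over all d-digit integers.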

This issue of the Le Monde Science&Médecine leaflet had more interesting entries: one on “LaTeX as the lingua franca of mathematicians” (which presumably made little sense to any reader unfamiliar with LaTeX); one on the use of “big data” tools (like news rover) to analyse data produced by the media; yet another tribune by Marco Zito about the “five sigma” rule used in particle physics (and for the Higgs boson analysis), with the reasonable comment that a large number of repetitions of an experiment is likely to exhibit unlikely events, and an equally reasonable recommendation to support “reproduction experiments” that aim at repeating exceptional phenomena; and a solution to puzzle #848, whose resolution is the same as mine but which mentions the principle of Dirichlet’s drawers to exclude the possibility that all prices are different, a principle I had never heard of…
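
The comment about repetitions can be illustrated with a small computation. The sketch below assumes independent experiments and exactly normal one-sided tails, both simplifications of mine rather than anything stated in the tribune, and the function name `p_at_least_one` is hypothetical.

```r
# Probability that at least one of n independent experiments shows a
# fluctuation beyond z standard deviations (one-sided) when nothing is there.
p_at_least_one <- function(z, n) {
  p <- pnorm(-z)           # tail probability for a single experiment
  -expm1(n * log1p(-p))    # 1 - (1 - p)^n, computed stably for tiny p
}

p_at_least_one(3, 1000)    # a 3-sigma excess somewhere is quite likely
p_at_least_one(5, 1000)    # a 5-sigma excess remains rare
```

Under these assumptions, a thousand repetitions make a 3σ fluctuation more likely than not, while a 5σ one stays well below one in a thousand.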