## double if not exponential

Posted in Books, Kids, Statistics, University life with tags , , , , , , on December 10, 2020 by xi'an

In one of my last quizzes for the year, as the course is about to finish, I asked whether mean or median was the MLE for a double exponential sample of odd size, without checking for the derivation of the result, as I was under the impression it was a straightforward result. Despite being outside exponential families. As my students found it impossible to solve within the allocated 5 minutes, I had a look, could not find an immediate argument (!), and used instead this nice American Statistician note by Robert Norton based on the derivative being the number of observations smaller than θ minus the number of observations larger than θ.  This leads to the result as well as the useful counter-example of a range of MLE solutions when the number of observations is even.

## inverse Gaussian trick [or treat?]

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , , , , , , , , , on October 29, 2020 by xi'an

When preparing my mid-term exam for my undergrad mathematical statistics course, I wanted to use the inverse Gaussian distribution IG(μ,λ) as an example of exponential family and include a random generator question. As shown above by a Fortran computer code from Michael, Schucany and Haas, a simple version can be based on simulating a χ²(1) variate and solving in x the following second degree polynomial equation

$\dfrac{\lambda(x-\mu)^2}{\mu^2 x} = v$

since the left-hand side transform is distributed as a χ²(1) random variable. The smallest root x¹, less than μ, is then chosen with probability μ/(μ+x¹) and the largest one, x²=μ²/x¹ with probability x¹/(μ+x¹). A relatively easy question then, except when one considers asking for the proof of the χ²(1) result, which proved itself to be a harder cookie than expected! The paper usually referred to for the result, Schuster (1968), is quite cryptic on the matter, essentially stating that the above can be expressed as the (bijective) transform of Y=min(X,μ²/X) and that V~χ²(1) follows immediately. I eventually worked out a proof by the “law of the unconscious statistician” [a name I do not find particularly amusing!], but did not include the question in the exam. But I found it fairly interesting that the inverse Gaussian can be generating by “inverting” the above equation, i.e. going from a (squared) Gaussian variate V to the inverse Gaussian variate X. (Even though the name stems from the two cumulant generating functions being inverses of one another.)

## abandoned, one year ago…

Posted in Books, Statistics, University life with tags , , , , on March 17, 2020 by xi'an

## retire statistical significance [follow-up]

Posted in Statistics with tags , , , , , , , , , , , , , , on December 9, 2019 by xi'an

[Here is a brief update sent by my coauthors Valentin, Sander, and Blake on events following the Nature comment “Retire Statistical Significance“.]

In the eight months since publication of the comment and of the special issue of The American Statistician, we are glad to see a rich discussion on internet blogs and in scholarly publications and popular media.Nature

One important indication of change is that since March numerous scientific journals have published editorials or revised their author guidelines. We have selected eight editorials that not only discuss statistics reform but give concrete new guidelines to authors. As you will see, the journals differ in how far they want to go with the reform (all but one of the following links are open access).

1) The New England Journal of Medicine, “New Guidelines for Statistical Reporting in the Journal

2) Pediatric Anesthesia, “Embracing uncertainty: The days of statistical significance are numbered

3) Journal of Obstetric, Gynecologic & Neonatal Nursing, “The Push to Move Health Care Science Beyond p < .05

4) Brain and Neuroscience Advances, “Promoting and supporting credibility in neuroscience

5) Journal of Wildlife Management, “Vexing Vocabulary in Submissions to the Journal of Wildlife Management”

6) Demographic Research, “P-values, theory, replicability, and rigour

7) Journal of Bone and Mineral Research, “New Guidelines for Data Reporting and Statistical Analysis: Helping Authors With Transparency and Rigor in Research

8) Significance, “The S word … and what to do about it

Further, some of you took part in a survey by Tom Hardwicke and John Ioannidis that was published in the European Journal of Clinical Investigation along with editorials by Andrew Gelman and Deborah Mayo.

We replied with a short commentary in that journal, “Statistical Significance Gives Bias a Free Pass

And finally, joining with the American Statistical Association (ASA), the National Institute of Statistical Sciences (NISS) in the United States has also taken up the reform issue.

## Galton’s board all askew

Posted in Books, Kids, R with tags , , , , , , , on November 19, 2019 by xi'an

Since Galton’s quincunx has fascinated me since the (early) days when I saw a model of it as a teenager in an industry museum near Birmingham, I jumped on the challenge to build an uneven nail version where the probabilities to end up in one of the boxes were not the Binomial ones. For instance,  producing a uniform distribution with the maximum number of nails with probability ½ to turn right. And I obviously chose to try simulated annealing to figure out the probabilities, facing as usual the unpleasant task of setting the objective function, calibrating the moves and the temperature schedule. Plus, less usually, a choice of the space where the optimisation takes place, i.e., deciding on a common denominator for the (rational) probabilities. Should it be 2⁸?! Or more (since the solution with two levels also involves 1/3)? Using the functions

evol<-function(P){
Q=matrix(0,7,8)
Q[1,1]=P[1,1];Q[1,2]=1-P[1,1]
for (i in 2:7){
Q[i,1]=Q[i-1,1]*P[i,1]
for (j in 2:i)
Q[i,j]=Q[i-1,j-1]*(1-P[i,j-1])+Q[i-1,j]*P[i,j]
Q[i,i+1]=Q[i-1,i]*(1-P[i,i])
Q[i,]=Q[i,]/sum(Q[i,])}
return(Q)}


and

temper<-function(T=1e3){
bestar=tarP=targ(P<-matrix(1/2,7,7))
temp=.01
while (sum(abs(8*evol(R.01){
for (i in 2:7)
R[i,sample(rep(1:i,2),1)]=sample(0:deno,1)/deno
if (log(runif(1))/temp<tarP-(tarR<-targ(R))){P=R;tarP=tarR}
for (i in 2:7) R[i,1:i]=(P[i,1:i]+P[i,i:1])/2
if (log(runif(1))/temp<tarP-(tarR<-targ(R))){P=R;tarP=tarR}
if (runif(1)<1e-4) temp=temp+log(T)/T}
return(P)}


I first tried running my simulated annealing code with a target function like

targ<-function(P)(1+.1*sum(!(2*P==1)))*sum(abs(8*evol(P)[7,]-1))

where P is the 7×7 lower triangular matrix of nail probabilities, all with a 2⁸ denominator, reaching

60
126 35
107 81 20
104 71 22 0
126 44 26 69 14
61 123 113 92 91 38
109 60 7 19 44 74 50

for 128P. With  four entries close to 64, i.e. ½’s. Reducing the denominator to 16 produced once

8
12 1
13 11 3
16  7  6   2
14 13 16 15 0
15  15  2  7   7  4
8   0    8   9   8  16  8

as 16P, with five ½’s (8). But none of the solutions had exactly a uniform probability of 1/8 to reach all endpoints. Success (with exact 1/8’s and a denominator of 4) was met with the new target

(1+,1*sum(!(2*P==1)))*(.01+sum(!(8*evol(P)[7,]==1)))

imposing precisely 1/8 on the final line. With a solution with 11 ½’s

0.5
1.0 0.0
1.0 0.0 0.0
1.0 0.5 1.0 0.5
0.5 0.5 1.0 0.0 0.0
1.0 0.0 0.5 0.0 0.5 0.0
0.5 0.5 0.5 1.0 1.0 1.0 0.5

and another one with 12 ½’s:

0.5
1.0 0.0
1.0 .375 0.0
1.0 1.0 .625 0.5
0.5  0.5  0.5  0.5  0.0
1.0  0.0  0.5  0.5  0.0  0.5
0.5  1.0  0.5  0.0  1.0  0.5  0.0

Incidentally, Michael Proschan and my good friend Jeff Rosenthal have an 2009 American Statistician paper on another modification of the quincunx they call the uncunx! Playing a wee bit further with the annealing, and using a denominator of 840 let to a 60P  with 13 ½’s out of 28

30
60 0
60 1 0
30 30 30 0
30 30 30 30 30
60  60  60  0  60  0
60  30  0  30  30 60 30