## Archive for replication crisis

## unrejected null [xkcd]

Posted in Statistics with tags evidence, Nature, point null hypotheses, preregistered experiments, replication crisis, xkcd on July 18, 2018 by xi'an## long journey to reproducible results [or not]

Posted in Books, pictures, Statistics, University life with tags ageing, Amy Cuddy, Nature, NYT, power pose, psychology, replication crisis, roundworms, Slate, The New York Times on November 17, 2017 by xi'an**A** rather fascinating article in Nature of last August [hidden under a pile of newspapers at home!]. By Gordon J. Lithgow, Monica Driscoll and Patrick Phillips. About their endeavours to explain for divergent outcomes in the replications [or lack thereof] of an earlier experiment on anti-aging drugs tested on roundworms. Rather than dismissing the failures or blaming the other teams, the above researchers engaged for four years (!) into the titanic and grubby task of understanding the reason(s) for such discrepancies.

Finding that once most causes for discrepancies (like gentle versus rough lab technicians!) were eliminated, there were still two “types” of worms, those short-lived and those long-lived, for reasons yet unclear. “We need to repeat more experiments than we realized” is a welcome conclusion to this dedicated endeavour, worth repeating in different circles. And apparently missing in the NYT coverage by Susan Dominus of the story of Amy Cuddy, a psychologist at the origin of the “power pose” theory that got later disputed for lack of reproducibility. Article which main ideological theme is that Cuddy got singled-out in the replication crisis because she is a woman and because her “power pose” theory is towards empowering women and minorities. Rather than because she keeps delivering the same message, mostly outside academia, despite the lack of evidence and statistical backup. (Dominus’ criticisms of psychologists with “an unusual interest in statistics” and of Andrew’s repeated comments on the methodological flaws of the 2010 paper that started all are thus particularly unfair. A Slate article published after the NYT coverage presents an alternative analysis of this affair. Andrew also posted on Dominus paper, with a subsequent humongous trail of comments!)

## no publication without confirmation

Posted in Books, Statistics, University life with tags clinical trials, Nature, p-values, replication crisis on March 15, 2017 by xi'an

“Our proposal is a new type of paper for animal studies (…) that incorporates an independent, statistically rigorous confirmation of a researcher’s central hypothesis.” (p.409)

**A** comment tribune in Nature of Feb 23, 2017, suggests running clinical trials in three stages towards meeting higher standards in statistical validation. The idea is to impose a preclinical trial run by an independent team following an initial research showing some potential for some new treatment. The three stages are thus (i) to generate hypotheses; (ii) to test hypotheses; (iii) to test broader application of hypotheses (p.410). While I am skeptical of the chances of this proposal reaching adoption (for various reasons, like, what would the incentive of the second team be [of the B team be?!], especially if the hypothesis is dis-proved, how would both teams share the authorship and presumably patenting rights of the final study?, and how could independence be certain were the B team contracted by the A team?), the statistical arguments put forward in the tribune are rather weak (in my opinion). Repeating experiments with a larger sample size and an hypothesis set a priori rather than cherry-picked is obviously positive, but moving from a p-value boundary of 0.05 to one of 0.01 and to a power of 80% is more a cosmetic than a foundational change. As Andrew and I pointed out in our PNAS discussion of Johnson two years ago.

“the earlier experiments would not need to be held to the same rigid standards.” (p.410)

The article contains a vignette on “the maths of predictive value” that makes intuitive sense but only superficially. First, “the positive predictive value is the probability that a positive result is truly positive” (p.411) A statement that implies a distribution of probability on the space of hypotheses, although I see no Bayesian hint throughout the paper. Second, this (ersatz of a) probability is computed by a ratio of the number of positive results under the hypothesis over the total number of positive results. Which does not make much sense outside a Bayesian framework and even then cannot be assessed experimentally or by simulation without defining a distribution of the output under both hypotheses. Simplistic pictures are the above are not necessarily meaningful. And Nature should certainly invest into a statistical editor!

## It’s the selection’s fault not the p-values’… [seminar]

Posted in pictures, Statistics, University life with tags Benjamini, false discovery rate, Jussieu, p-values, replication crisis, seminar, Université Pierre et Marie Curie on February 5, 2016 by xi'anYoav Benjamini will give a seminar talk in Paris next Monday on the above (full title: “*The replicability crisis in science: It’s the selection’s fault not the p-values’*“). (That I will miss for being in Warwick at the time.) With a fairly terse abstract:

I shall discuss the problem of lack of replicability of results in science, and point at selective inference as a statistical root cause. I shall then present a few strategies for addressing selective inference, and their application in genomics, brain research and earlier phases of clinical trials where both primary and secondary endpoints are being used.

**Details:** February 8, 2016, 16h, Université Pierre & Marie Curie, campus Jussieu, salle 15-16-101.