Archive for Significance

retire statistical significance [follow-up]

Posted in Statistics on December 9, 2019 by xi'an

[Here is a brief update sent by my coauthors Valentin, Sander, and Blake on events following the Nature comment “Retire Statistical Significance”.]

In the eight months since publication of the comment and of the special issue of The American Statistician, we are glad to see a rich discussion on internet blogs and in scholarly publications and popular media.

One important indication of change is that since March numerous scientific journals have published editorials or revised their author guidelines. We have selected eight editorials that not only discuss statistics reform but give concrete new guidelines to authors. As you will see, the journals differ in how far they want to go with the reform (all but one of the following links are open access).

1) The New England Journal of Medicine, “New Guidelines for Statistical Reporting in the Journal”

2) Pediatric Anesthesia, “Embracing uncertainty: The days of statistical significance are numbered”

3) Journal of Obstetric, Gynecologic & Neonatal Nursing, “The Push to Move Health Care Science Beyond p < .05”

4) Brain and Neuroscience Advances, “Promoting and supporting credibility in neuroscience”

5) Journal of Wildlife Management, “Vexing Vocabulary in Submissions to the Journal of Wildlife Management”

6) Demographic Research, “P-values, theory, replicability, and rigour”

7) Journal of Bone and Mineral Research, “New Guidelines for Data Reporting and Statistical Analysis: Helping Authors With Transparency and Rigor in Research”

8) Significance, “The S word … and what to do about it”

Further, some of you took part in a survey by Tom Hardwicke and John Ioannidis that was published in the European Journal of Clinical Investigation along with editorials by Andrew Gelman and Deborah Mayo.

We replied with a short commentary in that journal, “Statistical Significance Gives Bias a Free Pass”.

And finally, joining with the American Statistical Association (ASA), the National Institute of Statistical Sciences (NISS) in the United States has also taken up the reform issue.

a statistic with consequences

Posted in pictures, Statistics on July 18, 2019 by xi'an

In the latest Significance, there was a flyer with some member updates, an important one being that Sylvia Richardson had been elected the next president of the Royal Statistical Society. Congratulations to my friend Sylvia! Another item was that the publication of the 2018 RSS Statistic of the Year has led an Australian water company to switch from plastic to aluminum. Hmm, what about switching to nothing and supporting a use-your-own bottle approach? While it is correct that aluminum cans can be made of 100% recycled aluminum, this water company does not appear to make any concerted effort to ensure its cans are made of recycled aluminum or to increase the recycling rate for aluminum in Australia towards those of Brazil (92%) or Japan (86%). (Another shocking statistic that could have been added to the 90.5% non-recycled plastic waste [in the World?] is that a water bottle consumes the equivalent of one-fourth of its contents in oil to produce.) Another water company, this one in the US, still promotes water bottles as “one of the most effective and inert carbon capture & sequestration methods”..! There is no boundary for green-washing.

impossible estimation

Posted in Books, Statistics on January 29, 2018 by xi'an

Outside its Sciences & Médecine section that I most often read, Le Monde published last weekend a tribune by the anthropologist Michel Naepels [who kindly replied to my email on his column] on the impossibility of evaluating the number of deaths in Congo due to the political instability (a weak and undemocratic state fighting armed rebel groups), for lack of a reliable sample. There is a huge gap between two estimates of this number, from 200,000 to 5.4 million excess deaths. In the latter, IRC states that “only 0.4 percent of all deaths across DR Congo were attributed directly to violence”. Still, diverging estimates do not mean numbers are impossible to produce, just that more elaborate methods like those developed by Rebecca Steorts for Syrian deaths must be investigated, which requires more means than those available to the local States (assuming they are interested in the answer) or to NGOs. This also raises the question of whether or not “excess deaths” has an absolute meaning, since it refers to a hypothetical state of the world that has not taken place.

On the same page, another article written by geographers cast doubt on predictive policing software, not used in France, though not as clearly as the Significance article by Kristian Lum and William Isaac last year.

p-values and decision-making [reposted]

Posted in Books, Statistics, University life on August 30, 2017 by xi'an

In a letter to Significance about a review of Robert Matthews’s book, Chancing it, Nicholas Longford recalls a few basic facts about p-values and decision-making earlier made by Dennis Lindley in Making Decisions. Here are some excerpts, worth repeating in the light of the 0.005 proposal:

“A statement of significance based on a p-value is a verdict that is oblivious to consequences. In my view, this disqualifies hypothesis testing, and p-values with it, from making rational decisions. Of course, the p-value could be supplemented by considerations of these consequences, although this is rarely done in a transparent manner. However, the two-step procedure of calculating the p-value and then incorporating the consequences is unlikely to match in its integrity the single-stage procedure in which we compare the expected losses associated with the two contemplated options.”

“At present, [Lindley’s] decision-theoretical approach is difficult to implement in practice. This is not because of any computational complexity or some problematic assumptions, but because of our collective reluctance to inquire about the consequences – about our clients’ priorities, remits and value judgements. Instead, we promote a culture of “objective” analysis, epitomised by the 5% threshold in significance testing. It corresponds to a particular balance of consequences, which may or may not mirror our clients’ perspective.”

“The p-value and statistical significance are at best half-baked products in the process of making decisions, and a distraction at worst, because the ultimate conclusion of a statistical analysis should be a proposal for what to do next in our clients’ or our own research, business, production or some other agenda. Let’s reflect and admit how frequently we abuse hypothesis testing by adopting (sometimes by stealth) the null hypothesis when we fail to reject it, and therefore do so without any evidence to support it. How frequently we report, or are party to reporting, the results of hypothesis tests selectively. The problem is not with our failing to adhere to the convoluted strictures of a popular method, but with the method itself. In the 1950s, it was a great statistical invention, and its popularisation later on a great scientific success. Alas, decades later, it is rather out of date, like the steam engine. It is poorly suited to the demands of modern science, business, and society in general, in which the budget and pocketbook are important factors.”
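To make the contrast in these excerpts concrete, here is a minimal sketch of the single-stage comparison of expected losses between two contemplated options. Everything in it is an assumption of mine rather than Longford's or Lindley's formulation: the two actions ("keep" or "reject" the null), a posterior probability of the null supplied by the user, and two loss values for the two possible errors (with zero loss for a correct decision). The function names are hypothetical.

```python
def expected_losses(prob_null, loss_false_reject, loss_false_accept):
    """Return (expected loss of keeping the null, expected loss of rejecting it).

    prob_null: assumed posterior probability that the null is true.
    loss_false_reject: loss incurred when rejecting a true null.
    loss_false_accept: loss incurred when keeping a false null.
    Correct decisions are assumed to cost nothing.
    """
    loss_keep = (1.0 - prob_null) * loss_false_accept
    loss_reject = prob_null * loss_false_reject
    return loss_keep, loss_reject


def best_action(prob_null, loss_false_reject, loss_false_accept):
    """Pick the action with the smaller expected loss."""
    loss_keep, loss_reject = expected_losses(
        prob_null, loss_false_reject, loss_false_accept
    )
    return "reject" if loss_reject < loss_keep else "keep"


def rejection_threshold(loss_false_reject, loss_false_accept):
    """Probability of the null below which rejecting has the smaller expected loss."""
    return loss_false_accept / (loss_false_reject + loss_false_accept)


# Illustrative numbers only: the same evidence leads to different actions
# once the consequences of the two errors differ.
print(best_action(0.2, 1.0, 1.0))
print(best_action(0.2, 10.0, 1.0))
```

In this toy setting, any fixed cut-off on the probability of the null amounts to one particular ratio of the two losses, which is the point made in the second excerpt about the 5% threshold corresponding to a particular balance of consequences.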

ABC at sea and at war

Posted in Books, pictures, Statistics, Travel on July 18, 2017 by xi'an

While preparing crêpes at home last night, I browsed through the most recent issue of Significance and, among many goodies, I spotted an article by McKay and co-authors discussing the simulation of a British vs. German naval battle from the First World War that I had never heard of, the Battle of the Dogger Bank. The article was illustrated by a few historical pictures, but I quickly came across a more statistical description of the problem, which was not about creating wargames and alternate realities but rather about inferring the likelihood of the actual outcome, i.e., whether the naval battle outcome [which could be seen as a British victory, ending with a score of 0 to 1 in sunk boats] was a lucky strike or to be expected. And the method behind solving this question was indeed both Bayesian and ABC-esque! I did not read the longer paper by McKay et al. (hard to do while flipping crêpes!) but the description in Significance was clear enough to understand that the six summary statistics used in this ABC implementation were the number of shots, hits, and lost turrets for both sides. (The answer to the original question is that indeed the British fleet was lucky to keep all its boats afloat. But it is also unlikely another score would have changed the outcome of WWI.) [As I found in this other history paper, ABC seems quite popular in historical inference! And there is another completely unrelated arXived paper with main title The Fog of War…]
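For readers unfamiliar with ABC, here is a minimal rejection-ABC sketch on a toy problem. It is not the model of McKay et al.: I assume a single summary statistic (a number of hits out of a number of shots), a uniform prior on the hit probability, a Binomial simulator, and an integer tolerance on the distance between simulated and observed hits. The function name and all the numbers are hypothetical and bear no relation to the actual battle data.

```python
import random


def abc_rejection(observed_hits, n_shots, n_draws=20000, tolerance=1, seed=0):
    """Toy rejection ABC for a hit probability (assumed Binomial model).

    Draw a hit probability from a uniform prior, simulate the number of
    hits, and keep the draw when the simulated summary statistic is within
    `tolerance` of the observed one.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # draw from the uniform prior on [0, 1)
        # simulate n_shots independent shots, each hitting with probability theta
        simulated_hits = sum(rng.random() < theta for _ in range(n_shots))
        if abs(simulated_hits - observed_hits) <= tolerance:
            accepted.append(theta)
    return accepted


# Made-up data: 5 hits out of 50 shots.
sample = abc_rejection(observed_hits=5, n_shots=50)
posterior_mean = sum(sample) / len(sample)
print(len(sample), round(posterior_mean, 3))
```

The accepted draws form an approximate posterior sample; with several summary statistics, as in the six used in the article, the scalar comparison would be replaced by a distance between vectors of summaries.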