One statistical analysis must not rule them all

Posted in Books, pictures, Statistics, University life on May 31, 2022 by xi'an

E.J. (Wagenmakers), along with co-authors, published a (long) comment in Nature, rewarded by an illustration by David Parkins! It is about the over-confidence often carried by (single) statistical analyses, hence a call for comparison across different datasets, different models, and different techniques (beyond different teams).

“To gauge the robustness of their conclusions, researchers should subject the data to multiple analyses; ideally, these would be carried out by one or more independent teams. We understand that this is a big shift in how science is done, that appropriate infrastructure and incentives are not yet in place, and that many researchers will recoil at the idea as being burdensome and impractical. Nonetheless, we argue that the benefits of broader, more-diverse approaches to statistical inference could be so consequential that it is imperative to consider how they might be made routine.”

If COVID-19 had one impact on the general public's perception of modelling, it is that, to quote Alfred Korzybski, the map is not the territory, i.e., the model is not reality. Hence, the outcome of a model-based analysis, including its uncertainty assessment, depends on the chosen model and does not include the bias due to this choice. This bias is much more complex to ascertain, in a sort of “things that we do not know we do not know” paradigm… In other words, while we know that all models are wrong, we do not know how wrong each model is. Except that they disagree with one another in experiments like the above.

“Less understood is how restricting analyses to a single technique effectively blinds researchers to an important aspect of uncertainty, making results seem more precise than they really are.”
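As a toy illustration of this point (not taken from the Nature comment), the sketch below runs three standard location estimates on the same simulated sample and reports a bootstrap interval for each. The data, the contamination rate, and the choice of estimators are all assumptions made for the example; the only message is that each interval reflects within-analysis variability, while the spread between analyses is reported nowhere.

```python
import random
import statistics


def trimmed_mean(xs, prop=0.1):
    """Mean after dropping the lowest and highest `prop` fraction of values."""
    xs = sorted(xs)
    k = int(len(xs) * prop)
    return statistics.fmean(xs[k:len(xs) - k])


def bootstrap_interval(xs, estimator, n_boot=2000, level=0.95, rng=None):
    """Percentile bootstrap interval for `estimator` applied to `xs`."""
    rng = rng or random.Random(0)
    n = len(xs)
    reps = sorted(estimator(rng.choices(xs, k=n)) for _ in range(n_boot))
    lo = reps[int((1 - level) / 2 * n_boot)]
    hi = reps[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


def multi_analysis(xs, seed=0):
    """Apply several (hypothetical) analyses to the same data set."""
    analyses = {
        "mean": statistics.fmean,
        "trimmed mean": trimmed_mean,
        "median": statistics.median,
    }
    out = {}
    for name, estimator in analyses.items():
        lo, hi = bootstrap_interval(xs, estimator, rng=random.Random(seed))
        out[name] = (estimator(xs), lo, hi)
    return out


# Simulated data (purely illustrative): mostly N(0,1), plus a few outliers.
rng = random.Random(42)
data = [rng.gauss(0, 1) for _ in range(95)] + [rng.gauss(8, 1) for _ in range(5)]

results = multi_analysis(data)
for name, (est, lo, hi) in results.items():
    print(f"{name:>12}: estimate {est:6.3f}, 95% interval ({lo:6.3f}, {hi:6.3f})")

# Spread between analyses, which no single interval above accounts for.
estimates = [est for est, _, _ in results.values()]
between_range = max(estimates) - min(estimates)
print(f"range of estimates across analyses: {between_range:.3f}")
```

Whether and how such a between-analysis range should be turned into a probabilistic statement is precisely the open question.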

The difficulty with E.J.’s proposal is to set a framework for a range of statistical analyses. To what extent should one seek a different model or a different analysis? How can we weight the multiple analyses? Which probabilistic meaning can we attach to the uncertainty between analyses? How quickly will opportunistic researchers learn to play against the house and pretend to objectivity? Isn’t statistical inference already equipped to handle multiple models?

statistics for making decisions [book review]

Posted in Statistics, Books on March 7, 2022 by xi'an

I bought this book [or more precisely received it from CRC Press as a ({prospective} book) review reward] as I was interested in the author’s perspectives on actual decision making (and unaware of the earlier Statistical Decision Theory book he had written in 2013). It is intended for a postgraduate semester course and “not for a beginner in statistics”. Exercises with solutions are included in each chapter (with some R code in the solutions). From Chapter 4 onwards, the “Further reading suggestions” primarily refer to papers and books written by the author, as these chapters are based on his earlier papers.

“I regard hypothesis testing as a distraction from and a barrier to good statistical practice. Its ritualised application should be resisted from the position of strength, by being well acquainted with all its theoretical and practical aspects. I very much hope (…) that the right place for hypothesis testing is in a museum, next to the steam engine.”

The first chapter exposes the shortcomings of hypothesis testing for conducting decision making, in particular its ignoring the consequences of the decisions. A perspective with which I agree, but I fear the subsequent developments found in the book remain too formalised to be appealing, reverting to the over-simplification found in Neyman-Pearson theory. The second chapter is somewhat superfluous for a book assuming a prior exposure to statistics, with a quick exposition of the frequentist, Bayesian, and … fiducial paradigms, and with estimators being first defined without referring to a specific loss function. And I find the presentation of the fiducial approach rather shaky (if usual), especially when considering that the fiducial perspective is to be used as default Bayes in the subsequent chapters. I also do not understand the notation (p.31)

$P(\hat\theta\,\ldots)$

outside of a Bayesian (or fiducial?) framework. (I did not spot typos aside from the traditional “the the” duplicates, with at least six occurrences!)

The aforementioned subsequent chapters are not particularly enticing as they cater to artificial loss functions and engage in detailed derivations that do not seem essential. At times they appear to be nothing more than simple calculus exercises. The very construction of the loss function, which I deem critical to implement statistical decision theory, is mostly bypassed. The overall setting is also frighteningly unidimensional: in the parameter, in the statistic, and in the decision. Covariates only appear in the final chapter, which appears to have very little connection with decision making in that the loss function there is the standard quadratic loss, used to achieve the optimal composition of estimators, rather than selecting the best model. The book is also lacking in practical or realistic illustrations.
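To make concrete why the loss function matters more than the test, here is a minimal sketch (not taken from the book) of a two-action decision under a piecewise-linear loss. It assumes that the uncertainty on a scalar effect θ is summarised by a N(m, s²) distribution, posterior or fiducial. The function names and the cost parameters `cost_false_act` and `cost_missed_gain` are hypothetical, and it is their elicitation that is the hard part.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for its cdf, pdf and quantiles


def expected_positive_part(m, s):
    """E[max(theta, 0)] when theta ~ N(m, s^2) (closed form)."""
    z = m / s
    return m * STD.cdf(z) + s * STD.pdf(z)


def expected_losses(m, s, cost_false_act, cost_missed_gain):
    """Expected loss of each action under an assumed piecewise-linear loss.

    Acting costs cost_false_act * |theta| when theta < 0;
    waiting costs cost_missed_gain * theta when theta > 0.
    """
    loss_act = cost_false_act * expected_positive_part(-m, s)
    loss_wait = cost_missed_gain * expected_positive_part(m, s)
    return {"act": loss_act, "wait": loss_wait}


def decide(m, s, cost_false_act, cost_missed_gain):
    """Pick the action with the smaller expected loss."""
    losses = expected_losses(m, s, cost_false_act, cost_missed_gain)
    return min(losses, key=losses.get)


def one_sided_test(m, s, alpha=0.05):
    """Test-based rule: act only if H0: theta <= 0 is rejected at level alpha."""
    return "act" if m / s > STD.inv_cdf(1 - alpha) else "wait"


# Same evidence (m = 1, s = 1), three different rules:
print(one_sided_test(1.0, 1.0))                                  # ignores the costs
print(decide(1.0, 1.0, cost_false_act=1, cost_missed_gain=1))    # symmetric costs
print(decide(1.0, 1.0, cost_false_act=20, cost_missed_gain=1))   # costly mistakes
```

The test returns the same answer whatever the stakes, while the expected-loss rule switches action once the cost ratio crosses a threshold determined by m and s.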

“With a bit of immodesty and a tinge of obsession, I would like to refer to the principal theme of this book as a paradigm, ascribing to it as much importance and distinction as to the frequentist and Bayesian paradigms”

The book concludes with a short postscript (pp.247-249) reproducing the introductory paragraphs about the ill-suited nature of hypothesis testing for decision-making. Which would have been better supported by a stronger engagement in eliciting loss functions and quantifying the consequences of actions from the clients…

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Book Review section in CHANCE.]