## Archive for Bayesian Choice

## Bayesians at the helm!

Posted in pictures, Statistics, University life with tags Alan Turing, Alan Turing Institute, Bayesian Choice, London, Royal Statistical Society, RSS, United Kingdom, University of Cambridge, University of Warwick on October 10, 2021 by xi'an

**J**ust read the announcement that my friend (and former colleague at Warwick U) Mark Girolami became the Chief Scientist at The Alan Turing Institute, joining forces with Adrian Smith, currently Director and Chief Executive of the Turing Institute, into a Bayesian leadership!

## statistical illiteracy

Posted in Statistics with tags Bayesian Choice, Bruno de Finetti, COVID-19, Luminy, not a book review, pandemic, Poisson point process, Poisson process, public opinion, statistical illiteracy, subjective probability, The Guardian on October 27, 2020 by xi'an

**A**n opinion tribune appeared in the Guardian today about the importance of statistical literacy in these COVIdays, entitled “Statistical illiteracy isn’t a niche problem. During a pandemic, it can be fatal”, by Carlo Rovelli (a physics professor on Luminy campus). While well-intended, it is not particularly helpful. For instance, the tribune starts with a story of a cluster of a rare disease happening in a lab, along with the warning that [Poisson] clusters also occur with uniform sampling. But being knowledgeable about the Poisson process may help in reducing the psychological stress within the lab only if the cluster size is compatible with the prevalence of the disease in the neighbourhood. Obviously, a poor understanding of randomness and statistical tools has not helped with the handling of the pandemic by politicians, decision-makers, civil servants and doctors (although I would have added the fundamental misconception about scientific models which led most people to confuse the map with the territory and later cry wolf…)
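To make the compatibility check concrete, here is a minimal sketch of the computation involved, assuming each lab's case count is an independent Poisson variable with a common mean. The number of labs, the mean and the cluster size are hypothetical values picked for illustration, not figures from the tribune.

```python
import math

def poisson_tail(lam, k):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k))
    return 1.0 - below

# Hypothetical setting: many labs, each expecting few cases on average.
n_labs = 10_000      # assumed number of comparable labs
mean_cases = 0.1     # assumed expected number of cases per lab
cluster_size = 3     # size of the "cluster" that causes alarm

# Chance that one given lab sees such a cluster.
p_single = poisson_tail(mean_cases, cluster_size)
# Chance that at least one lab somewhere sees it, by independence.
p_somewhere = 1.0 - (1.0 - p_single) ** n_labs

print(f"one given lab: {p_single:.2e}, some lab: {p_somewhere:.2f}")
```

Under these assumed values a cluster in a given lab is very unlikely, while a cluster in some lab is more likely than not. Lowering `mean_cases` by a factor of a hundred makes both probabilities drop by several orders of magnitude, which is the sense in which the Poisson argument only reassures when the cluster size is compatible with the local prevalence.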

Rovelli also cites Bruno de Finetti as “the key to understanding probability”, as a representation of one’s beliefs rather than a real thing. While I agree with this Bayesian perspective, I am unsure it will percolate well enough with the Guardian audience. And bring more confidence in the statistical statements made by experts…

It is only when I finished reading the column that I realised it was adapted from a book soon to appear by the author. And felt slightly cheated. [Obviously, I did not read it so this is NOT a book review!]

## a glaringly long explanation

Posted in Statistics with tags ABC, Bayesian Choice, cross validated, exponential families, proof, socks, sufficient statistics, teaching, Thomas Bayes' portrait, undergraduates, Université Paris Dauphine on December 19, 2018 by xi'an

**I**t is funny that, when I am teaching the rudiments of Bayesian statistics to my undergraduate students in Paris-Dauphine, including ABC via Rasmus’ socks, specific questions about the book (The Bayesian Choice) start popping up on X validated! Last week was about the proof that ABC is exact when the tolerance is zero and the summary statistic is sufficient.
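The exactness claim can be checked numerically on a toy case. The sketch below is not the socks example: it assumes a Bernoulli sample with a uniform prior on the success probability, for which the sum is sufficient and the exact posterior is a Beta distribution. The function name and the numerical settings are hypothetical.

```python
import random

def abc_zero_tolerance(observed_sum, n, n_proposals, rng):
    """ABC rejection sampler with tolerance zero on the sufficient statistic."""
    accepted = []
    for _ in range(n_proposals):
        theta = rng.random()  # draw from the Uniform(0, 1) prior
        # Simulate n Bernoulli(theta) trials and keep only their sum.
        simulated_sum = sum(rng.random() < theta for _ in range(n))
        if simulated_sum == observed_sum:  # exact match, zero tolerance
            accepted.append(theta)
    return accepted

rng = random.Random(2018)
n, s = 10, 3  # assumed data: 3 successes out of 10 trials
draws = abc_zero_tolerance(s, n, 20_000, rng)

abc_mean = sum(draws) / len(draws)
exact_mean = (1 + s) / (2 + n)  # mean of the Beta(1 + s, 1 + n - s) posterior
print(f"ABC mean: {abc_mean:.3f}, exact posterior mean: {exact_mean:.3f}")
```

Since the accepted values are draws from the exact posterior in this discrete setting, the two means differ only by Monte Carlo error.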

This week is about conjugate distributions for exponential families (not that there are many others!). Which led me to explain both the validation of the conjugacy and the derivation of the posterior expectation of the mean of the natural sufficient statistic in far more detail than in the book itself. Hopefully in a profitable way.
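A compact sketch of that argument, in generic notation that is an assumption of this sketch rather than necessarily that of the book or the answer: $\psi$ is the log-normalising function, $(\mu,\lambda)$ are the hyperparameters of the conjugate prior, and the prior density is assumed to vanish on the boundary of the natural parameter space.

```latex
\begin{align*}
f(x\mid\theta) &= h(x)\,\exp\{\theta\cdot x-\psi(\theta)\},
  \qquad \mathbb{E}_\theta[X]=\nabla\psi(\theta), \\
\pi(\theta\mid\mu,\lambda) &\propto \exp\{\theta\cdot\mu-\lambda\psi(\theta)\}, \\
\pi(\theta\mid x) &\propto \exp\{\theta\cdot(\mu+x)-(\lambda+1)\psi(\theta)\}
  = \pi(\theta\mid\mu+x,\lambda+1), \\
0=\int\nabla_\theta\,\pi(\theta\mid\mu,\lambda)\,\mathrm{d}\theta
  &= \int\{\mu-\lambda\nabla\psi(\theta)\}\,\pi(\theta\mid\mu,\lambda)\,\mathrm{d}\theta
  \;\Longrightarrow\;
  \mathbb{E}^{\pi}[\nabla\psi(\theta)]=\frac{\mu}{\lambda}, \\
\mathbb{E}^{\pi}[\nabla\psi(\theta)\mid x] &= \frac{\mu+x}{\lambda+1}.
\end{align*}
```

The third line is the conjugacy itself (the posterior stays in the prior family with updated hyperparameters), and the last line follows by applying the fourth to the updated pair $(\mu+x,\lambda+1)$.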

## my book available for a mere $1,091.50

Posted in Books with tags Amazon, Bayesian Choice, book sales, textbook on May 1, 2016 by xi'an

**A**s I was looking at a link to my Bayesian Choice book on Amazon, I found that one site offered it for the modest sum of $1,091.50, a very slight increase when compared with the reference price of $59.95… I do wonder at the reason (scam?) behind this offer as such a large price is unlikely to attract any potential buyer to the site. (Obviously, if *you* are interested in this price, feel free to contact me!)

## the philosophical importance of Stein’s paradox [a reply from the authors]

Posted in Books, pictures, Statistics, University life with tags Bayesian Analysis, Bayesian Choice, Charles Stein, decision theory, frequentist inference, James-Stein estimator, loss functions, philosophy of sciences, Stein effect, Stein's phenomenon, Stephen Stigler on January 15, 2016 by xi'an

*[In the wake of my comment on this paper written by three philosophers of Science, I received this reply from Olav Vassend.]*

Thank you for reading our paper and discussing it on your blog! Our purpose with the paper was to give an introduction to Stein’s phenomenon for a philosophical audience; it was not meant to — and probably will not — offer a new and interesting perspective for a statistician who is already very familiar with Stein’s phenomenon and its extensive literature.

I have a few more specific comments:

1. We don’t rechristen Stein’s phenomenon as “holistic pragmatism.” Rather, holistic pragmatism is the attitude to frequentist estimation that we think is underwritten by Stein’s phenomenon. Since MLE is sometimes admissible and sometimes not, depending on the number of parameters estimated, the researcher has to take into account his or her goals (whether total accuracy or individual-parameter accuracy is more important) when picking an estimator. To a statistician, this might sound obvious, but to philosophers it’s a pretty radical idea.

2. *“The part connecting Stein with Bayes again starts on the wrong foot, since it is untrue that any shrinkage estimator can be expressed as a Bayes posterior mean. This is not even true for the original James-Stein estimator, i.e., it is not a Bayes estimator and cannot be a Bayes posterior mean.”*

That seems to depend on what you mean by a “Bayes estimator.” It is possible to have an empirical Bayes prior (constructed from the sample) whose posterior mean is identical to the original James-Stein estimator. But if you don’t count empirical Bayes priors as Bayesian, then you are right.

3. *“And to state that improper priors “integrate to a number larger than 1” and that “it’s not possible to be more than 100% confident in anything”… And to confuse the Likelihood Principle with the prohibition of data dependent priors. And to consider that the MLE and any shrinkage estimator have the same expected utility under a flat prior (since, if they had, there would be no Bayes estimator!).”*

I’m not sure I completely understand your criticisms here. First, as for the relation between the LP and data-dependent priors — it does seem to me that the LP precludes the use of data-dependent priors. If you use data from an experiment to construct your prior, then — contrary to the LP — it will not be true that all the information provided by the experiment regarding which parameter is true is contained in the likelihood function, since some of the information provided by the experiment will also be in your prior.

Second, as to our claim that the ML estimator has the same expected utility (under the flat prior) as a shrinkage estimator that dominates it: we incorporated this claim into our paper because it was an objection made by a statistician who read and commented on our paper. Are you saying the claim is false? If so, we would certainly like to know so that we can revise the paper to make it more accurate.

4. I was aware of Rubin’s idea that priors and utility functions (supposedly) are non-separable, but I didn’t (and don’t) quite see the relevance of that idea to Stein estimation.

5. *“Similarly, very little of substance can be found about empirical Bayes estimation and its philosophical foundations.”*

What we say about empirical Bayes priors is that they cannot be interpreted as degrees of belief; they are just tools. It will be surprising to many philosophers that priors are sometimes used in such an instrumentalist fashion in statistics.

6. The reason why we made a comparison between Stein estimation and AIC was two-fold: (a) for sociological reasons, philosophers are much more familiar with model selection than they are with, say, the LASSO or other regularized regression methods. (b) To us, it’s precisely because model selection and estimation are such different enterprises that it’s interesting that they have such a deep connection: despite being very different, AIC and shrinkage both rely on a bias-variance trade-off.

7. *“I also object to the envisioned possibility of a shrinkage estimator that would improve every component of the MLE (in a uniform sense) as it contradicts the admissibility of the single component MLE!”*

I don’t think our suggestion here contradicts the admissibility of single component MLE. The idea is just that if we have data D and D’ about parameters φ and φ’, then the estimates of both φ and φ’ can sometimes be improved if the estimation problems are lumped together and a shrinkage estimator is used. This doesn’t contradict the admissibility of MLE, because MLE is still admissible on each of the data sets for each of the parameters.

Again, thanks for reading the paper and for the feedback—we really do want to make sure our paper is accurate, so your feedback is much appreciated. Lastly, I apologize for the length of this comment.

Olav Vassend
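*[A numerical footnote to the dependence on the number of parameters mentioned in point 1 of the reply: the sketch below compares by simulation the total squared-error risk of the MLE and of the original James-Stein estimator for a normal mean vector with identity covariance. The dimension, the true means and the number of replications are hypothetical choices made for illustration.]*

```python
import random

def james_stein(x):
    """Original James-Stein estimator, shrinking x towards the origin."""
    p = len(x)
    norm2 = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) / norm2
    return [shrink * v for v in x]

def simulated_risks(theta, n_rep, rng):
    """Monte Carlo total squared-error risks of the MLE and of James-Stein."""
    loss_mle = loss_js = 0.0
    for _ in range(n_rep):
        # One observation x ~ N(theta, I); the MLE of theta is x itself.
        x = [rng.gauss(t, 1.0) for t in theta]
        js = james_stein(x)
        loss_mle += sum((a - t) ** 2 for a, t in zip(x, theta))
        loss_js += sum((a - t) ** 2 for a, t in zip(js, theta))
    return loss_mle / n_rep, loss_js / n_rep

rng = random.Random(2016)
p = 10  # assumed dimension, any p >= 3 shows the effect
risk_mle_0, risk_js_0 = simulated_risks([0.0] * p, 5_000, rng)
risk_mle_2, risk_js_2 = simulated_risks([2.0] * p, 5_000, rng)

print(f"theta = 0: MLE {risk_mle_0:.2f}, James-Stein {risk_js_0:.2f}")
print(f"theta = 2: MLE {risk_mle_2:.2f}, James-Stein {risk_js_2:.2f}")
```

The gain is largest when the true mean vector sits at the shrinkage target and shrinks as it moves away. It concerns the total risk only, not the risk on each individual component.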