Today was my last Reading Seminar class and the concluding paper chosen by the student was Tukey’s “The future of data analysis”, a 1962 Annals of Math. Stat. paper. Unfortunately, reading this paper required much more maturity and background than the student could muster, which is the reason why this last presentation is not posted on this page… Given the global and a-theoretical perspective of the paper, it was quite difficult to interpret without delving further into Tukey’s work and without a proper knowledge of what Data Analysis was in the 1960s. (The love affair of French statisticians with data analysis was then at its apex, but it has very much receded since then!) Being myself unfamiliar with this paper, and judging mostly from the sentences pasted by the student in his slides, I cannot tell how much of the paper is truly visionary and how much is cheap talk: focussing on trimmed and winsorized means does not sound like offering a very wide scope for data analysis… I liked the quote “It’s easier to carry a slide rule than a desk computer, to say nothing of a large computer”! (As well as the quote from Asimov, “The sound of panting”…) (Still, I am unsure I will keep the paper within the list next year!)
Overall, despite a rather disappointing lower tail of the distribution of the talks, I am very happy with the way the seminar proceeded this year and with the efforts made by the students to assimilate the papers and to acquire the necessary presentation skills, including, for most of them, building a background in LaTeX and Beamer. I thus think almost all students will pass this course and do hope those skills will be profitable for their future studies…
Today’s classics seminar was rather special as two students were scheduled to talk. It was even more special as both students had picked (without informing me) the very same article by Berger and Sellke (1987), Testing a point-null hypothesis: the irreconcilability of p-values and evidence, on the (deep?) discrepancies between frequentist p-values and Bayesian posterior probabilities, in connection with the Lindley-Jeffreys paradox. Here are Amira Mziou’s slides:
and Jiahuan Li’s slides:
It was a good exercise to listen to both talks, seeing two perspectives on the same paper, and I hope the students in the class got the idea(s) behind the paper. As you can see, there were obviously repetitions between the talks, including the presentation of the lower bounds for all classes considered by Jim Berger and Tom Sellke, and the overall motivation for the comparison. Maybe as a consequence of my criticisms of the previous talk, both Amira and Jiahuan put some stress on the definitions, in order to formally set the background of the paper. (I love the poetic line: “To prevent having a non-Bayesian reality”, although I am not sure what Amira meant by this…)
I like the connection made therein with the Lindley-Jeffreys paradox, since this is the core idea behind the paper, and because I am currently writing a note about the paradox. Obviously, it was hard for the students to take a more remote stand on the reason for the comparison, from questioning the relevance of testing point null hypotheses and of comparing the numerical values of a p-value with a posterior probability, to expecting asymptotic agreement between a p-value and a Bayes factor when both are convergent quantities, to setting the same weight on both hypotheses, to the ad-hockery of using a drift on one to equate the p-value with the Bayes factor, to using specific priors like Jeffreys’s (which has the nice feature that it corresponds to g=n in the g-prior, as discussed in the new edition of Bayesian Core). The students also failed to remark on the fact that the developments were only for real parameters, as the phenomenon (that the lower bound on the posterior probabilities is larger than the p-value) does not happen so universally in larger dimensions. I would have expected more discussion from the floor, but we still got good questions and comments on a) why 0.05 matters and b) why comparing p-values and posterior probabilities is relevant. The next paper to be discussed will be Tukey’s piece on the future of data analysis.
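For readers who want to see the phenomenon numerically, here is a minimal sketch of two of the lower bounds, under assumptions that are mine rather than taken from the students' slides: a normal observation with known variance, a two-sided test of a point null summarised by the standardised statistic t = |z|, and prior weight 1/2 on the null. The function names are hypothetical. The first bound is taken over all priors on the alternative, the second over normal priors centred at the null value (this closed form only applies for t > 1).

```python
import math

def two_sided_p_value(t):
    """Two-sided normal p-value for the standardised statistic t = |z|."""
    cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return 2.0 * (1.0 - cdf)

def posterior_from_bayes_factor(b, prior_null=0.5):
    """Posterior probability of the null given a Bayes factor b in its favour."""
    prior_odds_alt = (1.0 - prior_null) / prior_null
    return 1.0 / (1.0 + prior_odds_alt / b)

def bound_all_priors(t, prior_null=0.5):
    """Lower bound over all priors on the alternative: the prior puts all its
    mass at the observed value, so the Bayes factor is exp(-t^2/2)."""
    return posterior_from_bayes_factor(math.exp(-t * t / 2.0), prior_null)

def bound_normal_priors(t, prior_null=0.5):
    """Lower bound over normal priors centred at the null value (t > 1):
    the minimal Bayes factor is sqrt(e) * t * exp(-t^2/2)."""
    if t <= 1.0:
        raise ValueError("closed form only used here for t > 1")
    b = math.sqrt(math.e) * t * math.exp(-t * t / 2.0)
    return posterior_from_bayes_factor(b, prior_null)

if __name__ == "__main__":
    for t in (1.645, 1.96, 2.576, 3.291):
        print(t, two_sided_p_value(t), bound_all_priors(t), bound_normal_priors(t))
```

At the usual significance levels, both bounds sit well above the p-value, which is the discrepancy the paper insists upon. Note that this sketch only covers the real-parameter case.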
In today’s classics seminar, my student Bassoum Abou presented the 1981 paper written by Charles Stein for the Annals of Statistics, Estimating the mean of a normal distribution, recapitulating the advances he made on Stein estimators, minimaxity and his unbiased estimator of risk. Unfortunately, this student missed a lot about the paper and did not introduce the necessary background… So I am unsure of how much the class got from this great paper… Here are his slides (watch out for typos!)
Historically, this paper is important as it is one of the very few papers published by Charles Stein in a major statistics journal, the other publications being made in conference proceedings. It contains the derivation of the unbiased estimator of the risk, along with comparisons with posterior expected loss.
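To give an idea of what the unbiased estimator of risk looks like in the simplest case, here is a small simulation sketch, with assumptions of my own: an observation x from a p-dimensional normal with identity covariance, quadratic loss, and the James-Stein estimator (1 - (p-2)/||x||²) x, for which the unbiased estimate reduces to p - (p-2)²/||x||². The function names and the choice of mean vector are hypothetical and only meant for illustration.

```python
import random

def james_stein(x):
    """James-Stein shrinkage of x towards zero (identity covariance, p >= 3)."""
    p = len(x)
    norm2 = sum(xi * xi for xi in x)
    shrink = 1.0 - (p - 2) / norm2
    return [shrink * xi for xi in x]

def unbiased_risk_estimate(x):
    """Stein's unbiased estimate of the quadratic risk of James-Stein:
    p - (p-2)^2 / ||x||^2, which depends on the data only."""
    p = len(x)
    norm2 = sum(xi * xi for xi in x)
    return p - (p - 2) ** 2 / norm2

def simulate(theta, n_rep=20000, seed=0):
    """Monte Carlo averages of the actual loss and of the unbiased estimate."""
    rng = random.Random(seed)
    total_loss = 0.0
    total_estimate = 0.0
    for _ in range(n_rep):
        x = [rng.gauss(t, 1.0) for t in theta]
        d = james_stein(x)
        total_loss += sum((di - ti) ** 2 for di, ti in zip(d, theta))
        total_estimate += unbiased_risk_estimate(x)
    return total_loss / n_rep, total_estimate / n_rep

if __name__ == "__main__":
    theta = [1.0, -1.0, 0.5, -0.5, 0.0, 0.0, 1.0, -1.0, 0.5, -0.5]
    avg_loss, avg_estimate = simulate(theta)
    print(avg_loss, avg_estimate, len(theta))
```

Both averages should agree up to Monte Carlo error and fall below p, the constant risk of the observation itself, which illustrates the domination result in this particular setting.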