**A**n interesting question on X validated reminded me of the epiphany I had some twenty years ago when reading an Annals of Statistics paper by Anirban Das Gupta and Bill Strawderman on shrinkage estimators, namely that some estimators share the same risk function, meaning their integrated loss is the same for all values of the parameter. As indicated in this question, Stefan’s instructor seems to believe that two estimators having the same risk function must be a.s. identical. This is not true, as exemplified by the James-Stein (1961) estimator with scale 2(p-2), which has constant risk p, just like the maximum likelihood estimator. I presume the confusion stemmed from the concept of *completeness*, whereby a function with constant expectation under all values of the parameter must be constant. But the concept does not apply to loss functions, since the loss depends both on the observation (which is complete in a Normal model) and on the parameter.
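The claim is easy to check by simulation: a quick Monte Carlo sketch (in numpy, with function names of my own choosing) estimates the quadratic risk of both the MLE and the James-Stein-type estimator with scale 2(p-2) at several values of the parameter norm.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 5, 200_000
c = 2 * (p - 2)  # scale for which the James-Stein-type risk is constant

def risk(estimator, theta):
    # Monte Carlo estimate of E||delta(X) - theta||^2 for X ~ N(theta, I_p)
    x = theta + rng.standard_normal((n, p))
    return np.mean(np.sum((estimator(x) - theta) ** 2, axis=1))

mle = lambda x: x
js = lambda x: (1 - c / np.sum(x ** 2, axis=1, keepdims=True)) * x

for r in (0.0, 2.0, 10.0):  # norm of the parameter vector
    theta = np.full(p, r / np.sqrt(p))
    print(f"||theta||={r}: MLE {risk(mle, theta):.3f}, JS {risk(js, theta):.3f}")
```

Both columns hover around p = 5 for every value of the norm, illustrating two distinct estimators with the same (constant) risk function.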

## Archive for Annals of Statistics

## same risk, different estimators

Posted in Statistics with tags Annals of Statistics, complete statistics, hierarchical Bayesian modelling, Jim Berger, shrinkage estimation, William Strawderman on November 10, 2017 by xi'an

## Florid’AISTATS

Posted in pictures, R, Statistics, Travel, University life with tags AISTATS 2016, AISTATS 2017, Annals of Statistics, Cadiz, Electronic Journal of Statistics, Florida, machine learning, NIPS 2017, proceedings, refereeing on August 31, 2016 by xi'an

**T**he next AISTATS conference is taking place in Fort Lauderdale, Florida, on April 20-22. (The website keeps the same address from one conference to the next, which means all my links to the AISTATS 2016 conference in Cadiz are no longer valid. And that the above sunset from Florida is named… cadiz.jpg!) The deadline for paper submission is October 13 and there are two novel features:

- **Fast-track for Electronic Journal of Statistics**: Authors of a small number of accepted papers will be invited to submit an extended version for fast-track publication in a special issue of the Electronic Journal of Statistics (EJS) after the AISTATS decisions are out. Details on how to prepare such an extended journal submission will be announced after the AISTATS decisions.
- **Review-sharing with NIPS**: Papers previously submitted to NIPS 2016 are required to declare their previous NIPS paper ID, and may optionally supply a one-page letter of revision (similar to a revision letter to journal editors; anonymized) in the supplemental materials. AISTATS reviewers will have access to the previous anonymous NIPS reviews. Other than this, all submissions will be treated equally.

I find both initiatives worth applauding and replicating in other machine-learning conferences, particularly with regard to the recent debate we had at the Annals of Statistics.

## what to do with refereed conference proceedings?

Posted in Books, Statistics, University life with tags AISTATS 2016, Annals of Statistics, machine learning, NIPS 2015, proceedings, publication, refereeing on August 8, 2016 by xi'an

**I**n recent days, we have had a lively discussion among AEs of the Annals of Statistics as to whether or not to set up a policy regarding publication of documents that have already appeared in a shortened (8 pages) version in a machine-learning conference like NIPS. Or AISTATS. While I obviously cannot disclose details here, the debate is quite interesting and may bring the machine-learning and statistics communities closer if resolved in a certain way. My own and personal opinion on the matter is that what matters most is what’s best for the Annals of Statistics, rather than the authors’ tenure or the different standards of the machine-learning community. If the submitted paper is based on a brilliant and novel idea that can appeal to a sufficiently wide part of the readership, and if the mathematical support of that idea is strong enough, we should publish the paper. Whether or not an eight-page preliminary version has been previously published in conference proceedings like NIPS does not seem particularly relevant to me, as I find those short papers mostly unreadable and hence do not read them. Since the Annals of Statistics runs an anti-plagiarism software that is most likely efficient, blatant cases of duplication could be avoided. Of course, this does not solve all issues, and papers with similar contents can and will end up being published. But this is also the case across statistics journals, in the sense that brilliant ideas sometimes end up being split between two or three major journals.

## making a random walk geometrically ergodic

Posted in R, Statistics with tags Annals of Statistics, CRAN, geometric ergodicity, métro, MCMC, Metropolis-Hastings, R, R package, random walk, uniform ergodicity on March 2, 2013 by xi'an

**W**hile a random walk Metropolis-Hastings algorithm cannot be uniformly ergodic in a general setting (Mengersen and Tweedie, *AoS*, 1996), because it needs more and more energy to leave starting points that are further away, it can be geometrically ergodic depending on the target (and the proposal). In a recent *Annals of Statistics* paper, Leif Johnson and Charlie Geyer designed a trick to turn a random walk Metropolis-Hastings algorithm into a geometrically ergodic one by virtue of an isotropic transform (under the provision that the original target density has a moment generating function). This theoretical result is complemented by an R package called mcmc. (I have not tested it so far, having read the paper in the métro.) The examples included in the paper are however fairly academic, and I wonder how the method performs in practice, on truly complex models, in particular because the change of variables relies on (a) an origin and (b) changing the curvature of space uniformly in all dimensions. Nonetheless, the idea is attractive and reminds me of a project of ours with Randal Douc, started thanks to the ‘Og and still under completion.
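For readers unfamiliar with the sampler under discussion, here is a minimal self-contained sketch of a plain random walk Metropolis-Hastings kernel (in Python, with a toy Gaussian target and function names of my own choosing, not the paper's). The Johnson and Geyer trick amounts to running this very kernel on an isotropically transformed version of the target and mapping the draws back.

```python
import numpy as np

def rwm(log_target, x0, scale, n_iter, rng):
    """Random walk Metropolis-Hastings with Gaussian proposals."""
    x = np.asarray(x0, dtype=float)
    lp = log_target(x)
    chain, accepted = [x.copy()], 0
    for _ in range(n_iter):
        prop = x + scale * rng.standard_normal(x.shape)
        lp_prop = log_target(prop)
        # accept with probability min(1, pi(prop)/pi(x))
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        chain.append(x.copy())
    return np.array(chain), accepted / n_iter

rng = np.random.default_rng(1)
log_std_normal = lambda x: -0.5 * float(x @ x)  # standard bivariate normal
chain, rate = rwm(log_std_normal, np.zeros(2), 1.5, 20_000, rng)
```

On this light-tailed toy target the untransformed chain is already geometrically ergodic; the transform matters for targets with heavier tails, where the acceptance ratio flattens out far from the origin.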

## lemma 7.3

Posted in Statistics with tags Annals of Statistics, book reviews, CHANCE, ergodicity, George Casella, Harris recurrence, irreducibility, Luke Tierney, Markov chains, MCMC, Monte Carlo Statistical Methods, Xiao-Li Meng on November 14, 2012 by xi'an

**A**s Xiao-Li Meng accepted to review—and I am quite grateful he managed to fit this review in an already overflowing deanesque schedule!—our 2004 book *Monte Carlo Statistical Methods* as part of a special book-review issue of CHANCE honouring the memory of George through his books—thanks to Sam Behseta for suggesting this!—he sent me the following email about one of our proofs—demonstrating how much effort he had put into this review!—:

I however have a question about the proof of Lemma 7.3 on page 273. After the expression of E[h(x^{(1)})|x_0], the proof stated "and substitute Eh(x) for h(x_1)". I cannot think of any justification for this substitution, given the whole purpose is to show h(x) is a constant.

**I** put it on hold for a while and only looked at it in the (long) flight to Chicago. Lemma 7.3 in *Monte Carlo Statistical Methods* is the result that the Metropolis-Hastings algorithm is Harris recurrent (and not only recurrent). The proof is based on the characterisation of Harris recurrence as having only constants for harmonic functions, i.e. those satisfying the identity
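The identity in question is the usual harmonic-function property for the Metropolis-Hastings transition kernel P, namely (my reconstruction of the missing display):

```latex
h(x) \;=\; \mathbb{E}\bigl[\,h(X_1)\,\big|\,X_0 = x\,\bigr] \;=\; \int h(y)\,P(x,\mathrm{d}y)\,.
```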

The chain being recurrent, the above implies that harmonic functions are almost everywhere constant, and the proof steps from almost everywhere to everywhere. The substitution above—and I also stumbled upon that very subtlety when re-reading the proof in my plane seat!—is valid because it occurs within an integral: despite sounding like using the result to prove the result, the argument holds! Needless to say, we did not invent this (elegant) proof but took it from one of the early works on the theory of Metropolis-Hastings algorithms, presumably Luke Tierney’s foundational Annals paper, which we should have quoted…

**A**s pointed out by Xiao-Li, the proof is also confusing for the use of two notations for the expectation (one of which is indexed by *f* and the other corresponding to the Markov transition) and for the change in the meaning of *f*, now the stationary density, when compared with Theorem 6.80.

## improper priors, incorporated

Posted in Books, Statistics, University life with tags Annals of Statistics, Bayes factor, Bayes theorem, countable measure, empirical Bayes methods, improper prior, marginalisation paradoxes, Poisson point process, random set on January 11, 2012 by xi'an

“If a statistical procedure is to be judged by a criterion such as a conventional loss function (…) we should not expect optimal results from a probabilistic theory that demands multiple observations and multiple parameters.” P. McCullagh & H. Han

**P**eter McCullagh and Han Han have just published in the Annals of Statistics a paper on *Bayes’ theorem for improper mixtures*. This is a fascinating piece of work, even though some parts do elude me… The authors indeed propose a framework based on Kingman’s Poisson point processes that allows one to include (countable) improper priors in a coherent probabilistic framework. This framework requires the definition of a test set A in the sampling space, the observations being then the events Y∩A, where Y is an infinite random set when the prior is infinite. It is therefore complicated to perceive this representation in a genuine Bayesian framework, i.e. for a single observation corresponding to a single parameter value. In that sense it seems closer to the original empirical Bayes, *à la* Robbins.

“An improper mixture is designed for a generic class of problems, not necessarily related to one another scientifically, but all having the same mathematical structure.” P. McCullagh & H. Han

**T**he paper thus misses, in my opinion, a clear link with the design of improper priors. And it does not offer a resolution of the improper-prior Bayes factor conundrum. However, it provides a perfectly valid environment for working with improper priors. For instance, the final section on the marginalisation “paradoxes” is illuminating in this respect, as it does not demand using a limit of proper priors.