efficiency and the Fréchet-Darmois-Cramèr-Rao bound

Posted in Books, Kids, Statistics with tags , , , , , , , , , , , on February 4, 2019 by xi'an

Following some entries on X validated, and after grading a mathematical statistics exam involving Cramèr-Rao, or Fréchet-Darmois-Cramèr-Rao to include both French contributors pictured above, I wonder as usual at the relevance of a concept of efficiency outside [and even inside] the restricted case of unbiased estimators. The general (frequentist) version is that the variance of an estimator δ of [any transform of] θ with bias b(θ) is

I(θ)⁻¹ (1+b'(θ))²

while a Bayesian version is the van Trees inequality on the integrated squared error loss

(E(I(θ))+I(π))⁻¹

where I(θ) and I(π) are the Fisher information and the prior entropy, respectively. But this opens a whole can of worms, in my opinion since

• establishing that a given estimator is efficient requires computing both the bias and the variance of that estimator, not an easy task when considering a Bayes estimator or even the James-Stein estimator. I actually do not know if any of the estimators dominating the standard Normal mean estimator has been shown to be efficient (although there exist results for closed form expressions of the James-Stein estimator quadratic risk, including one of mine the Canadian Journal of Statistics published verbatim in 1988). Or is there a result that a Bayes estimator associated with the quadratic loss is by default efficient in either the first or second sense?
• while the initial Fréchet-Darmois-Cramèr-Rao bound is restricted to unbiased estimators (i.e., b(θ)≡0) and unable to produce efficient estimators in all settings but for the natural parameter in the setting of exponential families, moving to the general case means there exists one efficiency notion for every bias function b(θ), which makes the notion quite weak, while not necessarily producing efficient estimators anyway, the major impediment to taking this notion seriously;
• moving from the variance to the squared error loss is not more “natural” than using any [other] convex combination of variance and squared bias, creating a whole new class of optimalities (a grocery of cans of worms!);
• I never got into the van Trees inequality so cannot say much, except that the comparison between various priors is delicate since the integrated risks are against different parameter measures.

graphe, graphons, graphez !

Posted in Books, pictures, Statistics, University life with tags , , , , , , on December 3, 2018 by xi'an

Larry Brown (1940-2018)

Posted in Books, pictures, Statistics, University life with tags , , , , , , on February 21, 2018 by xi'an

Just learned a few minutes ago that my friend Larry Brown has passed away today, after fiercely fighting cancer till the end. My thoughts of shared loss and deep support first go to my friend Linda, his wife, and to their children. And to all their colleagues and friends at Wharton. I have know Larry for all of my career, from working on his papers during my PhD to being a temporary tenant in his Cornell University office in White Hall while he was mostly away in sabbatical during the academic year 1988-1989, and then periodically meeting with him in Cornell and then Wharton along the years. He and Linday were always unbelievably welcoming and I fondly remember many times at their place or in superb restaurants in Phillie and elsewhere.  And of course remembering just as fondly the many chats we had along these years about decision theory, admissibility, James-Stein estimation, and all aspects of mathematical statistics he loved and managed at an ethereal level of abstraction. His book on exponential families remains to this day one of the central books in my library, to which I kept referring on a regular basis… For certain, I will miss the friend and the scholar along the coming years, but keep returning to this book and have shared memories coming back to me as I will browse through its yellowed pages and typewriter style. Farewell, Larry, and thanks for everything!

exams

Posted in Kids, Statistics, University life with tags , , , , , , , on February 7, 2018 by xi'an
As in every term, here comes the painful week of grading hundreds of exams! My mathematical statistics exam was highly traditional and did not even involve Bayesian material, as the few students who attended the lectures were so eager to discuss sufficiency and ancilarity, that I decided to spend an extra lecture on these notions rather than rushing though conjugate priors. Highly traditional indeed with an inverse Gaussian model and a few basic consequences of Basu’s theorem. actually exposed during this lecture. Plus mostly standard multiple choices about maximum likelihood estimation and R programming… Among the major trends this year, I spotted out the widespread use of strange derivatives of negative powers, the simultaneous derivation of two incompatible convergent estimates, the common mixup between the inverse of a sum and the sum of the inverses, the inability to produce the MLE of a constant transform of the parameter, the choice of estimators depending on the parameter, and a lack of concern for Fisher informations equal to zero.

best unbiased estimators

Posted in Books, Kids, pictures, Statistics, University life with tags , , , , , , , , , , , , on January 18, 2018 by xi'an

A question that came out on X validated today kept me busy for most of the day! It relates to an earlier question on the best unbiased nature of a maximum likelihood estimator, to which I pointed out the simple case of the Normal variance when the estimate is not unbiased (but improves the mean square error). Here, the question is whether or not the maximum likelihood estimator of a location parameter, when corrected from its bias, is the best unbiased estimator (in the sense of the minimal variance). The question is quite interesting in that it links to the mathematical statistics of the 1950’s, of Charles Stein, Erich Lehmann, Henry Scheffé, and Debabrata Basu. For instance, if there exists a complete sufficient statistic for the problem, then there exists a best unbiased estimator of the location parameter, by virtue of the Lehmann-Scheffé theorem (it is also a consequence of Basu’s theorem). And the existence is pretty limited in that outside the two exponential families with location parameter, there is no other distribution meeting this condition, I believe. However, even if there is no complete sufficient statistic, there may still exist best unbiased estimators, as shown by Bondesson. But Lehmann and Scheffé in their magisterial 1950 Sankhya paper exhibit a counter-example, namely the U(θ-1,θ-1) distribution:

since no non-constant function of θ allows for a best unbiased estimator.

Looking in particular at the location parameter of a Cauchy distribution, I realised that the Pitman best equivariant estimator is unbiased as well [for all location problems] and hence dominates the (equivariant) maximum likelihood estimator which is unbiased in this symmetric case. However, as detailed in a nice paper of Gabriela Freue on this problem, I further discovered that there is no uniformly minimal variance estimator and no uniformly minimal variance unbiased estimator! (And that the Pitman estimator enjoys a closed form expression, as opposed to the maximum likelihood estimator.) This sounds a bit paradoxical but simply means that there exists different unbiased estimators which variance functions are not ordered and hence not comparable. Between them and with the variance of the Pitman estimator.