## n-1, n, n+1, who [should] care?!

Terry Speed wrote a column in the latest IMS Bulletin *(the one I received a week ago)* about the choice of the denominator in the variance estimator: should s² involve n (the number of observations), n-1 (the degrees of freedom), n+1, or anything else in its denominator? I find the question more interesting than the answer (sorry, Terry!), as it demonstrates quite forcibly that there is no single possible choice for this estimator of the variance; instead, the “optimal” estimator is determined by the choice of the optimality criterion. This makes for a wonderful (if rather formal) playground for a class on decision-theoretic statistics, and I often use it with my students.

Non-Bayesian mathematical statistics courses often give the impression that there is a natural (single) estimator, when that estimator in fact rests on an implicit choice of an optimality criterion. (This issue is illustrated in the books of Chang and of Vasishth and Broe I discussed earlier, as well as by the Stein effect, of course.) I thus deem it worthwhile to impress upon all users of statistics that there is no such single optimal choice, that unbiasedness is not a compulsory property (just as well, since most parameters cannot be estimated in an unbiased manner!), and that there is room for a subjective choice of a “best” estimator, as paradoxical as it may sound to non-statisticians.
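A quick simulation (my own illustration, not from Terry's column) makes the point concrete: for normal data, dividing the sum of squared deviations by n-1 yields the unbiased estimator, dividing by n yields the maximum likelihood estimator, and dividing by n+1 minimises the mean squared error. Each divisor is “optimal” under its own criterion.

```python
import numpy as np

# Compare the divisors n-1, n, n+1 in the variance estimator
#   s^2 = (1/divisor) * sum_i (x_i - xbar)^2
# for i.i.d. normal data: n-1 is unbiased, n is the MLE,
# and n+1 minimises the mean squared error under normality.
rng = np.random.default_rng(0)
n, sigma2, reps = 10, 4.0, 200_000
x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)  # sums of squares

for d in (n - 1, n, n + 1):
    est = ss / d
    bias = est.mean() - sigma2
    mse = ((est - sigma2) ** 2).mean()
    print(f"divisor {d:2d}: bias {bias:+.3f}  MSE {mse:.3f}")
```

Running this shows the bias essentially vanishing for n-1 while the MSE decreases from n-1 to n to n+1, so the “best” column of the table depends entirely on which criterion one reads.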

February 6, 2013 at 4:53 pm

Bonjour Christian,

the use of the terminology population variance (divided by n) versus sample variance (divided by n-1), adopted in many textbooks and implicitly assumed in many packages or calculators, is illogical (I view the variance here as a function of n numbers that does not change with the statistical context); it indeed leads to a considerable amount of confusion and to many problems in the classroom, and it is something I have always avoided personally. Of course, dividing by n-p-1 with p regressors in the basic linear model is also the unbiased choice.
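The n-p-1 remark can be checked the same way (again my own sketch, not from the comment): in a Gaussian linear model with p regressors plus an intercept, the residual sum of squares divided by n-p-1 is unbiased for the error variance.

```python
import numpy as np

# In a Gaussian linear model with an intercept and p regressors,
# RSS / (n - p - 1) is an unbiased estimator of the error variance.
rng = np.random.default_rng(1)
n, p, sigma2, reps = 30, 3, 2.0, 20_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # fixed design
beta = np.arange(p + 1, dtype=float)                        # arbitrary "truth"
pinv = np.linalg.pinv(X)                                    # least-squares map
est = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), n)
    resid = y - X @ (pinv @ y)                              # residual vector
    est[r] = (resid ** 2).sum() / (n - p - 1)
print(est.mean())  # should sit close to sigma2 = 2.0
```

With a single observation set the same RSS divided by n or by n-1 would be biased downward, since p+1 degrees of freedom are absorbed by the fit.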

February 5, 2013 at 5:33 am

I completely agree that there are multiple criteria for an optimal estimator, but I wonder if there is any statistician who believes there is only one optimal choice. Not to mention Bayesians, even a frequentist would consider at least two criteria, the bias and the variance of an estimator, when weighing “optimality”. In fact, even for bias alone there are several criteria (unconditional and conditional bias), especially in sequential testing problems. Chang’s book does not imply that bias or any other criterion should be the only one. However, it does use the fact that the MLE is biased to question the interpretation of the likelihood as the relative plausibility of various values of the parameter, suggesting the need to look further into the meaning of likelihood, especially when the MLE is biased.

February 5, 2013 at 7:52 am

Sorry, Mark, if I repeat myself: the MLE is almost always biased. “Almost always” is understood in the sense that, for almost every parameterisation of the distribution, there is no unbiased estimator of the corresponding parameter and hence the MLE cannot be unbiased. See e.g. Lehmann and Casella (section 8.4, p.144). If this does not seem intuitive enough, think of the standard deviation in the normal model: there is no unbiased estimator of this quantity…
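The standard deviation example is easy to witness numerically (my own illustration): the MLE of σ in the normal model underestimates σ, and even taking the square root of the unbiased variance estimator remains biased, since the square root of an unbiased estimator is not unbiased (Jensen's inequality).

```python
import numpy as np

# Bias of estimators of the normal standard deviation sigma = 1, n = 5:
# the MLE sqrt(ss / n) is biased downward, and sqrt(ss / (n - 1)),
# the root of the *unbiased* variance estimator, is still biased for sigma.
rng = np.random.default_rng(2)
n, sigma, reps = 5, 1.0, 200_000
x = rng.normal(0.0, sigma, size=(reps, n))
ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
print(np.sqrt(ss / n).mean())        # MLE of sigma: noticeably below 1
print(np.sqrt(ss / (n - 1)).mean())  # still below 1, though less so
```

Both averages fall short of σ = 1, the MLE more so, illustrating that moving an unbiased estimator through a nonlinear transform (here, the square root) destroys unbiasedness.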