When I received this book, Handbook of Fitting Statistical Distributions with R, by Z. Karian and E.J. Dudewicz, from/for the Short Book Reviews section of the International Statistical Review, I was obviously impressed by its size (around 1700 pages and 3 kilos…). From briefly glancing at the table of contents, and at the list of standard distributions appearing as subsections of the first chapters, I thought that the authors were covering different estimation/fitting techniques for most of the standard distributions. After taking a closer look at the book, I think the cover is misleading in several aspects: this is not a handbook (a.k.a. a reference book), it does not cover standard statistical distributions, the R input is marginal, and the authors only wrote part of the book, since about half of the chapters are written by other authors…
Following some discussions I had last week at Banff about data cloning, I re-read the 2007 “Data cloning” paper published in Ecology Letters by Lele, Dennis, and Lutscher. Once again, I see a strong similarity with our 2002 Statistics and Computing SAME algorithm, as well as with the subsequent (and equally similar) “A multiple-imputation Metropolis version of the EM algorithm” published in Biometrika by Gaetan and Yao in 2003, a journal to which Arnaud and I had earlier and unsuccessfully submitted this unpublished technical report on the convergence of the SAME algorithm… (The SAME algorithm is also described in detail in Chapter 13 of the 2005 book Inference in Hidden Markov Models.)
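To illustrate the idea that data cloning and SAME share (a toy sketch of my own, not the algorithm from either paper): raising the likelihood to a power K, i.e. pretending the data were observed K times, and running MCMC on the resulting pseudo-posterior makes it concentrate on the maximum likelihood estimator as K grows. A minimal Metropolis version for a normal-mean model, with all names and tuning choices hypothetical:

```python
import math
import random

random.seed(42)

# Simulated data from N(2, 1); the MLE of theta is the sample mean.
data = [random.gauss(2.0, 1.0) for _ in range(50)]
mle = sum(data) / len(data)

def log_lik(theta):
    # Normal log-likelihood with unit variance, up to a constant.
    return -0.5 * sum((y - theta) ** 2 for y in data)

def data_cloning_mean(K, n_iter=20000, scale=0.5):
    """Random-walk Metropolis targeting the K-cloned pseudo-posterior
    (likelihood^K with a flat prior); returns the chain average."""
    theta, cur = 0.0, K * log_lik(0.0)
    total = 0.0
    for _ in range(n_iter):
        # Shrink the proposal scale with K, as the target gets peakier.
        prop = theta + random.gauss(0.0, scale / math.sqrt(K))
        cand = K * log_lik(prop)
        if math.log(random.random()) < cand - cur:
            theta, cur = prop, cand
        total += theta
    return total / n_iter

# As K increases, the cloned-posterior mean concentrates at the MLE.
for K in (1, 10, 100):
    print(K, data_cloning_mean(K))
```

The variance of the cloned pseudo-posterior also shrinks at rate 1/K, which is how data cloning recovers (approximate) standard errors; the sketch above only tracks the mean.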
Chapter XVII of Keynes’ A Treatise On Probability contains Keynes’ most noteworthy contribution to Statistics, namely the classification of probability distributions such that the arithmetic/geometric/harmonic empirical mean, or the empirical median, is also the maximum likelihood estimator. This problem was first stated by Laplace and Gauss (leading to the Laplace distribution in connection with the median and to the Gaussian distribution for the arithmetic mean). The derivation of the densities of those probability distributions is based on the constraint that the likelihood equation

$$\sum_{i=1}^n \frac{\partial}{\partial\theta} \log f(y_i|\theta) = 0$$

is satisfied for one of the four empirical estimates, using differential calculus (despite the fact that Keynes earlier derived Bayes’ theorem by assuming the parameter space to be discrete). Under regularity assumptions, in the case of the arithmetic mean, my colleague Eric Séré showed me this indeed leads to the family of distributions

$$f(y|\theta) = \exp\{\varphi(\theta)\,y - \psi(\theta) + c(y)\}$$

where $\varphi$ and $\psi$ are almost arbitrary functions under the constraints that $\varphi$ is twice differentiable and $f(\cdot|\theta)$ is a density in $y$. This means that $\psi$ satisfies

$$\psi'(\theta) = \theta\,\varphi'(\theta),$$

a constraint missed by Keynes.
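Since the displayed family is a unidimensional exponential family with mean $\theta$, the claim is easy to check numerically on a particular member; a small sketch using the Poisson distribution (my choice of example, not Keynes’):

```python
import math

# Toy data; the claim is that the MLE of theta equals the arithmetic mean.
data = [3, 1, 4, 1, 5, 9, 2, 6]
ybar = sum(data) / len(data)

# The Poisson family fits the form exp{phi(theta) y - psi(theta) + c(y)}
# with phi(theta) = log theta, psi(theta) = theta, c(y) = -log y!.
# The side constraint psi'(theta) = theta * phi'(theta) indeed holds:
# 1 = theta * (1/theta).

def log_lik(theta):
    return sum(y * math.log(theta) - theta - math.lgamma(y + 1) for y in data)

# Crude grid search for the maximiser of the likelihood.
grid = [0.01 * k for k in range(1, 2000)]
mle = max(grid, key=log_lik)

print(ybar, mle)  # the two agree up to the grid resolution
```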
While I cannot judge the level of novelty in Keynes’ derivation with respect to earlier works, this derivation therefore produces a generic form of unidimensional exponential family, twenty-five years before their rederivation by Darmois (1935), Pitman (1936) and Koopman (1936) as characterising distributions with sufficient statistics of constant dimension. The derivation of the distributions for which the geometric or the harmonic mean is the MLE then follows by a change of variables, $z=\log y$ or $z=1/y$, respectively. In those different derivations, the normalisation issue is treated quite off-handedly by Keynes, witness the function at the bottom of page 198, which is not integrable in $y$ unless its support is bounded away from 0 or from infinity. Similarly, the derivation of the log-normal density on page 199 is missing the Jacobian factor $1/y$, and the same problem arises for the inverse-normal density on page 200, which is missing the corresponding Jacobian factor $1/y^2$.
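To see the change of variables at work, one can check numerically that, once the Jacobian factor $1/y$ is included, the geometric mean is indeed the MLE of $e^\mu$ for log-normal data (a toy check of my own, with $\sigma$ fixed to 1):

```python
import math

# Log-normal sample (y = e^z with z normal); under the change of variables
# z = log y, the arithmetic-mean case becomes the geometric-mean case.
data = [1.2, 0.7, 3.5, 2.1, 0.9, 1.6]

# Geometric mean of the sample.
geo = math.exp(sum(math.log(y) for y in data) / len(data))

# Log-normal log-likelihood in mu (sigma = 1), up to a constant,
# INCLUDING the Jacobian term -log y whose absence the post points out.
def log_lik(mu):
    return sum(-0.5 * (math.log(y) - mu) ** 2 - math.log(y) for y in data)

grid = [0.001 * k - 2.0 for k in range(4001)]  # mu in [-2, 2]
mu_hat = max(grid, key=log_lik)

print(geo, math.exp(mu_hat))  # exp(mu_hat) matches the geometric mean
```

Note that dropping the Jacobian term would not change the maximiser in $\mu$ here, since $-\log y$ is free of $\mu$; the factor matters for the density being a density, which is exactly the normalisation issue raised above.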
At last, I find the derivation of the distributions linked with the median rather dubious, since Keynes’ general solution (where the integral ought to be interpreted as a primitive) is such that the recovery of Laplace’s distribution involves setting the supposedly arbitrary function to a quantity that depends on $y$ as well (page 201). The summary two pages later actually produces an alternative generic form, with the difficulties that the distribution then only vaguely depends on $\theta$, being a step function in $\theta$ times a function of $y$, and that, unless the arbitrary function is properly calibrated, the normalising constant also depends on $\theta$.
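Whatever one makes of Keynes’ general solution, the target fact itself is easy to verify directly: for the Laplace density $\frac{1}{2}\exp(-|y-\theta|)$, maximising the likelihood amounts to minimising $\sum_i |y_i-\theta|$, which the sample median solves (a toy check in my own notation):

```python
# L1 criterion: minus the Laplace log-likelihood, up to a constant.
data = [0.3, 2.2, -1.5, 0.9, 4.1, 0.4, 1.8]

def neg_log_lik(theta):
    return sum(abs(y - theta) for y in data)

# Crude grid search for the minimiser over theta in [-5, 5].
grid = [0.01 * k - 5.0 for k in range(1001)]
theta_hat = min(grid, key=neg_log_lik)

# With an odd sample size, the minimiser is exactly the middle order statistic.
median = sorted(data)[len(data) // 2]
print(median, theta_hat)
```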
Given that this part is the most technical section of the book, this post shows why I am fairly disappointed at having picked this book for my reading seminar. There is no further section with innovative methodological substance in the remainder of the book, which now appears to me as no better than a graduate dissertation on the probabilistic and statistical literature of the (not that) late 19th century, modulo the (inappropriate) highly critical tone.