Archive for MLE

post-grading weekend

Posted in Kids, pictures, Statistics, University life on January 19, 2015 by xi'an

Now that my grading is over, I can reflect on the unexpected difficulties in the mathematical statistics exam. I knew that the first question in the multiple choice exercise, borrowed from Cross Validation, was going to be quasi-impossible and indeed only one student out of 118 managed to find the right solution. More surprisingly, most students did not manage to establish the (absence of an) MLE when the only observation is that n unobserved exponential Exp(λ) variates were larger than a fixed bound δ. I was also amazed that they did poorly on a N(0,σ²) setup, failing to see that

\mathbb{E}[\mathbb{I}(X_1\le -1)] = \Phi(-1/\sigma)

and to determine an unbiased estimator that can be improved by Rao-Blackwellisation. No student reached the conditioning part. A rather frequent mistake was more understandable, given the limited exposure they had to Bayesian statistics: many confused the parameter λ with the observation x in the prior, writing

\pi(\lambda|x) \propto \lambda \exp\{-\lambda x\} \times x^{a-1} \exp\{-bx\}

instead of

\pi(\lambda|x) \propto \lambda \exp\{-\lambda x\} \times \lambda^{a-1} \exp\{-b\lambda\}

hence could not derive a proper posterior.
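As a minimal sketch of the correct derivation, assuming a single observation x from Exp(λ) and a Gamma(a, b) prior on λ (shape a, rate b): the product in the second display is proportional to λ^a exp{-(b+x)λ}, that is, a Gamma(a+1, b+x) density in λ. The function names and the numerical values below are hypothetical and only serve the illustration.

```python
import math


def gamma_posterior(a, b, x):
    """Posterior (shape, rate) for lambda, given one Exp(lambda) observation x
    and a Gamma(a, b) prior (shape a, rate b). Assumed setup, not the exam's."""
    return a + 1.0, b + x


def unnormalised_posterior(lam, a, b, x):
    """Likelihood times prior, both written as functions of lambda."""
    return lam * math.exp(-lam * x) * lam ** (a - 1.0) * math.exp(-b * lam)


def gamma_pdf(lam, shape, rate):
    """Gamma density with the given shape and rate, evaluated at lam."""
    return rate ** shape * lam ** (shape - 1.0) * math.exp(-rate * lam) / math.gamma(shape)


# Hypothetical values, chosen only for illustration.
a, b, x = 2.0, 1.0, 3.0
shape, rate = gamma_posterior(a, b, x)

# If the posterior is Gamma(a + 1, b + x), the ratio of the unnormalised
# product to that Gamma density must not depend on lambda.
ratios = [
    unnormalised_posterior(lam, a, b, x) / gamma_pdf(lam, shape, rate)
    for lam in (0.1, 0.5, 1.0, 2.0)
]
print(shape, rate, ratios)
```

The constant ratio is the normalising constant Γ(a+1)/(b+x)^(a+1), which only appears when the prior is written in λ rather than in x.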

paradoxes in scientific inference: a reply from the author

Posted in Books, Statistics, University life on December 26, 2012 by xi'an

(I received the following set of comments from Mark Chang after publishing a review of his book on the ‘Og. Here they are, verbatim, except for a few editing and spelling changes. It’s a huge post as Chang reproduces all of my comments as well.)

Professor Christian Robert reviewed my book: “Paradoxes in Scientific Inference”. I found that the majority of his criticisms had no foundation and were based on his truncated way of reading. I gave point-by-point responses below. For clarity, I kept his original comments.

Robert’s Comments: This CRC Press book was sent to me for review in CHANCE: Paradoxes in Scientific Inference is written by Mark Chang, vice-president of AMAG Pharmaceuticals. The topic of scientific paradoxes is one of my primary interests and I have learned a lot by looking at Lindley-Jeffreys and Savage-Dickey paradoxes. However, I did not find a renewed sense of excitement when reading the book. The very first (and maybe the best!) paradox with Paradoxes in Scientific Inference is that it is a book from the future! Indeed, its copyright year is 2013 (!), although I got it a few months ago. (Not mentioning here the cover mimicking Escher’s “paradoxical” pictures with dices. A sculpture due to Shigeo Fukuda and apparently not quoted in the book. As I do not want to get into another dice cover polemic, I will abstain from further comments!)

Thank you, Robert for reading and commenting on part of my book. I had the same question on the copyright year being 2013 when it was actually published in previous year. I believe the same thing had happened to my other books too. The incorrect year causes confusion for future citations. The cover was designed by the publisher. They gave me few options and I picked the one with dices. I was told that the publisher has the copyright for the art work. I am not aware of the original artist.

estimating the measure and hence the constant

Posted in pictures, Running, Statistics, University life on December 6, 2012 by xi'an

[Picture: Dawn in Providence, Nov. 30, 2012]

As mentioned in my post about the final day of the ICERM workshop, Xiao-Li Meng addresses this issue of “estimating the constant” in his talk. It is even his central theme. Here are his (2011) slides as he sent them to me (with permission to post them!):

He therefore points out in slide #5 why the likelihood cannot be expressed in terms of the normalising constant: this constant is not a free parameter. Right! His explanation for the approximation of the unknown constant is then to replace the known but intractable dominating measure (intractable in the sense that one cannot compute the integral) with a discrete (or non-parametric) measure supported by the sample. Because the measure is defined up to a constant, this leads to sample weights being proportional to the inverse density. Of course, this representation of the problem is open to criticism: why focus only on measures supported by the sample? The fact that it is the MLE is used as an argument in Xiao-Li’s talk, but this can alternatively be seen as a drawback: I remember reviewing Dankmar Böhning’s Computer-Assisted Analysis of Mixtures and being horrified when discovering this feature! I am currently more agnostic since this appears as an alternative version of empirical likelihood. There are still questions about the measure estimation principle: for instance, when handling several samples from several distributions, why should they all contribute to a single estimate of μ rather than to a product of measures? (Maybe because their models are all dominated by the same measure μ.) Now, getting back to my earlier remark, and as a possible answer to Larry’s question, there could well be a Bayesian version of the above, avoiding the rough empirical likelihood via Gaussian or Dirichlet process prior modelling.
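The measure-estimation idea can be sketched in a toy case. This is an illustration under stated assumptions, not a reproduction of the slides: two samples are drawn from two unnormalised densities q1 and q2 dominated by the same measure, the constant c1 of q1 is taken as known, and the estimated measure puts on each pooled sample point a weight equal to the inverse of the sample-size weighted mixture of normalised densities. The unknown constant c2 is then the integral of q2 against that discrete measure, solved by fixed-point iteration. The densities, sample sizes, and function names are all hypothetical choices.

```python
import math
import random


def estimate_constant(q1_vals, q2_vals, n1, n2, c1, n_iter=100):
    """Fixed-point estimate of c2, the normalising constant of q2.

    q1_vals and q2_vals hold q1(x) and q2(x) at every point of the pooled
    sample; n1 and n2 are the two sample sizes; c1 is assumed known.
    """
    c2 = 1.0  # arbitrary positive starting value
    for _ in range(n_iter):
        # Weight of the discrete measure at each pooled point:
        # 1 / (n1 * q1(x) / c1 + n2 * q2(x) / c2), an inverse mixture density.
        c2 = sum(
            v2 / (n1 * v1 / c1 + n2 * v2 / c2)
            for v1, v2 in zip(q1_vals, q2_vals)
        )
    return c2


def q1(x):
    """Unnormalised N(0, 1) density; its constant is sqrt(2 * pi)."""
    return math.exp(-0.5 * x * x)


def q2(x):
    """Unnormalised N(0, 4) density; its constant is sqrt(8 * pi)."""
    return math.exp(-0.125 * x * x)


random.seed(0)
n1 = n2 = 5000
pooled = [random.gauss(0.0, 1.0) for _ in range(n1)]
pooled += [random.gauss(0.0, 2.0) for _ in range(n2)]

c2_hat = estimate_constant(
    [q1(x) for x in pooled], [q2(x) for x in pooled],
    n1, n2, c1=math.sqrt(2.0 * math.pi),
)
print(c2_hat, math.sqrt(8.0 * math.pi))
```

With a single sample and no known constant, the same weights only determine the measure up to a multiplicative constant, which is why this sketch needs the reference density q1.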

bounded normal mean

Posted in R, Statistics, University life on November 25, 2011 by xi'an

A few days ago, one of my students, Jacopo Primavera (from La Sapienza, Roma) presented his “reading the classic” paper, namely the terrific bounded normal mean paper by my friends George Casella and Bill Strawderman (1981, Annals of Statistics). Even though I knew this paper quite well, having read (and studied) it myself many times, starting in 1987 at Purdue with Mary Ellen Bock, it was a pleasure to spend another hour on it, as I came up with new perspectives and new questions. Above are my scribbled notes on the back of the [Epson] beamer documentation. One such interesting question is whether or not it is possible to devise a computer code that would [approximately] produce the support of the least favourable prior for a given bound m (in a reasonable time). Another open question is to find the limiting bounds for which a 2-point, a 3-point, etc., support prior is the least favourable prior. This was established in Casella and Strawderman for bounds less than 1.08 and for bounds between 1.4 and 1.6, but I am not aware of other results in that direction… Here are the slides used by Jacopo:
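For the 2-point case, a small sketch may help fix ideas. It assumes a single observation X ~ N(θ, 1) with |θ| ≤ m and a prior putting equal weights on -m and +m; the posterior mean under that prior is then m tanh(mx). The code only evaluates this estimator and its quadratic risk by numerical integration, it does not search for the least favourable support. Function names and grid settings are hypothetical.

```python
import math


def posterior_mean_two_point(x, m):
    """Posterior mean of theta under equal prior weights on -m and +m,
    for one observation x from N(theta, 1)."""
    log_w_plus = -0.5 * (x - m) ** 2
    log_w_minus = -0.5 * (x + m) ** 2
    top = max(log_w_plus, log_w_minus)  # subtract the max to avoid underflow
    w_plus = math.exp(log_w_plus - top)
    w_minus = math.exp(log_w_minus - top)
    return m * (w_plus - w_minus) / (w_plus + w_minus)


def risk(theta, m, n_grid=4000, half_width=10.0):
    """Quadratic risk E[(delta(X) - theta)^2] for X ~ N(theta, 1),
    approximated by a midpoint rule on [theta - 10, theta + 10]."""
    step = 2.0 * half_width / n_grid
    total = 0.0
    for i in range(n_grid):
        z = -half_width + (i + 0.5) * step  # z = x - theta
        phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += (posterior_mean_two_point(theta + z, m) - theta) ** 2 * phi * step
    return total


m = 1.0  # hypothetical bound, chosen only for illustration
risk_profile = [(k / 10.0 * m, risk(k / 10.0 * m, m)) for k in range(-10, 11)]
for theta, r in risk_profile:
    print(round(theta, 2), r)
```

Plotting the risk profile for increasing values of m is one crude way of checking where the maximum of the risk sits, which is the property a least favourable prior must satisfy on its support.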

MAP, MLE and loss

Posted in Statistics on April 25, 2011 by xi'an

Michael Evans and Gun Ho Jang posted an arXiv paper where they discuss the connection between MAP, least relative surprise (or maximum profile likelihood) estimators, and loss functions. I posted a while ago my perspective on MAP estimators, followed by several comments on the Bayesian nature of those estimators, hence will not reproduce them here, but the core of the matter is that neither MAP estimators nor MLEs are really justified by a decision-theoretic approach, at least in a continuous parameter space. And that the dominating measure [arbitrarily] chosen on the parameter space impacts the value of the MAP, as demonstrated by Druilhet and Marin in 2007.
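A toy illustration of this dependence, which is not taken from either paper: assume a Beta(a, b) posterior on a probability p. Maximising the density with respect to the Lebesgue measure on p gives (a-1)/(a+b-2), while maximising the density of η = logit(p) with respect to the Lebesgue measure on η, which amounts to another dominating measure on p, gives a/(a+b) once mapped back to p. The values of a and b and the grid search are hypothetical choices.

```python
import math


def log_density_p(p, a, b):
    """Log Beta(a, b) density in p, up to a constant (Lebesgue measure on p)."""
    return (a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p)


def log_density_eta(p, a, b):
    """Log density of eta = logit(p), written as a function of p.
    The Jacobian dp/d(eta) = p * (1 - p) adds one to each exponent."""
    return a * math.log(p) + b * math.log(1.0 - p)


def grid_argmax(log_density, a, b, n=100000):
    """Maximiser of log_density over the grid i / n, i = 1, ..., n - 1."""
    grid = (i / n for i in range(1, n))
    return max(grid, key=lambda p: log_density(p, a, b))


a, b = 3.0, 5.0  # hypothetical posterior parameters
map_p = grid_argmax(log_density_p, a, b)
map_eta_as_p = grid_argmax(log_density_eta, a, b)
print(map_p, (a - 1.0) / (a + b - 2.0))
print(map_eta_as_p, a / (a + b))
```

The two maximisers differ although they describe the same posterior distribution, so the MAP is a property of the density and of its dominating measure rather than of the distribution alone.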



Posted in Kids, pictures, University life on February 19, 2011 by xi'an

I was visiting Jean-Michel Marin over the past two days in order to finalise our paper on ABC model choice and I noticed this very special exam on his wall. It was a copy made by his son, who is currently learning his letters, of a true exam Jean-Michel was grading a few weeks ago. Even though the picture is over-zoomed, it is possible to identify the (correct!) resolution of the MLE of the upper bound of a uniform distribution. A very cute rendering that also qualifies as Art brut! (In the spirit of Pierre Ménard, Borges’ short story about re-creation)

