MAP, MLE and loss

Michael Evans and Gun Ho Jang posted an arXiv paper where they discuss the connection between MAP estimators, least relative surprise (or maximum profile likelihood) estimators, and loss functions. I posted my perspective on MAP estimators a while ago, followed by several comments on the Bayesian nature of those estimators, so I will not reproduce the arguments here. The core of the matter is that neither MAP estimators nor MLEs are really justified by a decision-theoretic approach, at least in a continuous parameter space. Moreover, the dominating measure [arbitrarily] chosen on the parameter space impacts the value of the MAP estimator, as demonstrated by Druilhet and Marin (2007).
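As a simple illustration of this dependence (my example, not one from the paper), take a posterior on θ that is Exp(1), i.e. with density e^{-θ} on (0,∞): the MAP estimate of θ is then 0. Reparametrising into η = log θ turns the posterior density into

\pi(\eta|x)=\exp\{\eta-\mathrm{e}^{\eta}\}

which is maximised at η = 0, i.e. at θ = 1: the MAP estimator thus changes with the parametrisation, hence with the dominating measure.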

Evans and Jang start in the finite case with the loss function

\mathrm{L}(\theta,d) = \mathbb{I}\{\Psi(\theta) \ne d\} / \pi_\Psi(\Psi(\theta))

where they consider the estimation of the transform Ψ(θ) by d, inversely weighted by the marginal prior on this transform. In the special case of the identity transform, this loss function leads to the MLE as the Bayes estimator (a quick check is sketched below). In the general case, the Bayes estimator is the maximum profile likelihood estimator, also known as the least relative surprise estimator (LRSE). However, this loss function does not generalise to countably infinite (and, a fortiori, continuous) parameter spaces, and in such settings the authors can only provide LRSEs as limits of Bayes procedures. The loss function adopted in the countable case is, for instance,

\mathrm{L}(\theta,d) = \mathbb{I}\{\Psi(\theta) \ne d\} / \max\{\eta,\pi_\Psi(\Psi(\theta))\}

with the bound η decreasing to zero. In the continuous case, the indicator no longer works, so the choice made by the authors is to discretise the space Ψ(Θ) through a specific partition into balls whose diameters λ go to zero. In the spirit of Druilhet and Marin (2007), this choice depends on a metric (and on further regularity conditions).
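As a quick sanity check of the finite case (using my own shorthand rather than the authors' notation), the posterior expected loss of an action d under the first loss function above is

\sum_{\theta} \mathbb{I}\{\Psi(\theta)\ne d\}\,\pi(\theta|x)/\pi_\Psi(\Psi(\theta)) = c(x) - \pi_\Psi(d|x)/\pi_\Psi(d)

where c(x)=\sum_{\theta} \pi(\theta|x)/\pi_\Psi(\Psi(\theta)) does not depend on d. Minimising the posterior loss thus amounts to maximising the ratio π_Ψ(d|x)/π_Ψ(d), which gives the LRSE; when Ψ is the identity, Bayes' theorem turns this ratio into f(x|d)/m(x), whose maximiser is the MLE.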

Furthermore, the LRSE itself,

\arg\max_{\psi}\,\pi_\Psi(\psi|x)/\pi_\Psi(\psi)\,,

does depend on the version chosen for the densities (if not on the dominating measure), unless one imposes the Bayes equality

\pi_{\Psi}(\psi|x)/\pi_\Psi(\psi)=f(x|\psi)/m(x)

everywhere, where

f(x|\psi)=\int_{\{\theta;\,\Psi(\theta)=\psi\}}f(x|\theta)\,\pi(\theta|\psi)\,\mathrm{d}\theta

and

m(x)=\int f(x|\theta)\pi(\theta)\mathrm{d}\theta
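With these versions, the equality is a mere consequence of Bayes' theorem applied to ψ,

\pi_\Psi(\psi|x)=f(x|\psi)\,\pi_\Psi(\psi)/m(x)

divided on both sides by π_Ψ(ψ); the catch is that the relation must then hold for every ψ and x, not simply outside a null set.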

Imposing such common versions is in the spirit of our Savage-Dickey paradox paper. In conclusion, while I do appreciate the effort to embed both MAP and [profile] likelihood estimators within a Bayesian decision-theoretic framework, the paper does not really validate them, in the sense that limits of Bayes estimators are not necessarily optimal.
