MAP estimators (cont’d)

In connection with Anthony’s comments, here are the details for the normal example. I am using a flat prior on \mu when x\sim\mathcal{N}(\mu,1). The MAP estimator of \mu is then \hat\mu=x. If I consider the change of variable \mu=\text{logit}(\eta), the posterior distribution on \eta is

\pi(\eta|x) = \dfrac{\exp[ -(\text{logit}(\eta)-x)^2/2 ]}{\sqrt{2\pi}\,\eta (1-\eta)}

and the MAP in \eta is then obtained numerically. For instance, the R code

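# posterior density of eta when mu=logit(eta), x~N(mu,1), flat prior on mu
f=function(eta,mea) dnorm(qlogis(eta),mean=mea)/(eta*(1-eta))
# MAP of eta, mapped back to the mu scale by the logit transform
g=function(x){ a=optimise(f,interval=c(0,1),maximum=TRUE,mea=x)$maximum; qlogis(a) }
xs=seq(0,4,.01)
plot(xs,sapply(xs,g),type="l",col="sienna",lwd=2,xlab="x",ylab="estimate")
abline(a=0,b=1,col="tomato2",lwd=2) # MAP of mu itself, equal to x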

shows the divergence between the MAP estimator \hat\mu=x and the reverse transform \text{logit}(\hat\eta) of the MAP estimator \hat\eta of the transformed parameter. The second estimator is asymptotically (as x grows) equivalent to x+1.
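The x+1 behaviour can also be checked by hand. Here is a short derivation, writing \mu=\text{logit}(\eta) and taking the log of the posterior density of \eta as a function of \mu (the notation \ell is mine, introduced only for this sketch):

```latex
\begin{align*}
\eta(1-\eta) &= \frac{e^{\mu}}{(1+e^{\mu})^2},\\
\ell(\mu) &= \log\pi(\eta|x) = -\frac{(\mu-x)^2}{2} - \mu + 2\log(1+e^{\mu}) + \text{const},\\
\ell'(\mu) &= -(\mu-x) - 1 + \frac{2e^{\mu}}{1+e^{\mu}} = 0
\quad\Longleftrightarrow\quad \mu = x - 1 + 2\eta .
\end{align*}
```

Since \ell''(\mu) = -1 + 2\eta(1-\eta) \le -1/2, the solution is unique. The equation forces \mu \ge x-1, so \eta\to 1 as x\to\infty and \text{logit}(\hat\eta)-x\to 1. At x=0 the two estimators coincide, with \hat\eta=1/2.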

An example I like very much in The Bayesian Choice is Example 4.1.2, when observing x\sim\text{Cauchy}(\theta,1) with a double exponential prior \pi(\theta)=\exp\{-|\theta|\}/2 on \theta. The MAP is then always \hat\theta=0, whatever the value of x!
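A quick numerical check of this claim, as a rough sketch only: the grid bounds, the step, the test values of x, and the function names are arbitrary choices of mine and do not come from the book.

```r
# Log posterior (up to a constant) for x ~ Cauchy(theta, 1)
# with the double exponential prior exp(-|theta|)/2
logpost <- function(theta, x) dcauchy(x, location = theta, log = TRUE) - abs(theta)

# Grid search for the MAP; the grid contains theta = 0 exactly
# (bounds and step are arbitrary choices for this sketch)
theta_grid <- (-2000:2000) / 100
map_on_grid <- function(x) theta_grid[which.max(logpost(theta_grid, x))]

# A few arbitrary observations, including large ones
xs <- c(-10, -1, 0, 0.5, 1, 3, 10)
maps <- sapply(xs, map_on_grid)
print(cbind(x = xs, MAP = maps))
```

The reason is visible on the log posterior: for \theta>0 its derivative is -1+2(x-\theta)/(1+(x-\theta)^2), which is never positive because 2u/(1+u^2)\le 1, and the symmetric argument applies for \theta<0. The posterior is thus always maximal at the kink of the prior.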

The dependence of the MAP estimator on the dominating measure is also studied in a Bayesian Analysis paper by Pierre Druilhet and Jean-Michel Marin, who propose a solution that relies on Jeffreys’ prior as the reference measure.
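To give a rough idea of what changing the reference measure does in the normal example above, here is a sketch of my own reading of the idea, not a reproduction of the paper's construction. In this model the Fisher information on \mu is 1, so Jeffreys' prior on \mu is flat and its image on \eta has density proportional to 1/(\eta(1-\eta)). All function names below are made up for the illustration.

```r
# Posterior density of eta w.r.t. Lebesgue measure (flat prior on mu = logit(eta))
post_eta <- function(eta, x) dnorm(qlogis(eta), mean = x) / (eta * (1 - eta))
# Jeffreys measure on eta: image of the flat measure on mu (assumption of this sketch)
jeffreys_eta <- function(eta) 1 / (eta * (1 - eta))
# Posterior density relative to the Jeffreys measure
ratio <- function(eta, x) post_eta(eta, x) / jeffreys_eta(eta)

# Maximise a density in eta over (0,1) and map the maximiser back to the mu scale
map_back <- function(h, x)
  qlogis(optimize(h, interval = c(0, 1), maximum = TRUE, x = x, tol = 1e-10)$maximum)

xs <- c(0, 0.5, 1, 2, 3)
lebesgue_map <- sapply(xs, function(x) map_back(post_eta, x))
jeffreys_map <- sapply(xs, function(x) map_back(ratio, x))
print(cbind(x = xs, lebesgue = lebesgue_map, jeffreys = jeffreys_map))
```

With the Jeffreys measure as reference, the Jacobian cancels and the maximiser in \eta maps back to x, that is, to the MAP in \mu. With the Lebesgue measure on \eta it drifts away from x as soon as x\ne 0.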

4 Responses to “MAP estimators (cont’d)”

  1. […] estimators, and loss functions. I posted a while ago my perspective on MAP estimators, followed by several comments on the Bayesian nature of those estimators, hence will not reproduce them here, but the core of the […]

  2. From Jermyn (2005) [http://dx.doi.org/10.1214/009053604000001273] the answer to my question is apparently that this is indeed the unique way to obtain parametrization-invariant MAP estimates.

  3. Druilhet and Marin’s paper is very illuminating. Does this imply that whenever one wants a MAP estimate, one should compute it with respect to a (Bernardo) reference prior dominating measure? I’m pretty sure this is not prevalent in machine learning, a literature in which MAP estimation is used fairly often. I think parametrization invariance is pretty crucial for meaningful inference. Do there exist other (non-trivial and statistically reasonable) candidates for dominating measures that will give identical MAP estimates under reparametrization?

    • Formally, you could define an arbitrary prior/reference measure on an arbitrary parameterisation and then impose invariance by reparameterisation. Of course this would be mostly meaningless…
