MAP estimators (cont’d)

In connection with Anthony’s comments, here are the details for the normal example. I am using a flat prior on $\mu$ when $x\sim\mathcal{N}(\mu,1)$. The MAP estimator of $\mu$ is then $\hat\mu=x$. If I consider the change of variable $\mu=\text{logit}(\eta)$, the posterior distribution on $\eta$ is

$\pi(\eta|x) = \dfrac{\exp[ -(\text{logit}(\eta)-x)^2/2 ]}{\sqrt{2\pi}\, \eta (1-\eta)}$

and the MAP in $\eta$ is then obtained numerically. For instance, the R code

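# posterior density of eta (first argument) given the observation mea
f=function(x,mea) dnorm(log(x/(1-x)),mean=mea)/(x*(1-x))
# MAP of eta for the observation x, mapped back to the mu scale by the logit
g=function(x){ a=optimise(f,interval=c(0,1),maximum=TRUE,mea=x)$maximum;log(a/(1-a))}
xs=seq(0,4,.01)
plot(xs,sapply(xs,g),type="l",col="sienna",lwd=2)
# identity line: the MAP estimator of mu itself
abline(a=0,b=1,col="tomato2",lwd=2)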

shows the divergence between the MAP estimator $\hat\mu$ and the reverse transform $\text{logit}(\hat\eta)$ of the MAP estimator $\hat\eta$ of the transformed parameter. The second estimator is asymptotically (as $x\to\infty$) equivalent to $x+1$.
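
A short derivation, sketched here as a check and not part of the original discussion, suggests where the $x+1$ comes from. Writing $\mu=\text{logit}(\eta)$, so that $\eta=e^\mu/(1+e^\mu)$ and $\text{d}\eta/\text{d}\mu=\eta(1-\eta)$, the log-posterior of $\eta$ is, up to an additive constant,

$\log\pi(\eta|x) = -(\mu-x)^2/2 - \log\eta - \log(1-\eta)$

Differentiating in $\mu$ gives $-(\mu-x) + 2\eta - 1$, and the second derivative $-1+2\eta(1-\eta)$ is always negative, so the maximum is the unique solution of

$\text{logit}(\hat\eta) = x + 2\hat\eta - 1$

When $x$ grows, $\hat\eta$ goes to $1$ and $\text{logit}(\hat\eta)-x$ goes to $1$, while at $x=0$ the solution is $\hat\eta=1/2$ and both estimators coincide.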

An example I like very much in The Bayesian Choice is Example 4.1.2, where one observes $x\sim\text{Cauchy}(\theta,1)$ with a double exponential prior $\pi(\theta)=\exp\{-|\theta|\}/2$ on $\theta$. The MAP is then always $\hat\theta=0$, whatever the value of $x$!
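
Here is a quick sketch of why the maximum cannot move away from zero in this Cauchy example (my own check, not a quotation of the book's argument). Up to an additive constant, the log-posterior is

$\log\pi(\theta|x) = -|\theta| - \log\{1+(x-\theta)^2\}$

and its derivative at any $\theta\ne 0$ is

$-\text{sign}(\theta) + \dfrac{2(x-\theta)}{1+(x-\theta)^2}$

Since $|2u/(1+u^2)|\le 1$ for every real $u$, this derivative is nonnegative for $\theta<0$ and nonpositive for $\theta>0$. The posterior density is therefore nondecreasing on the negative half-line and nonincreasing on the positive half-line, so it is maximal at $\theta=0$ for every $x$.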

The dependence of the MAP estimator on the dominating measure is also studied in a Bayesian Analysis paper by Pierre Druilhet and Jean-Michel Marin, who propose a solution that relies on Jeffreys’ prior as the reference measure.

4 Responses to “MAP estimators (cont’d)”

1. […] estimators, and loss functions. I posted a while ago my perspective on MAP estimators, followed by several comments on the Bayesian nature of those estimators, hence will not reproduce them here, but the core of the […]

2. From Jermyn (2005) [http://dx.doi.org/10.1214/009053604000001273] the answer to my question is apparently that this is indeed the unique way to obtain parametrization-invariant MAP estimates.

3. Druilhet and Marin’s paper is very illuminating. Does this imply that whenever one wants a MAP estimate, they should compute it wrt a (Bernardo) reference prior dominating measure? I’m pretty sure this is not prevalent in machine learning, a literature in which MAP estimation is used fairly often. I think parametrization invariance is pretty crucial for meaningful inference. Do there exist other (non-trivial and statistically reasonable) candidates for dominating measures that will give identical MAP estimates under reparametrization?

• Formally, you could define an arbitrary prior/reference measure on an arbitrary parameterisation and then impose invariance by reparameterisation. Of course this would be mostly meaningless…
