high dimension Metropolis-Hastings algorithms
When discussing high dimension models with Ingmar Schüster Schuster [blame my fascination for accented characters!] the other day, we came across the following paradox with Metropolis-Hastings algorithms. When attempting to simulate from a multivariate standard normal distribution in a large dimension and starting from the mode of the target, i.e., its mean μ, leaving the mode μ is extremely unlikely, given the huge drop between the value of the density at the mode μ and at likely realisations (corresponding to the blue sequence). Even when relying on the very scale that makes the proposal identical to the target! Resorting to a tiny scale like Σ/p manages to escape the unhealthy neighbourhood of the highly unlikely mode (as shown with the brown sequence).
Here is the corresponding R code:
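library(MASS)        #provides mvrnorm
p=100
T=1e3
mu=rep(0,p)          #assumed (not shown in the post): standard normal target
sigma=diag(p)
#assumed helper (not shown in the post): log-density of N(mu,sigma)
logdmvnorm=function(x,mu,sigma)
  -.5*(length(x)*log(2*pi)+as.numeric(determinant(sigma)$modulus)+
       sum((x-mu)*solve(sigma,x-mu)))
mh=mu                #mode as starting value
vale=rep(0,T)
for (t in 1:T){
  prop=mvrnorm(1,mh,sigma/p)   #random walk proposal with the tiny scale
  if (log(runif(1))<logdmvnorm(prop,mu,sigma)-
      logdmvnorm(mh,mu,sigma)) mh=prop
  vale[t]=logdmvnorm(mh,mu,sigma)}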
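As a quick check of the two scales, here is a minimal sketch that simulates only the first move away from the mode, assuming a standard normal target N(0, I_p). The names logtarget, first_move, logr_full and logr_tiny are illustrative and not part of the code above.

```r
# Sketch: log acceptance ratio of the first random walk move away from
# the mode, for an assumed standard normal target N(0, I_p).
set.seed(1)
p <- 100
n <- 1e4                         # number of simulated first proposals

# log target density up to an additive constant
logtarget <- function(x) -0.5 * sum(x^2)

# log Metropolis-Hastings ratio when proposing from the mode (the origin)
# with covariance scale * I_p; the symmetric proposal density cancels
first_move <- function(scale) {
  replicate(n, {
    prop <- rnorm(p, mean = 0, sd = sqrt(scale))
    logtarget(prop) - logtarget(rep(0, p))
  })
}

logr_full <- first_move(1)       # proposal scale Sigma
logr_tiny <- first_move(1 / p)   # proposal scale Sigma / p

mean(logr_full)                  # average log ratio, scale Sigma
mean(logr_tiny)                  # average log ratio, scale Sigma / p
mean(exp(logr_full))             # average acceptance probability, scale Sigma
mean(exp(logr_tiny))             # average acceptance probability, scale Sigma / p
```

Under these assumptions the log ratio is minus half a chi-square variate with p degrees of freedom, multiplied by the scale, so it should average about -p/2 with scale Σ and about -1/2 with scale Σ/p.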
January 26, 2016 at 7:48 pm
Another way to overcome this would be to accept your first proposal with probability 1, but I can’t say how well this would generalize…
January 26, 2016 at 10:13 pm
Yes, it would work out if the proposal was independent rather than random-walk-like.
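A minimal sketch of the independent case, assuming a standard normal target N(0, I_p) and an independent proposal equal to the target (both are assumptions, as are the names logtarget and logprop). The proposal density now enters the Metropolis-Hastings ratio, which it does not in the random walk version.

```r
# Sketch: independent Metropolis-Hastings with the proposal equal to the
# target, started at the mode. All names here are illustrative.
set.seed(2)
p <- 100
logtarget <- function(x) -0.5 * sum(x^2)   # N(0, I_p) up to a constant
logprop   <- logtarget                     # independent proposal = target

x <- rep(0, p)                             # start at the mode
nsim <- 1e3
accepted <- logical(nsim)
for (t in 1:nsim) {
  y <- rnorm(p)                            # proposal ignores the current x
  # log ratio = log importance weight at y minus log importance weight at x
  logr <- (logtarget(y) - logprop(y)) - (logtarget(x) - logprop(x))
  accepted[t] <- log(runif(1)) < logr
  if (accepted[t]) x <- y
}
mean(accepted)                             # proportion of accepted proposals
```

Since the importance weight is constant when the proposal equals the target, the log ratio is zero and every proposal is accepted, including the first one out of the mode.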
January 26, 2016 at 9:47 am
That's a nice way to point out the French pronunciation of my last name :D
Anyway, related to that, Nicolas gave me the intuition for what's happening for Gaussians in high dimension: https://ingmarschuster.wordpress.com/2016/01/21/why-the-map-is-a-bad-starting-point-in-high-dimensions/
January 26, 2016 at 10:12 pm
Yes, this is clearly the reason, as I thought we had discussed last week… We came across the same phenomenon with Jean-Michel a while ago when testing EP, and Simon and Nicolas pointed out the same explanation about the chi² distribution of minus the log-likelihood.
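For the record, the chi² argument can be spelled out as follows. This is a sketch for a normal target with mean μ and covariance Σ in dimension p (mu and sigma in the R code), with π denoting the target density, a notation not used in the post.

```latex
\begin{align*}
\log \pi(x) &= -\tfrac{p}{2}\log(2\pi) - \tfrac12 \log|\Sigma|
               - \tfrac12 (x-\mu)^\top \Sigma^{-1} (x-\mu), \\
\log \pi(\mu) - \log \pi(X) &= \tfrac12 (X-\mu)^\top \Sigma^{-1} (X-\mu)
               \sim \tfrac12 \chi^2_p
               \quad \text{for } X \sim \mathcal{N}_p(\mu, \Sigma), \\
\mathbb{E}\big[\log \pi(\mu) - \log \pi(X)\big] &= \tfrac{p}{2}, \qquad
\operatorname{sd}\big[\log \pi(\mu) - \log \pi(X)\big] = \sqrt{p/2}.
\end{align*}
```

With p=100, a typical realisation thus sits about 50 (give or take 7) below the mode on the log-density scale, and a drop of 50 corresponds to an acceptance probability of exp(-50), about 2e-22, for a symmetric proposal leaving the mode.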