Riemann, Langevin & Hamilton
In preparation for the Read Paper session next month at the RSS, our research group at CREST has collectively read the Girolami and Calderhead paper on Riemann manifold Langevin and Hamiltonian Monte Carlo methods and I hope we will again produce a joint arXiv preprint out of our comments. (The above picture is reproduced from Radford Neal’s talk at JSM 1999 in Baltimore, talk that I remember attending…) Although this only represents my preliminary/early impression on the paper, I have trouble with the Physics connection. Because it involves continuous time events that are not transcribed directly into the simulation process.
Overall, trying to take advantage of second order properties of the target—just like the Langevin improvement takes advantage of the first order—is a natural idea which, when implementable, can obviously speed up convergence. This is the Langevin part, which may use a fixed metric M or a local metric defining a Riemann manifold, G(θ). So far, so good, assuming the derivation of an observed or expected information G(θ) is feasible up to some approximation level. The Hamiltonian part that confuses me introduces a dynamic on level sets of
where p is an auxiliary vector of dimension D. Namely,
While I understand the purpose of the auxiliary vector, namely to speed up the exploration of the posterior surface by taking advantage of the additional energy provided by p, I fail to understand why the fact that the discretised (Euler) approximation to Hamilton’s equations is not available in closed form is such an issue…. The fact that the (deterministic?) leapfrog integrator is not exact should not matter since this can be corrected by a Metropolis-Hastings step.
While the logistic example is mostly a toy problem (where importance sampling works extremely well, as shown in our survey with Jean-Michel Marin), the stochastic volatility is more challenging and the fact that the Hamiltonian scheme applies to the missing data (volatility) as well as to the three parameters of the model is quite interesting. I however wonder at the appeal of this involved scheme when considering that the full conditional of the volatility can be simulated exactly…