Yes, I was surprised as well, although this was actually pointed out in the original paper by Wang et al. I suppose I had to try it myself...

I remember trying HMC with a leapfrog integrator on the Rosenbrock “banana-shaped” distribution, and it worked only after I reduced the step size to about 1/500, even though it’s in two dimensions and not really highly correlated.
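For readers who want to reproduce this kind of experiment, here is a minimal HMC sketch with a leapfrog integrator on a Rosenbrock-style banana target. The 1/20 tempering of the classic Rosenbrock function and the specific step size and trajectory length are my own illustrative choices; the commenter’s exact setup is not given.

```python
import numpy as np

# Tempered Rosenbrock "banana" potential (negative log density, unnormalized).
# The 1/20 scaling is an illustrative choice to soften the target.
def U(q):
    x, y = q
    return ((1.0 - x) ** 2 + 100.0 * (y - x ** 2) ** 2) / 20.0

def grad_U(q):
    x, y = q
    dx = (-2.0 * (1.0 - x) - 400.0 * x * (y - x ** 2)) / 20.0
    dy = 200.0 * (y - x ** 2) / 20.0
    return np.array([dx, dy])

def hmc_step(q, step_size, n_leapfrog, rng):
    """One HMC transition: resample momentum, leapfrog, Metropolis-correct."""
    p = rng.standard_normal(2)
    current_H = U(q) + 0.5 * p @ p
    q_new, p_new = q.copy(), p.copy()
    # Leapfrog: half step for momentum, alternating full steps, final half step.
    p_new = p_new - 0.5 * step_size * grad_U(q_new)
    for i in range(n_leapfrog):
        q_new = q_new + step_size * p_new
        if i < n_leapfrog - 1:
            p_new = p_new - step_size * grad_U(q_new)
    p_new = p_new - 0.5 * step_size * grad_U(q_new)
    proposed_H = U(q_new) + 0.5 * p_new @ p_new
    # Accept with probability min(1, exp(-ΔH)); a large energy error from too
    # coarse a step size shows up directly as rejections here.
    if rng.random() < np.exp(current_H - proposed_H):
        return q_new, True
    return q, False

rng = np.random.default_rng(0)
q = np.array([0.0, 0.0])
accepts = 0
for _ in range(200):
    q, accepted = hmc_step(q, step_size=1.0 / 500.0, n_leapfrog=50, rng=rng)
    accepts += accepted
accept_rate = accepts / 200
```

With a step size around 1/500 the energy error per trajectory is tiny and nearly everything is accepted; cranking `step_size` up by an order of magnitude or two is an easy way to watch the acceptance rate collapse on this target.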

The recent survey by Bou-Rabee and Sanz-Serna (https://arxiv.org/abs/1711.05337) and the book “Geometric Numerical Integration” by Hairer, Lubich, and Wanner are full of relevant results, as well as descriptions of other integrators that may perform better than the standard leapfrog integrator.

Thank you, Radford. I was just surprised at how easily this happens!

Matt Hoffman burned an incredible quantity of compute cycles on our clusters evaluating basic HMC vs. NUTS on some hard problems (a 250-dimensional normal with high correlation, a stochastic volatility time series, and hierarchical logistic regression, if I recall). Even optimally tuned HMC was inferior to NUTS in all of our experiments. We believe it’s not only the tuning of step size and integration time; performance is also hugely influenced by NUTS’s biasing the selection of the point on the Hamiltonian trajectory toward the last doubling, and hence likely farther from the starting point. Betancourt found that NUTS lost a lot of steam when he reworked adaptation but (initially) forgot the biasing of draws along the trajectory.
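The biasing mechanism described above can be sketched in a few lines. This is a hypothetical simplification, not Stan’s actual internals: when a NUTS-like sampler doubles its trajectory, the candidate draw from the new half replaces the current draw with probability min(1, w_new / w_old), where the weights are the total probability mass of each half. Since the new half is the most recent doubling, the selected point tends to end up farther from the starting point.

```python
import random

# Hypothetical sketch of biased trajectory selection: favor the draw from
# the newly doubled half of the trajectory in proportion to its weight.
# Names (draw_old, w_old, ...) are illustrative, not from any real library.
def combine_draws(draw_old, w_old, draw_new, w_new, rng=random):
    # If the new half carries at least as much weight, it always wins;
    # otherwise it wins with probability w_new / w_old.
    if rng.random() < min(1.0, w_new / w_old):
        return draw_new
    return draw_old

# When the new half has at least equal weight, the new draw is always taken:
assert combine_draws("old", 1.0, "new", 2.0) == "new"
```

Dropping this bias and selecting uniformly along the whole trajectory keeps draws closer to the start on average, which is consistent with the loss of efficiency Betancourt observed.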
