Archive for leapfrog integrator

robustified Hamiltonian

Posted in Books, Statistics, University life with tags , , , , , , , , , on April 1, 2022 by xi'an

In Gregynog, last week, Lionel Riou-Durant (Warwick) presented his recent work with Jure Vogrinc on Metropolis Adjusted Langevin Trajectories, which I had also heard in the Séminaire Parisien de Statistique two weeks ago. Starting with a nice exposition of Hamiltonian Monte Carlo, highlighting its drawbacks. This includes the potentially damaging impact of poorly tuning the integration time. Their proposal is to act upon the velocity in the Hamiltonian through Langevin (positive) damping, which also preserves the stationarity.  (And connects with randomised HMC.) One theoretical in the paper is that the Langevin diffusion achieves the fastest mixing rate among randomised HMCs. From a practical perspective, there exists a version of the leapfrog integrator that adapts to this setting and can be implemented as a Metropolis adjustment. (Hence the MALT connection.) An interesting feature is that the process as such is ergodic, which avoids renewal steps (and U-turns). (There are still calibration parameters to adjust, obviously.)

unbiased Hamiltonian Monte Carlo with couplings

Posted in Books, Kids, Statistics, University life with tags , , , , , , on October 25, 2019 by xi'an

In the June issue of Biometrika, which had been sitting for a few weeks on my desk under my teapot!, Jeremy Heng and Pierre Jacob published a paper on unbiased estimators for Hamiltonian Monte Carlo using couplings. (Disclaimer: I was not involved with the review or editing of this paper.) Which extends to HMC environments the earlier paper of Pierre Jacob, John O’Leary and Yves Atchadé, to be discussed soon at the Royal Statistical Society. The fundamentals are the same, namely that an unbiased estimator can be produced from a converging sequence of estimators and that it can be de facto computed if two Markov chains with the same marginal can be coupled. The issue with Hamiltonians is to figure out how to couple their dynamics. In the Gaussian case, it is relatively easy to see that two chains with the same initial momentum meet periodically. In general, there is contraction within a compact set (Lemma 1). The coupling extends to a time discretisation of the Hamiltonian flow by a leap-frog integrator, still using the same momentum. Which roughly amounts in using the same random numbers in both chains. When defining a relaxed meeting (!) where both chains are within δ of one another, the authors rely on a drift condition (8) that reminds me of the early days of MCMC convergence and seem to imply the existence of a small set “where the target distribution [density] is strongly log-concave”. And which makes me wonder if this small set could be used instead to create renewal events that would in turn ensure both stationarity and unbiasedness without the recourse to a second coupled chain. When compared on a Gaussian example with couplings on Metropolis-Hastings and MALA (Fig. 1), the coupled HMC sees hardly any impact of the dimension of the target (in the average coupling time), with a much lower value. However, I wonder at the relevance of the meeting time as an assessment of efficiency. In the sense that the coupling time is not a convergence time but reflects as well on the initial conditions. I acknowledge that this allows for an averaging over  parallel implementations but I remain puzzled by the statement that this leads to “estimators that are consistent in the limit of the number of replicates, rather than in the usual limit of the number of Markov chain iterations”, since a particularly poor initial distribution could on principle lead to a mode of the target being never explored or on the coupling time being ever so rarely too large for the computing abilities at hand.

revised empirical HMC

Posted in Statistics, University life with tags , , , , , , , , on March 12, 2019 by xi'an

Following the informed and helpful comments from Matt Graham and Bob Carpenter on our eHMC paper [arXival] last month, we produced a revised and re-arXived version of the paper based on new experiments ran by Changye Wu and Julien Stoehr. Here are some quick replies to these comments, reproduced for convenience. (Warning: this is a loooong post, much longer than usual.) Continue reading

computational statistics and molecular simulation [18w5023]

Posted in Books, Kids, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , , , , , , , on November 15, 2018 by xi'an

 I truly missed the gist of the first talk of the Wednesday morning of our X fertilisation workshop by Jianfeng Lu partly due to notations, although the topic very much correlated to my interests like path sampling, with an augmented version of HMC using an auxiliary indicator. And mentions made of BAOAB. Next, Marcello Pereyra spoke about Bayesian image analysis, with the difficulty of setting a prior on an image. In case of astronomical images there are motivations for an L¹ penalisation sparse prior. Sampling is an issue. Moreau-Yoshida proximal optimisation is used instead, in connection with our MCMC survey published in Stats & Computing two years ago. Transferability was a new concept for me, as introduced by Kerrie Mengersen (QUT), to extrapolate an estimated model to another system without using the posterior as a prior. With a great interlude about the crown of thorns starfish killer robot! Rather a prior determination based on historical data, in connection with recent (2018) Technometrics and Bayesian Analysis papers towards rejecting non-plausible priors. Without reading the papers (!), and before discussing the matter with Kerrie, here or in Marseille, I wonder at which level of precision this can be conducted. The use of summary statistics for prior calibration gave the approach an ABC flavour.

The hand-on session was Jonathan Mattingly’s discussion of gerrymandering reflecting on his experience at court! Hard to beat for an engaging talk reaching between communities. As it happens I discussed the original paper last year. Of course it was much more exciting to listen to Jonathan explaining his vision of the problem! Too bad I “had” to leave before the end for a [most enjoyable] rock climbing afternoon… To be continued at the dinner table! (Plus we got the complete explanation of the term gerrymandering, including this salamander rendering of the first identified as gerrymandered district!)

computational statistics and molecular simulation [18w5023]

Posted in Statistics with tags , , , , , , , , , , , , , , on November 13, 2018 by xi'an

This X fertilisation workshop Gabriel Stolz, Luke Bornn and myself organised towards reinforcing the interface between molecular dynamics and Monte Carlo statistical methods has now started! At the casa matematicà Oaxaca, the Mexican campus of BIRS, which is currently housed by a very nice hotel on the heights of Oaxaca. And after a fairly long flight for a large proportion of the participants. On the first day, Arthur Voter gave a fantastic “hand-on” review of molecular dynamics for material sciences, which was aimed at the statistician side of the audience and most helpful in my own understanding of the concepts and techniques at the source of HMC and PDMP algorithms. (Although I could not avoid a few mini dozes induced by jetlag.) Including the BAOAB version of HMC, which sounded to me like an improvement to investigate. The part on metastability, completed by a talk by Florian Maire, remained a wee bit mysterious [to me].

The shorter talks of the day all brought new perspectives and information to me (although they were definitely more oriented towards their “own” side of the audience than the hand-on lecture). For instance, Jesús María Sanz-Serna gave a wide ranging overview of numerical integrators and Tony Lelièvre presented a recent work on simulating measures supported by manifolds via an HMC technique constantly projecting over the manifold, with proper validation. (I had struggled with the paper this summer and this talk helped a lot.) There was a talk by Josh Fash on simulating implicit solvent models that mixed high-level programming and reversible jump MCMC, with an earlier talk by Yong Chen on variable dimension hidden Markov models that could have also alluded to reversible jump. Angela Bito talked about using ASIS (Ancillarity-sufficiency interweaving strategy) for improving the dynamics of an MCMC sampler associated with a spike & slab prior, the recentering-decentering cycle being always a sort of mystery to me [as to why it works better despite introducing multimodality in this case], and Gael Martin presented some new results on her on-going work with David Frazier about approximate Bayes with misspecified models, with the summary statistic being a score function that relates the work to the likelihood free approach of Bissiri et al.

%d bloggers like this: