## Archive for HMC

## BayesComp 2020 at a glance

Posted in Statistics, Travel, University life with tags ABC, BayesComp 2020, Bayesian computation, Bayesian nonparametrics, conference, Gainesville, Gaussian processes, HMC, ISBA, MCMC, non-reversible diffusion, poster session, reversible Markov chain, simulation, University of Florida, USA, Wasserstein distance on December 18, 2019 by xi'an

## dynamic nested sampling for stars

Posted in Books, pictures, Statistics, Travel with tags astrostatistics, Biometrika, black holes, cross validated, dynesty, effective sample size, emcee, ESS, evidence, Hamiltonian Monte Carlo, HMC, Multinest, nested sampling, NUTS, order statistics, prior distributions, slice sampling, The Astrophysical Journal Letters on April 12, 2019 by xi'an

**F**ollowing earlier nested sampling packages, like MultiNest, Joshua Speagle has written a new package called dynesty that manages dynamic nested sampling, primarily intended for astronomical applications. Which is the field where nested sampling is the most popular. One of the first remarks in the paper is that nested sampling can be more easily implemented by using a Uniform reparameterisation of the prior, that is, a reparameterisation that turns the prior into a Uniform over the unit hypercube. Which means *in fine* that the prior distribution can be generated from a fixed vector of uniforms and known transforms. Maybe not such an issue given that this is *the prior* after all. The author considers that this makes sampling under the likelihood constraint a much simpler problem, but it all depends in the end on the concentration of the likelihood within the unit hypercube. And on the ability to reach the higher likelihood slices. I did not see any special trick when looking at the documentation, but reflected on the fundamental connection between nested sampling and this ability. As in the original proposal by John Skilling (2006), the slice volumes are “estimated” by simulated Beta order statistics, with no connection with the actual sequence of simulations or the problem at hand. We did point out our incomprehension of such a scheme in our Biometrika paper with Nicolas Chopin. As in earlier versions, the algorithm attempts to visualise the slices by different bounding techniques, before proceeding to explore the bounded regions by several exploration algorithms, including HMC.
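To make the two ingredients above concrete, here is a minimal sketch in plain Python, not taken from dynesty itself. The function names `prior_transform` and `simulated_log_volumes` and the two priors (a Uniform(-10, 10) and an Exponential(1)) are assumptions chosen purely for illustration. The first function maps a point of the unit square to the prior by inverse cdf transforms; the second simulates the slice volumes as products of Beta(K, 1) variates, each being the largest of K uniforms, with no reference to the likelihood at all.

```python
import math
import random


def prior_transform(u):
    """Map u in the unit square to (mu, sigma) by inverse cdf transforms.

    Hypothetical priors, chosen for illustration only:
    mu ~ Uniform(-10, 10) and sigma ~ Exponential(rate=1).
    """
    mu = -10.0 + 20.0 * u[0]        # inverse cdf of Uniform(-10, 10)
    sigma = -math.log(1.0 - u[1])   # inverse cdf of Exponential(1)
    return mu, sigma


def simulated_log_volumes(n_live, n_iter, rng):
    """Simulate log X_i with X_i = t_1 * ... * t_i and t_j ~ Beta(n_live, 1).

    Each t_j is distributed as the largest of n_live uniforms, hence
    can be drawn by inversion as U ** (1 / n_live).
    """
    log_x = 0.0
    log_volumes = []
    for _ in range(n_iter):
        u = 1.0 - rng.random()            # in (0, 1], avoids log(0)
        log_x += math.log(u) / n_live     # log of a Beta(n_live, 1) draw
        log_volumes.append(log_x)
    return log_volumes


rng = random.Random(0)
log_vols = simulated_log_volumes(n_live=100, n_iter=1000, rng=rng)
```

Since each log shrinkage has mean -1/K, the simulated log volume after i iterations hovers around -i/K whatever the target, which is precisely the disconnection from the problem at hand mentioned above.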

“As with any sampling method, we strongly advocate that Nested Sampling should not be viewed as being strictly “better” or “worse” than MCMC, but rather as a tool that can be more or less useful in certain problems. There is no “One True Method to Rule Them All”, even though it can be tempting to look for one.”

When introducing the dynamic version, the author lists three drawbacks for the static (original) version. One is the reliance on this transform of a Uniform vector over a hypercube. Another one is that the overall runtime is highly sensitive to the choice of the prior. (If simulating from the prior rather than an importance function, as suggested in our paper.) A third one is the issue that nested sampling is impervious to the final goal, evidence approximation versus posterior simulation, i.e., uses a constant rate of prior integration. The dynamic version simply modifies the number of points simulated in each slice. According to the (relative) increase in evidence provided by the current slice, estimated through iterations. This makes nested sampling a sort of inverted Wang-Landau since it sharpens the difference between slices. (The dynamic aspects for estimating the volumes of the slices and the stopping rule may hinder convergence in unclear ways, which is not discussed by the paper.) Among the many examples produced in the paper, there is a 200 dimension Normal target, which is an interesting object for posterior simulation in that most of the posterior mass rests on a ring away from the maximum of the likelihood. But it does not seem to merit a mention in the discussion. Another example of heterogeneous regression favourably compares dynesty with MCMC in terms of ESS (but fails to include an HMC version).
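The ring phenomenon for the 200 dimension Normal can be checked by a quick simulation. This is only a sketch under the assumption of a standard Normal target (the exact target in the paper may differ), with a hypothetical helper `sample_radii`: the Euclidean norms of the draws concentrate around √200 ≈ 14.1, far from the mode at the origin.

```python
import math
import random


def sample_radii(dim, n_draws, rng):
    """Return the Euclidean norms of n_draws standard Normal vectors in R^dim."""
    radii = []
    for _ in range(n_draws):
        squared_norm = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        radii.append(math.sqrt(squared_norm))
    return radii


rng = random.Random(1)
radii = sample_radii(dim=200, n_draws=2000, rng=rng)
mean_radius = sum(radii) / len(radii)   # close to sqrt(200)
min_radius = min(radii)                 # no draw gets near the mode at 0
print(mean_radius, min_radius)
```

The squared norm is a chi-square variate with 200 degrees of freedom, so the radius has a standard deviation of about 0.7 and essentially no draw falls within ten units of the maximum of the density.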

*[Breaking News: Although I wrote this post before the exciting first image of the black hole in M87 was made public and hence before I was aware of it, the associated ApJL paper mentions relying on dynesty for comparing several physical models of the phenomenon by nested sampling.]*

## faster HMC [poster at CIRM]

Posted in Statistics with tags CIRM, eHMC, HMC, Jean Morlet Chair, Luminy, Monte Carlo Statistical Methods, NUTS, poster, Université Aix Marseille on November 26, 2018 by xi'an

## talks at CIRM with special tee-shirts

Posted in Books, pictures, Statistics, University life with tags Þe Norse face, Bayesian Analysis, Centre International de Rencontres Mathématiques, CIRM, CNRS, HMC, JASP, logo, Luminy, Marseiile, master class, Monte Carlo Statistical Methods, STAN, tee-shirt, Université Aix Marseille, videoed lectures, ye Norse farce on November 21, 2018 by xi'an

## computational statistics and molecular simulation [18w5023]

Posted in pictures, Statistics, Travel, University life with tags 18w5023, Benzécri, BIRS, Casa Matemática Oaxaca, CMO, computational statistics, HMC, Jussieu, Mexico, molecular dynamics, Monte Carlos Statistical Methods, nested sampling, numerical integrator, path sampling, workshop on November 19, 2018 by xi'an

**T**he last day of the X fertilisation workshop at the Casa Matemática Oaxaca, there were only three talks and only half of the participants. I lost the subtleties of the first talk by Andrea Agazzi on large deviations for chemical reactions, due to an emergency at work (Warwick). The second talk by Igor Barahona was somewhat disconnected from the rest of the conference, working on document textual analysis by way of algebraic data analysis (analyse des données) methods à la Benzécri. (Who was my office neighbour at Jussieu in the early 1990s.) In the last and final talk, Eric Vanden-Eijnden made a link between importance sampling and PDMP, as an integral can be expressed via a trajectory of a path. A generalisation of path sampling, for almost any ODE. But also a competitor to nested sampling, waiting for the path to reach a Hamiltonian level, without some of the difficulties plaguing nested sampling like resampling. And involving continuous time processes. (Is there a continuous time version of ABC as well?!) Returning unbiased estimators of the mean (the original integral) and of the variance. With an example of a mixture in dimension d=10 with k=50 components using only 100 paths.

## computational statistics and molecular simulation [18w5023]

Posted in pictures, Statistics, Travel, University life with tags 18w5023, BIRS, Casa Matemática Oaxaca, CMO, computational statistics, HMC, hypocoercivity, Institut Henri Poincaré, Mexico, molecular dynamics, Monte Carlos Statistical Methods, overdamped Langevin algorithm, PDMP, workshop on November 16, 2018 by xi'an

**T**his Thursday, our X fertilisation workshop at the interface between molecular dynamics and Monte Carlo statistical methods saw a wee bit of reduction in the audience as some participants had already left Oaxaca. Meaning they missed the talk of Christophe Andrieu on hypocoercivity which could have been another hands-on lecture, given the highly pedagogical contents of the talk. I had seen some parts of the talk at MCqMC 2018 in Rennes and at NUS, but still enjoyed the whole of it very much, and so did the audience given the induced discussion. For instance, previously, I had not seen the connection between the guided random walks of Gustafson and Diaconis, and continuous time processes like PDMP. Which Christophe also covered in his talk. (Also making me realise my colleague Jean Dolbeault in Dauphine was strongly involved in the theoretical analysis of PDMPs!) Then Samuel Power gave another perspective on PDMPs. With another augmentation, connected with time, what he calls trajectorial reversibility. This has the impact of diminishing the event rate, but creates some kind of reversibility which seems to go against the motivation for PDMPs. (Remember that all talks are available as videos on the BIRS webpage.) A remark in the talk worth reiterating is the importance of figuring out which kinds of approximations are acceptable. Connecting somewhat with the next talk by Luc Rey-Bellet on a theory of robust approximations. In the sense of Poincaré, Gibbs, Bernstein, etc. concentration inequalities and large deviations. With applications to rare events.

The fourth and final “hands-on” session was run by Miranda Holmes-Cerfon on simulating under constraints. Motivated by research on colloids. For which the overdamped Langevin diffusion applies as an accurate model, surprisingly. Which makes a major change from the other talks [most of the workshop!] relying on this diffusion. (With an interesting interlude on molecular velcro made of DNA strands.) Connected with this example, exotic energy landscapes are better described by hard constraints. (Potentially interesting extension to the case when there are too many constraints to explore all of them?) Now, the definition of the measure projected on the manifold defined by the constraints is obviously an important step in simulating the distribution, whose density is induced by the gradient of the constraints ∇q(x). The proposed algorithm is in the same spirit as the one presented by Tony the previous day, namely moving along the tangent space then along the normal space to get back to the manifold. A solution that causes issues when the gradient is (near) zero. A great hands-on session which induced massive feedback from the audience.
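As a rough illustration of the tangent-then-normal move, here is a sketch on the simplest manifold I could think of, the unit sphere with constraint q(x) = |x|² − 1 = 0. This is my own toy reconstruction, not the algorithm presented in the session: the function names, the Gaussian tangent step and the Newton solver are all assumptions, and a complete sampler would further require an accept/reject step that is omitted here.

```python
import math
import random


def q(x):
    """Constraint function: the manifold is {x : q(x) = 0}, here the unit sphere."""
    return sum(xi * xi for xi in x) - 1.0


def grad_q(x):
    """Gradient of the constraint, normal to the manifold at x."""
    return [2.0 * xi for xi in x]


def tangent_then_normal_move(x, step, rng, tol=1e-12, max_iter=50):
    """Move along the tangent space at x, then back to the manifold along the normal.

    Returns the new point, or None when the projection fails
    (gradient near zero, or no solution along the normal direction).
    """
    g = grad_q(x)
    g_norm = math.sqrt(sum(gi * gi for gi in g))
    if g_norm < 1e-8:
        return None                      # normal direction is undefined
    n = [gi / g_norm for gi in g]
    v = [step * rng.gauss(0.0, 1.0) for _ in x]
    v_dot_n = sum(vi * ni for vi, ni in zip(v, n))
    # tangent step: remove the normal component of the Gaussian move
    y = [xi + vi - v_dot_n * ni for xi, vi, ni in zip(x, v, n)]
    a = 0.0
    for _ in range(max_iter):            # Newton iterations on a -> q(y + a n)
        z = [yi + a * ni for yi, ni in zip(y, n)]
        f = q(z)
        if abs(f) < tol:
            return z
        df = sum(gi * ni for gi, ni in zip(grad_q(z), n))
        if abs(df) < 1e-12:
            return None
        a -= f / df
    return None


rng = random.Random(3)
x = [1.0, 0.0, 0.0]
n_failed = 0
for _ in range(100):
    proposal = tangent_then_normal_move(x, step=0.1, rng=rng)
    if proposal is None:
        n_failed += 1                    # stay put when the projection fails
    else:
        x = proposal
```

The two early returns show where a (near) zero gradient bites: the normal direction cannot be normalised, or the Newton step divides by a vanishing derivative.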

In the afternoon session, Gersende Fort gave a talk on a generalisation of the Wang-Landau algorithm, which modifies the true weights of the elements of a partition of the sampling space, to increase visits to low [probability] elements and jumps between modes. The idea is to rely on tempered versions of the original weights, learned by stochastic approximation. With an extra layer of adaptivity. Leading to an improvement with parameters that depend on the phase of the stochastic approximation. The second talk was by David Sanders on a recent paper in *Chaos* about importance sampling for rare events of (deterministic) billiard dynamics. With diffusive limits whose tails are hard to evaluate, except by importance sampling. And the last talk of the day was by Anton Martinsson on simulated tempering for a molecular alignment problem. With weights of different temperatures proportional to the inverse of the corresponding normalising constants, which themselves can be learned by a form of bridge sampling if I got it right.
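For readers unfamiliar with Wang-Landau, here is a sketch of the basic (non-generalised) stochastic approximation version on a toy discrete target, hence not the tempered variant of the talk. Everything in it is an assumption made for illustration: ten states on a ring, a partition into two bins with masses 0.99 and 0.01, a ±1 random walk proposal and a step size of 1/t.

```python
import math
import random

N_STATES = 10
N_BINS = 2


def bin_of(x):
    """Partition of the state space: states 0-4 in bin 0, states 5-9 in bin 1."""
    return 0 if x < 5 else 1


def log_weight(x):
    """Log target: bin 0 carries mass 0.99, bin 1 mass 0.01, uniform within bins."""
    return math.log(0.99 / 5) if x < 5 else math.log(0.01 / 5)


def wang_landau(n_iter, rng):
    """Metropolis on the target reweighted by 1/theta[bin], with theta learned on the fly."""
    log_theta = [0.0] * N_BINS
    visits = [0] * N_BINS
    x = 0
    for t in range(1, n_iter + 1):
        y = (x + rng.choice((-1, 1))) % N_STATES      # random walk on the ring
        log_ratio = (log_weight(y) - log_theta[bin_of(y)]) \
            - (log_weight(x) - log_theta[bin_of(x)])
        if math.log(1.0 - rng.random()) < log_ratio:
            x = y
        gamma = 1.0 / t                               # decreasing step size
        for i in range(N_BINS):
            hit = 1.0 if i == bin_of(x) else 0.0
            # penalise the bin just visited, relative to a uniform occupation
            log_theta[i] += gamma * (hit - 1.0 / N_BINS)
        visits[bin_of(x)] += 1
    return log_theta, visits


rng = random.Random(2)
log_theta, visits = wang_landau(20000, rng)
frac_rare = visits[1] / sum(visits)   # share of time spent in the low probability bin
print(frac_rare, log_theta[0] - log_theta[1])
```

Under the original weights the chain would spend about 1% of its time in the second bin; with the learned penalties both bins should end up visited in comparable proportions, while the difference of the log penalties should drift towards the log ratio of the bin masses.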

On a very minor note, I heard at breakfast a pretty good story from a fellow participant who had to give a talk at a conference, a talk that was moved to a very early time in the morning because an official was appearing at a later time. As a result, the speaker “enjoyed” a very small audience, to the point that a cleaning lady appeared and started cleaning the board as she could not conceive that the talks had already started! Reminding me of this picture at IHP.