Archive for Monte Carlo Statistical Methods

faster HMC [poster at CIRM]

Posted in Statistics with tags , , , , , , , , on November 26, 2018 by xi'an

computational statistics and molecular simulation [18w5023]

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , , , , , on November 14, 2018 by xi'an

On Day 2, Carsten Hartmann used a representation of the log cumulant as solution to a minimisation problem over a collection of importance functions (by the Vonsker-Varadhan principle), with links to X entropy and optimal control, a theme also considered by Alain Dunmus when considering the uncorrected discretised Langevin diffusion with a decreasing sequence of discretisation scale factors (Jordan, Kinderlehrer and Otto) in the spirit of convex regularisation à la Rockafellar. Also representing ULA as an inexact gradient descent algorithm. Murray Pollock (Warwick) presented a new technique called fusion to simulate from products of d densities, as in scalable MCMC (but not only). With an (early) starting and startling remark that when simulating one realisation from each density in the product and waiting for all of them to be equal means simulating from the product, in a strong link to the (A)BC fundamentals. This is of course impractical and Murray proposes to follow d Brownian bridges all ending up in the average of these simulations, constructing an acceptance probability that is computable and validating the output.

The second “hand-on” lecture was given by Gareth Roberts (Warwick) on the many aspects of scaling MCMC algorithms, which started with the famous 0.234 acceptance rate paper in 1996. While I was aware of some of these results (!), the overall picture was impressive, including a notion of complexity I had not seen before. And a last section on PDMPs where Gareth presented very recent on the different scales of convergence of Zigzag and bouncy particle samplers, mostly to the advantage of Zigzag.In the afternoon, Jeremy Heng presented a continuous time version of simulated tempering by adding a drift to the Langevin diffusion with time-varying energy, which must be solution to the Liouville pde \text{div} \pi_t f = \partial_t \pi_t. Which connects to a flow transport problem when solving the pde under additional conditions. Unclear to me was the creation of the infinite sequence. This talk was very much at the interface in the spirit of the workshop! (Maybe surprisingly complex when considering the endpoint goal of simulating from a given target.) Jonathan Weare’s talk was about quantum chemistry which translated into finding eigenvalues of an operator. Turning in to a change of basis in a inhumanly large space (10¹⁸⁰ dimensions!). Matt Moore presented the work on Raman spectroscopy he did while a postdoc at Warwick, with an SMC based classification of the peaks of a spectrum (to be used on Mars?) and Alessandra Iacobucci (Dauphine) showed us the unexpected thermal features exhibited by simulations of chains of rotors subjected to both thermal and mechanical forcings, which we never discussed in Dauphine beyond joking on her many batch jobs running on our cluster!

And I remembered today that there is currently and in parallel another BIRS workshop on statistical model selection [and a lot of overlap with our themes] taking place in Banff! With snow already there! Unfair or rather #unfair, as someone much too well-known would whine..! Not that I am in a position to complain about the great conditions here in Oaxaca (except for having to truly worry about stray dogs rather than conceptually about bears makes running more of a challenge, if not the altitude since both places are about the same).

I thought I did make a mistake but I was wrong…

Posted in Books, Kids, Statistics with tags , , , , , , , , , , , , on November 14, 2018 by xi'an

One of my students in my MCMC course at ENSAE seems to specialise into spotting typos in the Monte Carlo Statistical Methods book as he found an issue in every problem he solved! He even went back to a 1991 paper of mine on Inverse Normal distributions, inspired from a discussion with an astronomer, Caroline Soubiran, and my two colleagues, Gilles Celeux and Jean Diebolt. The above derivation from the massive Gradsteyn and Ryzhik (which I discovered thanks to Mary Ellen Bock when arriving in Purdue) is indeed incorrect as the final term should be the square root of 2β rather than 8β. However, this typo does not impact the normalising constant of the density, K(α,μ,τ), unless I am further confused.

short course on MCMC at CiRM [slides]

Posted in Statistics with tags , , , , , , , , , , , , , , , on October 23, 2018 by xi'an

Here are the [recycled] slides for the introductory lecture I gave this morning at CIRM, with the side information that it appears Slideshare has gone to another of these stages when slides cannot be played on this blog [when using Firefox]…

more and more control variates

Posted in Statistics with tags , , , , , , , on October 5, 2018 by xi'an

A few months ago, François Portier and Johan Segers arXived a paper on a question that has always puzzled me, namely how to add control variates to a Monte Carlo estimator and when to stop if needed! The paper is called Monte Carlo integration with a growing number of control variates. It is related to the earlier Oates, Girolami and Chopin (2017) which I remember discussing with Chris when he was in Warwick. The puzzling issue of control variates is [for me] that, while the optimal weight always decreases the variance of the resulting estimate, in practical terms, implementing the method may increase the actual variance. Glynn and Szechtman at MCqMC 2000 identify six different ways of creating the estimate, depending on how the covariance matrix, denoted P(hh’), is estimated. With only one version integrating constant functions and control variates exactly. Which actually happens to be also a solution to a empirical likelihood maximisation under the empirical constraints imposed by the control variates. Another interesting feature is that, when the number m of control variates grows with the number n of simulations the asymptotic variance goes to zero, meaning that the control variate estimator converges at a faster speed.

Creating an infinite sequence of control variates sounds unachievable in a realistic situation. Legendre polynomials are used in the paper, but is there a generic and cheap way to getting these. And … control variate selection, anyone?!

Gibbs for incompatible kids

Posted in Books, Statistics, University life with tags , , , , , , , , , , on September 27, 2018 by xi'an

In continuation of my earlier post on Bayesian GANs, which resort to strongly incompatible conditionals, I read a 2015 paper of Chen and Ip that I had missed. (Published in the Journal of Statistical Computation and Simulation which I first confused with JCGS and which I do not know at all. Actually, when looking at its editorial board,  I recognised only one name.) But the study therein is quite disappointing and not helping as it considers Markov chains on finite state spaces, meaning that the transition distributions are matrices, meaning also that convergence is ensured if these matrices have no null probability term. And while the paper is motivated by realistic situations where incompatible conditionals can reasonably appear, the paper only produces illustrations on two and three states Markov chains. Not that helpful, in the end… The game is still afoot!