## convergence of MCMC

Posted in Statistics on June 16, 2017 by xi'an

Michael Betancourt just posted on arXiv a historical review piece on the convergence of MCMC, with a physical perspective.

“The success of these of Markov chain Monte Carlo, however, contributed to its own demise.”

The discourse proceeds through augmented [reality!] versions of MCMC algorithms taking advantage of the shape and nature of the target distribution, like Langevin diffusions [which cannot be simulated directly and exactly at the same time] in statistics and molecular dynamics in physics. (Which reminded me of the two parallel threads at the ICMS workshop we had a few years ago.) Merging into hybrid Monte Carlo, morphing into Hamiltonian Monte Carlo under the quills of Radford Neal and David MacKay in the 1990’s. It is a short entry (and so is this post), with some background already well-known to the community, but it nonetheless provides a perspective and references rarely mentioned in statistics.

## ABC’ory in Banff [17w5025]

Posted in Books, Mountains, pictures, Statistics, Travel, University life on February 27, 2017 by xi'an

And another exciting and animated [last] day of ABC’ory [and practice]! Kyle Cranmer exposed a density ratio density estimation approach I had not seen before [and will comment here soon]. Patrick Muchmore talked about unbiased estimators of Gaussian and non-Gaussian densities in elliptically contoured distributions, which allow for running pseudo-MCMC rather than ABC. This reminded me of using the same tool [for those distributions can be expressed as mixtures of normals] in my PhD thesis, if for completely different purposes. In his talk, including a presentation of an ABC blackbox platform called ELFI, Samuel Kaski did translate statistical inference as inverse reinforcement learning: I hope this does not catch on! In the afternoon, Dennis Prangle gave us the intuition behind his rare event ABC, which is not estimating rare events by ABC but rather using rare event simulation to improve ABC. [A paper I will a.s. comment here soon as well!] And Scott Sisson concluded the day and the week with his views on ABC for high dimensions.

While being obviously biased as the organiser of the workshop, I nonetheless feel it was a wonderful meeting with just the right number of participants to induce interactions and discussions during and around the talks, as well as preserve some time for pairwise interactions. Like all other workshops I contributed to at BIRS over the years

| Workshop | Date | Title |
|---|---|---|
| 07w5079 | 2007-07-01 | Bioinformatics, Genetics and Stochastic Computation: Bridging the Gap |
| 10w2170 | 2010-09-10 | Hierarchical Bayesian Methods in Ecology |
| 14w5125 | 2014-03-02 | Advances in Scalable Bayesian Computation |

this is certainly a highly profitable one! For a [major] change, the next one [18w5023] will take place in Oaxaca, Mexico, and will see computational statistics meet molecular simulation. [As an aside, here are the first and last slides of Ewan Cameron’s talk, appropriately illustrating beginning and end, for both themes of his talk: epidemiology and astronomy!]

## computational methods for statistical mechanics [day #4]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on June 7, 2014 by xi'an

My last day at this ICMS workshop on molecular simulation started [with a double loop of Arthur’s Seat thankfully avoiding the heavy rains of the previous night and then] with Chris Chipot’s magistral entry to molecular simulation for proteins with impressive slides and simulation movies, even though I could not follow the details to really understand the simulation challenges therein, just catching a few connections with earlier talks. A typical example of a cross-disciplinary gap, where the other discipline always seems to be stressing the “wrong” aspects. Although this is perfectly unrealistic, it would help immensely to prepare talks in pairs for such interdisciplinary workshops! Then Gersende Fort presented results about convergence and efficiency for the Wang-Landau algorithm. The idea is to find the optimal rate for updating the weights of the elements of the partition towards reaching the flat histogram in minimal time. Showing massive gains on toy examples. The next talk went back to molecular biology with Jérôme Hénin’s presentation on improved adaptive biased sampling. With an exciting notion of orthogonality aiming at finding the slowest directions in the target and putting the computational effort there. He also discussed the tension between long single simulations and short repeated ones, echoing a long-running debate in the MCMC community. (He also had a slide with a picture of my first 1983 Apple IIe computer!) Then Antonietta Mira gave a broad perspective on delayed rejection and zero variance estimates. With impressive variance reductions (although some physicists then asked for reduction of order 10¹⁰!). Johannes Zimmer gave a beautiful maths talk on the connection between particle and diffusion limits (PDEs) and Wasserstein geometry and large deviations. (I did not get most of the talk, but it was nonetheless beautiful!) Bert Kappen concluded the day (and the workshop for me) with a nice introduction to control theory. Making the connection between optimal control and optimal importance sampling. Which made me idly think of the following problem: what if control cannot be completely… controlled and hence involves a stochastic part? Presumably of little interest as the control would then be on the parameters of the distribution of the control.

“The alanine dipeptide is the fruit fly of molecular simulation.”

The example of this alanine dipeptide molecule was so recurrent during the talks that it justified the above quote by Michael Allen. Not that I am more proficient in the point of studying this protein or using it as a benchmark. Or in identifying the specifics of the challenges of molecular dynamics simulation. Not a criticism of the ICMS workshop obviously, but rather of my congenital difficulty with continuous time processes!!! So I do not return from Edinburgh with a new research collaborative project in molecular dynamics (if with more traditional prospects), albeit with the perception that a minimal effort could bring me to breach the vocabulary barrier. And maybe consider ABC ventures in those (new) domains. (Although I fear my talk on ABC did not impact most of the audience!)

## computational methods for statistical mechanics [day #3]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on June 6, 2014 by xi'an

The third day [morn] at our ICMS workshop was dedicated to path sampling. And rare events. Much more into [my taste] Monte Carlo territory. The first talk by Rosalind Allen looked at reweighting trajectories that are not in an equilibrium or are missing the Boltzmann [normalizing] constant. Although the derivation against a calibration parameter looked like the primary goal rather than the tool for constant estimation. Again papers in J. Chem. Phys.! And a potential link with ABC raised by Antonietta Mira… Then Jonathan Weare discussed stratification. With a nice trick of expressing the normalising constants of the different terms in the partition as solution(s) of a Markov system

$v\mathbf{M}=v$

Because the stochastic matrix M is easier (?) to approximate. Valleau’s and Torrie’s umbrella sampling was a constant reference in this morning of talks. Arnaud Guyader’s talk was in the continuation of Tony Lelièvre’s introduction, which helped a lot in my better understanding of the concepts. Rephrasing things in more statistical terms. Like the distinction between equilibrium and paths. Or bias being importance sampling. Frédéric Cérou actually gave a sort of second part to Arnaud’s talk, using importance splitting algorithms. Presenting an algorithm for simulating rare events that sounded like an opposite nested sampling, where the goal is to get down the target, rather than up. Pushing particles away from a current level of the target function with probability ½. Michela Ottobre completed the series with an entry into diffusion limits in the Roberts-Gelman-Gilks spirit when the Markov chain is not yet stationary. In the transient phase thus.
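To make the fixed-point relation $v\mathbf{M}=v$ concrete, here is a minimal sketch that recovers the left fixed point of a small row-stochastic matrix by power iteration. This is not the estimator from Jonathan Weare's talk: the matrix `M`, the function name `stationary_vector` and the tolerances are all made up for illustration, and power iteration is just one simple way of solving such a Markov system once M has been approximated.

```python
def stationary_vector(M, n_iter=10_000, tol=1e-12):
    """Left fixed point v of a row-stochastic matrix M (v M = v), by power iteration."""
    k = len(M)
    v = [1.0 / k] * k  # start from the uniform vector
    for _ in range(n_iter):
        # one application of M on the right of the row vector v
        new = [sum(v[i] * M[i][j] for i in range(k)) for j in range(k)]
        total = sum(new)
        new = [x / total for x in new]  # renormalise against rounding drift
        if max(abs(a - b) for a, b in zip(new, v)) < tol:
            return new
        v = new
    return v


# Hypothetical 3x3 stochastic matrix (each row sums to one)
M = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.0, 0.3, 0.7]]

v = stationary_vector(M)
print(v)
```

In the stratification setting, the entries of v would play the role of the (relative) normalising constants of the strata, with M estimated from simulations within each stratum.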

## computational methods for statistical mechanics [day #2]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on June 5, 2014 by xi'an

The last “tutorial” talk at this ICMS workshop [“at the interface between mathematical statistics and molecular simulation”] was given by Tony Lelièvre on adaptive bias schemes in Langevin algorithms and on the parallel replica algorithm. This was both very interesting because of the potential for connections with my “brand” of MCMC techniques and rather frustrating as I felt the intuition behind the physical concepts like free energy and metastability was almost within my reach! The most manageable time in Tony’s talk was the illustration of the concepts through a mixture posterior example. Example that I need to (re)read further to grasp the general idea. (And maybe the book on Free Energy Computations Tony wrote with Mathias Rousset and Gabriel Stoltz.) A definitely worthwhile talk that I hope will get posted online by ICMS. The other talks of the day were mostly of a free energy nature, some using optimised bias in the Langevin diffusion (except for Pierre Jacob who presented his non-negative unbiased estimation impossibility result).

## computational methods for statistical mechanics [day #1]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on June 4, 2014 by xi'an

The first talks of the day at this ICMS workshop [“at the interface between mathematical statistics and molecular simulation”] were actually lectures introducing molecular simulation to statisticians by Michael Allen from Warwick and computational statistics to physicists by Omiros Papaspiliopoulos. Allen’s lecture was quite pedagogical, even though I had to quiz Wikipedia for physics terms and notions. Like a force being the gradient of a potential function. He gave a physical meaning to Langevin’s equation. As well as references from the Journal of Chemical Physics that were more recent than 1953. He mentioned alternatives to Langevin’s equation too and I idly wondered at the possibility of using those alternatives as other tools for improved MCMC simulation. Although introducing friction may not be the most promising way to speed up the thing… He later introduced what statisticians call Langevin’s algorithm (MALA) as smart Monte Carlo (Rossky et al., …1978!!!). Recovering Hamiltonian and hybrid Monte Carlo algorithms as a fusion of molecular dynamics, Verlet algorithm, and Metropolis acceptance step! As well as reminding us of the physics roots of umbrella sampling and the Wang-Landau algorithm.
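As a reminder of how these three ingredients fit together, here is a minimal one-dimensional sketch of hybrid (Hamiltonian) Monte Carlo: a Verlet (leapfrog) integration of the molecular dynamics, followed by a Metropolis acceptance step on the total energy. The standard Normal target, the unit mass, the step size `eps` and the number of leapfrog steps are all assumptions of mine, not choices taken from Allen's lecture.

```python
import math
import random


def U(q):
    """Potential energy: minus log-density of a standard Normal (assumed target)."""
    return 0.5 * q * q


def grad_U(q):
    """Gradient of the potential, i.e. minus the force."""
    return q


def leapfrog(q, p, grad_U, eps, n_steps):
    """Verlet / leapfrog integration of Hamilton's equations with unit mass."""
    p = p - 0.5 * eps * grad_U(q)          # initial half step on the momentum
    for step in range(n_steps):
        q = q + eps * p                    # full step on the position
        if step < n_steps - 1:
            p = p - eps * grad_U(q)        # full step on the momentum
    p = p - 0.5 * eps * grad_U(q)          # final half step on the momentum
    return q, p


def hmc(U, grad_U, q0, n_samples, eps=0.2, n_steps=10, seed=0):
    rng = random.Random(seed)
    q, samples = q0, []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)            # refresh the momentum
        h_old = U(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, grad_U, eps, n_steps)
        h_new = U(q_new) + 0.5 * p_new * p_new
        # Metropolis step: accept with probability min(1, exp(h_old - h_new))
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


samples = hmc(U, grad_U, q0=0.0, n_samples=5000)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(mean, var)
```

Without the acceptance step, the discretisation error of the integrator would bias the sample, which is the role the Metropolis correction plays in this fusion.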

Omiros Papaspiliopoulos also gave a very pedagogical entry to the convergence of MCMC samplers which focussed on the L² approach to convergence. This reminded me of the very first papers published on the convergence of the Gibbs sampler, like the ~~1990~~ 1992 JCGS paper by Schervish and Carlin. Or the ~~1991~~ 1996 Annals of Statistics by Amit. (Funny that I located both papers much earlier than when they actually appeared!) One surprising fact was that the convergence of all reversible ergodic kernels is necessarily geometric. There is no classification of kernels in this topology, the only ranking being through the respective spectral gaps. A good refresher for most of the audience, statisticians included.
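A toy finite-state illustration of the role of the spectral gap, which is mine and not from the lecture: for a two-state chain with transition probabilities `a` (from state 0 to 1) and `b` (from 1 to 0), the kernel is always reversible, its second eigenvalue is 1 − a − b, and the distance to stationarity shrinks by exactly that factor at each step. The function name and the values of `a` and `b` are hypothetical.

```python
def two_state_tv(a, b, n):
    """Total variation distance to stationarity after n steps, starting in state 0.

    The kernel is [[1 - a, a], [b, 1 - b]], with stationary probability
    b / (a + b) for state 0.
    """
    pi0 = b / (a + b)
    p0 = 1.0                                # probability of being in state 0
    for _ in range(n):
        p0 = p0 * (1 - a) + (1 - p0) * b    # one step of the chain
    return abs(p0 - pi0)                    # TV distance when there are two states


a, b = 0.1, 0.2
lam = 1 - a - b                             # second eigenvalue of the kernel
gap = 1 - abs(lam)                          # spectral gap
for n in (1, 5, 10, 20):
    # simulated distance versus the geometric bound lam**n * (1 - pi0)
    print(n, two_state_tv(a, b, n), lam ** n * (1 - b / (a + b)))
```

The two printed columns coincide, which is the geometric rate in its simplest form: a larger gap means a faster decay, and this is the only way kernels are ranked in this setting.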

The following talks of Day 1 were by Christophe Andrieu, who kept with the spirit of a highly pedagogical entry, covering particle filters, SMC, particle Gibbs and pseudo-marginals, and who hit the right tone I think given the heterogeneous audience. And by Ben Leimkuhler about particle simulation for very large molecular structures. Closing the day by focussing on Langevin dynamics. What I understood from the talk was an improved entry into the resolution of some SPDEs. Gaining two orders when compared with Euler-Maruyama. But missed the meaning of the friction coefficient γ converging to infinity in the title…
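For reference, the Euler-Maruyama baseline can be sketched in a few lines. This is only the baseline and not Ben Leimkuhler's improved scheme, and I am assuming the simplest setting: the overdamped Langevin diffusion dX = −∇U(X) dt + √2 dW with a quadratic potential, so that the target is a standard Normal. The step size `h` and the run length are arbitrary.

```python
import math
import random


def euler_maruyama(grad_U, x0, h, n_steps, seed=0):
    """Euler-Maruyama discretisation of dX = -grad_U(X) dt + sqrt(2) dW."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        # drift step of size h plus Gaussian noise of variance 2h
        x = x - h * grad_U(x) + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


h = 0.1
path = euler_maruyama(lambda x: x, x0=0.0, h=h, n_steps=200_000)
burned = path[1000:]                        # drop an (arbitrary) burn-in
mean = sum(burned) / len(burned)
var = sum((x - mean) ** 2 for x in burned) / len(burned)
print(mean, var, 1.0 / (1.0 - h / 2.0))
```

For this quadratic potential the discretised chain is an AR(1) process with stationary variance 1/(1 − h/2) rather than 1, so the bias is of first order in the step size: this is the kind of error that higher-order integrators, or a Metropolis correction as in MALA, aim at reducing.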

## computational methods for statistical mechanics

Posted in Mountains, pictures, Running, Statistics, Travel, University life on May 30, 2014 by xi'an

Next weak (hopefully not weak!) week, I will have the pleasure of visiting Scotland again! Indeed, I have been invited to take part in an ICMS workshop on the above topic, located “at the interface between mathematical statistics and molecular simulation”. A wonderful opportunity to meet researchers in computational physics, if challenging because of the different notations and focus as already experienced in Hamburg. And to talk about some of my most current MCMC research, if I have time to modify my talk and complete a submission to NIPS… All this in the great environment of ICMS (International Centre for Mathematical Sciences). And forecasting a pleasant time in Edinburgh, on Arthur’s Seat, and hopefully in the Scottish Highlands.