Archive for Langevin diffusion

non-reversible MCMC [comments]

Posted in Books, Mountains, pictures, Statistics, University life with tags , , , , , on May 26, 2015 by xi'an

[Here are comments made by Matt Graham that I thought would be more readable in a post format. The beautiful picture of the Alps above is his as well. I do not truly understand what Matt’s point is, as I did not cover continuous time processes in my discussion…]

In terms of interpretation of the diffusion with non-reversible drift component, I think this can be generalised from the Gaussian invariant density case by

dx = [ – (∂E/∂x) dt + √2 dw ] + S’ (∂E/∂x) dt

where ∂E/∂x is the usual gradient of the negative log (unnormalised) density / energy and S=-S’ is a skew symmetric matrix. In this form it seems the dynamic can be interpreted as the composition of an energy and volume conserving (non-canonical) Hamiltonian dynamic

dx/dt = S’ ∂E/∂x

and a (non-preconditioned) Langevin diffusion

dx = – (∂E/∂x) dt + √2 dw

As an alternative to discretising the combined dynamic, it might be interesting to compare to sequential alternation between ‘Hamiltonian’ steps either using a simple Euler discretisation

x’ = x + h S’ ∂E/∂x

or a symplectic method like implicit midpoint to maintain reversibility / volume preservation and then a standard MALA step

x’ = x – h (∂E/∂x) + √2 h w, w ~ N(0, I)

plus MH accept. If only one final MH accept step is done this overall dynamic will be reversible, however if a an intermediary MH accept was done after each Hamiltonian step (flipping the sign / transposing S on a rejection to maintain reversibility), the composed dynamic would in general be non-longer reversible and it would be interesting to compare performance in this case to that using a non-reversible MH acceptance on the combined dynamic (this alternative sidestepping the issues with finding an appropriate scale ε to maintain the non-negativity condition on the sum of the vorticity density and joint density on a proposed and current state).

With regards to your point on the positivity of g(x,y)+π(y)q(y,x), I’m not sure if I have fully understood your notation correctly or not, but I think you may have misread the definition of g(x,y) for the discretised Ornstein-Uhlenbeck case (apologies if instead the misinterpretation is mine!). The vorticity density is defined as the skew symmetric component of the density f of F(dx, dy) = µ(dx) Q(x, dy) with respect to the Lebesgue measure, where µ(dx) is the true invariant distribution of the Euler-Maruyama discretised diffusion based proposal density Q(x, dy) rather than g(x, y) being defined in terms of the skew-symmetric component of π(dx) Q(x, dy) which in general would lead to a vorticity density which does not meet the zero integral requirement as the target density π is not invariant in general with respect to Q.

computational methods for statistical mechanics [day #4]

Posted in Mountains, pictures, Running, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , on June 7, 2014 by xi'an

Arthur Seat, Edinburgh, Sep. 7, 2011

My last day at this ICMS workshop on molecular simulation started [with a double loop of Arthur’s Seat thankfully avoiding the heavy rains of the previous night and then] Chris Chipot‘s magistral entry to molecular simulation for proteins with impressive slides and simulation movies, even though I could not follow the details to really understand the simulation challenges therein, just catching a few connections with earlier talks. A typical example of a cross-disciplinary gap, where the other discipline always seems to be stressing the ‘wrong” aspects. Although this is perfectly unrealistic, it would immensely to prepare talks in pairs for such interdisciplinary workshops! Then Gersende Fort presented results about convergence and efficiency for the Wang-Landau algorithm. The idea is to find the optimal rate for updating the weights of the elements of the partition towards reaching the flat histogram in minimal time. Showing massive gains on toy examples. The next talk went back to molecular biology with Jérôme Hénin‘s presentation on improved adaptive biased sampling. With an exciting notion of orthogonality aiming at finding the slowest directions in the target and putting the computational effort. He also discussed the tension between long single simulations and short repeated ones, echoing a long-going debate in the MCMC community. (He also had a slide with a picture of my first 1983 Apple IIe computer!) Then Antonietta Mira gave a broad perspective on delayed rejection and zero variance estimates. With impressive variance reductions (although some physicists then asked for reduction of order 10¹⁰!). Johannes Zimmer gave a beautiful maths talk on the connection between particle and diffusion limits (PDEs) and Wasserstein geometry and large deviations. (I did not get most of the talk, but it was nonetheless beautiful!) Bert Kappen concluded the day (and the workshop for me) by a nice introduction to control theory. Making connection between optimal control and optimal importance sampling. Which made me idly think of the following problem: what if control cannot be completely… controlled and hence involves a stochastic part? Presumably of little interest as the control would then be on the parameters of the distribution of the control.

“The alanine dipeptide is the fruit fly of molecular simulation.”

The example of this alanine dipeptide molecule was so recurrent during the talks that it justified the above quote by Michael Allen. Not that I am more proficient in the point of studying this protein or using it as a benchmark. Or in identifying the specifics of the challenges of molecular dynamics simulation. Not a criticism of the ICMS workshop obviously, but rather of my congenital difficulty with continuous time processes!!! So I do not return from Edinburgh with a new research collaborative project in molecular dynamics (if with more traditional prospects), albeit with the perception that a minimal effort could bring me to breach the vocabulary barrier. And maybe consider ABC ventures in those (new) domains. (Although I fear my talk on ABC did not impact most of the audience!)

MCqMC 2014 [day #3]

Posted in pictures, Running, Statistics, Travel, University life, Wines with tags , , , , , , , , , , , , , , , , on April 10, 2014 by xi'an


As the second day at MCqMC 2014, was mostly on multi-level Monte Carlo and quasi-Monte Carlo methods, I did not attend many talks but had a long run in the countryside (even saw a pheasant and a heron), worked at “home” on pressing recruiting evaluations and had a long working session with Pierre Jacob. Plus an evening out sampling (just) a few Belgian beers in the shade of the city hall…

Today was more in my ballpark as there were MCMC talks the whole day! The plenary talk was not about MCMC as Erich Novak presented a survey on the many available results bounding the complexity of approximating an integral based on a fixed number of evaluations of the integrand, some involving the dimension (and its curse), some not, some as fast as √n and some not as fast, all this depending on the regularity and the size of the classes of integrands considered. In some cases, the solution was importance sampling, in other cases, quasi-Monte Carlo, and yet other cases were still unsolved. Then Yves Atchadé gave a new perspective on computing the asymptotic variance in the central limit theorem on Markov chains when truncating the autocovariance, Matti Vihola talked about theoretical orderings of Markov chains that transmuted into the very practical consequence that using more simulations in a pseudo-marginal likelihood approximation improved acceptance rate and asymptotic variances (and this applies to aBC-MCMC as well), Radu Craiu proposed a novel processing of adaptive MCMC by treating various approximations to the true target as food for a multiple-try Metropolis algorithm, and Luca Martino had a go at resuscitating the ARMS algorithm of Gilks and Wild (used for a while in BUGS), although the talk did not dissipate all of my misgivings about the multidimensional version! I had more difficulties following the “Warwick session” which was made of four talks by current or former students from Warwick, although I appreciated the complexity of the results in infinite dimensional settings and novel approximations to diffusion based Metropolis algorithms. No further session this afternoon as the “social” activity was to visit the nearby Stella Artois brewery! This activity made us very social, for certain, even though there was hardly a soul around in this massively automated factory. (Maybe an ‘Og post to come one of those days…)

new MCMC algorithm for Bayesian variable selection

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , on February 25, 2014 by xi'an

Flight from Bristol to Amsterdam, April 03, 2011Unfortunately, I will miss the incoming Bayes in Paris seminar next Thursday (27th February), as I will be flying to Montréal and then Québec at the time (despite having omitted to book a flight till now!). Indeed Amandine Shreck will give a talk at 2pm in room 18 of ENSAE, Malakoff, on A shrinkage-thresholding Metropolis adjusted Langevin algorithm for Bayesian variable selection, a work written jointly with Gersende Fort, Sylvain Le Corff, and Eric Moulines, and arXived at the end of 2013 (which may explain why I missed it!). Here is the abstract:

This paper introduces a new Markov Chain Monte Carlo method to perform Bayesian variable selection in high dimensional settings. The algorithm is a Hastings-Metropolis sampler with a proposal mechanism which combines (i) a Metropolis adjusted Langevin step to propose local moves associated with the differentiable part of the target density with (ii) a shrinkage-thresholding step based on the non-differentiable part of the target density which provides sparse solutions such that small components are shrunk toward zero. This allows to sample from distributions on spaces with different dimensions by actually setting some components to zero. The performances of this new procedure are illustrated with both simulated and real data sets. The geometric ergodicity of this new transdimensional Markov Chain Monte Carlo sampler is also established.

(I will definitely get a look at the paper over the coming days!)

MCMC at ICMS (3)

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , on April 26, 2012 by xi'an

The intense pace of the two first days of our workshop on MCMC at ICMS had apparently taken an heavy toll on the participants as a part of the audience was missing this morning! Although not as a consequence of the haggis of the previous night at the conference dinner, nor even as a result of the above pace. In fact, the missing participants had opted ahead of time for leaving the workshop early, which is understandable given everyone’s busy schedule, esp. for those attending both Bristol and Edinburgh workshops, however slightly impacting the atmosphere of the final day. (Except for Mark Girolami who most unfortunately suffered such a teeth infection that he had to seek urgent medical assistance yesterday afternoon. Best wishes to Mark for a prompt recovery, say I with a dental appointment tomorrow…!)

The plenary talk of the day was delivered by Heikki Haario, who provided us with a survey of the (adaptive) MCMC advances he and his collaborators had made in the analysis of complex and immensely high-dimensional weather models. This group of Finnish researchers, who started from inverse problem analysis rather than from MCMC, have had a major impact on the design and validation of adaptive MCMC algorithms, especially in the late 1990’s. (Heikki also was a co-organizer of the Adap’ski workshops, workshops that may be continued, stay tuned!) The next talk, by Marko Laine, was also about adaptive MCMC algorithms, with the difference that the application was climate modelling. It contained interesting directions about early stopping (“early rejection”, as opposed to “delayed rejection”) of diverging proposals (gaining 80% in computing time!) and about parallel adaptation. Still in the same theme, Gersende Fort explained the adaptive version of the equi-energy sampler she and co-authors had recently developed. Although she had briefly presented this paper in Banff a month ago, I found the talk quite informative about the implementation of the method and at the perfect technical level (for me!).

In [what I now perceive as] another recurrent theme of the workshop, namely the recourse to Gaussian structures like Gaussian processes (see, e.g., Ian Murray’s talk yesterday), Andrew Stuart gave us a light introduction to random walk Metropolis-Hastings algorithms on Hilbert spaces. In particular, he related to Ian Murray’s talk of yesterday as to the definition of a “new” random walk (due to Radford Neal)  that makes a proposal

y=\sqrt{1-\beta^2}x_{t-1}+\beta\zeta\quad 0<\beta<1,\zeta\sim\varphi(|\zeta|)

that still preserves the acceptance probability of the original (“old”) random walk proposal. The final talks of the morning were Krys Latuszynski’s and Nick Whiteley’s very pedagogical presentations of the convergence properties of manifold MALA and of particle filters for hidden Markov models.  In both cases, the speakers avoided the overly technical details and provided clear intuition in the presented results, a great feat after those three intense days of talks! (Having attended Nick’s talk in Paris two weeks ago helped of course.)

Unfortunately, due to very limited flight options (after one week of traveling around the UK) and also being slightly worried at the idea of missing my flight!, I had to leave the meeting along with all my French colleagues right after Jean-Michel Marin’s talk on (hidden) Potts driven mixtures, explaining the computational difficulties in deriving marginal likelihoods. I thus missed the final talk of the workshop by Gareth Tribello. And delivering my final remarks at the lunch break.

Overall, when reflecting on those two Monte Carlo workshops, I feel I preferred the pace of the Bristol workshop, because it allowed for more interactions between the participants by scheduling less talks… This being said, the organization at ICMS was superb (as usual!) and the talks were uniformly very good so it also was a very profitable meeting, of a different kind! As written earlier, among other things, it induced (in me) some reflections on a possible new research topic with friends there. Looking forward to visit Scotland again, of course!

Riemann, &tc.

Posted in Statistics, University life with tags , , , , , on November 4, 2010 by xi'an

The discussions of the discussions of members of my group at CREST [on the Read Paper by Girolami and Calderhead] have been collated and deposited on arXiv. This follows a set of discussions by Luke Bornn, Julien Cornebise  and Gareth Peters posted a few days ago.

Pre-ordinary meeting

Posted in R, Statistics, Travel, University life with tags , , , , , on October 8, 2010 by xi'an

Those are the slides for the (basic) introduction of the paper by Mark Girolami and Ben Calderhead at the RSS next week. Not to be confused with my comments on the paper.


Get every new post delivered to your Inbox.

Join 875 other followers