Archive for Langevin diffusion

computational methods for statistical mechanics [day #4]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on June 7, 2014 by xi'an

Arthur Seat, Edinburgh, Sep. 7, 2011

My last day at this ICMS workshop on molecular simulation started [with a double loop of Arthur’s Seat, thankfully avoiding the heavy rains of the previous night, and then] with Chris Chipot’s masterly introduction to molecular simulation for proteins, with impressive slides and simulation movies, even though I could not follow the details well enough to really understand the simulation challenges therein, just catching a few connections with earlier talks. A typical example of a cross-disciplinary gap, where the other discipline always seems to be stressing the “wrong” aspects. Although this is perfectly unrealistic, it would help immensely to prepare talks in pairs for such interdisciplinary workshops! Then Gersende Fort presented results about convergence and efficiency for the Wang-Landau algorithm. The idea is to find the optimal rate for updating the weights of the elements of the partition so as to reach the flat histogram in minimal time. Showing massive gains on toy examples. The next talk went back to molecular biology with Jérôme Hénin’s presentation on improved adaptive biased sampling. With an exciting notion of orthogonality aiming at finding the slowest directions in the target and putting the computational effort there. He also discussed the tension between long single simulations and short repeated ones, echoing a long-running debate in the MCMC community. (He also had a slide with a picture of my first 1983 Apple IIe computer!) Then Antonietta Mira gave a broad perspective on delayed rejection and zero variance estimates. With impressive variance reductions (although some physicists then asked for reductions of order 10¹⁰!). Johannes Zimmer gave a beautiful maths talk on the connection between particle and diffusion limits (PDEs) and Wasserstein geometry and large deviations. (I did not get most of the talk, but it was nonetheless beautiful!) Bert Kappen concluded the day (and the workshop for me) with a nice introduction to control theory, making a connection between optimal control and optimal importance sampling. Which made me idly think of the following problem: what if control cannot be completely… controlled and hence involves a stochastic part? Presumably of little interest as the control would then be on the parameters of the distribution of the control.

“The alanine dipeptide is the fruit fly of molecular simulation.”

The example of this alanine dipeptide molecule was so recurrent during the talks that it justified the above quote by Michael Allen. Not that I am any clearer on the point of studying this protein or of using it as a benchmark. Or on identifying the specifics of the challenges of molecular dynamics simulation. Not a criticism of the ICMS workshop obviously, but rather of my congenital difficulty with continuous time processes!!! So I do not return from Edinburgh with a new collaborative research project in molecular dynamics (if with more traditional prospects), albeit with the perception that a minimal effort could bring me to breach the vocabulary barrier. And maybe consider ABC ventures in those (new) domains. (Although I fear my talk on ABC did not impact most of the audience!)

MCqMC 2014 [day #3]

Posted in pictures, Running, Statistics, Travel, University life, Wines on April 10, 2014 by xi'an


As the second day at MCqMC 2014 was mostly on multi-level Monte Carlo and quasi-Monte Carlo methods, I did not attend many talks but had a long run in the countryside (even saw a pheasant and a heron), worked at “home” on pressing recruiting evaluations and had a long working session with Pierre Jacob. Plus an evening out sampling (just) a few Belgian beers in the shade of the city hall…

Today was more in my ballpark as there were MCMC talks the whole day! The plenary talk was not about MCMC, though, as Erich Novak presented a survey on the many available results bounding the complexity of approximating an integral based on a fixed number of evaluations of the integrand, some involving the dimension (and its curse), some not, some as fast as √n and some not as fast, all this depending on the regularity and the size of the classes of integrands considered. In some cases, the solution was importance sampling, in other cases, quasi-Monte Carlo, and yet other cases were still unsolved. Then Yves Atchadé gave a new perspective on computing the asymptotic variance in the central limit theorem on Markov chains when truncating the autocovariance, Matti Vihola talked about theoretical orderings of Markov chains that transmuted into the very practical consequence that using more simulations in a pseudo-marginal likelihood approximation improved acceptance rate and asymptotic variances (and this applies to ABC-MCMC as well), Radu Craiu proposed a novel processing of adaptive MCMC by treating various approximations to the true target as food for a multiple-try Metropolis algorithm, and Luca Martino had a go at resuscitating the ARMS algorithm of Gilks and Wild (used for a while in BUGS), although the talk did not dissipate all of my misgivings about the multidimensional version! I had more difficulties following the “Warwick session”, which was made of four talks by current or former students from Warwick, although I appreciated the complexity of the results in infinite dimensional settings and the novel approximations to diffusion based Metropolis algorithms. No further session this afternoon as the “social” activity was to visit the nearby Stella Artois brewery! This activity made us very social, for certain, even though there was hardly a soul around in this massively automated factory. (Maybe an ‘Og post to come one of these days…)

new MCMC algorithm for Bayesian variable selection

Posted in pictures, Statistics, Travel, University life on February 25, 2014 by xi'an

Flight from Bristol to Amsterdam, April 03, 2011

Unfortunately, I will miss the upcoming Bayes in Paris seminar next Thursday (27th February), as I will be flying to Montréal and then Québec at the time (despite having omitted to book a flight till now!). Indeed Amandine Schreck will give a talk at 2pm in room 18 of ENSAE, Malakoff, on A shrinkage-thresholding Metropolis adjusted Langevin algorithm for Bayesian variable selection, a work written jointly with Gersende Fort, Sylvain Le Corff, and Eric Moulines, and arXived at the end of 2013 (which may explain why I missed it!). Here is the abstract:

This paper introduces a new Markov Chain Monte Carlo method to perform Bayesian variable selection in high dimensional settings. The algorithm is a Hastings-Metropolis sampler with a proposal mechanism which combines (i) a Metropolis adjusted Langevin step to propose local moves associated with the differentiable part of the target density with (ii) a shrinkage-thresholding step based on the non-differentiable part of the target density which provides sparse solutions such that small components are shrunk toward zero. This allows to sample from distributions on spaces with different dimensions by actually setting some components to zero. The performances of this new procedure are illustrated with both simulated and real data sets. The geometric ergodicity of this new transdimensional Markov Chain Monte Carlo sampler is also established.
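To make the two ingredients of the abstract concrete, here is a minimal sketch of a proposal that chains (i) a Langevin move driven by the gradient of the differentiable part with (ii) a thresholding step that sets small components exactly to zero. This is an illustration under my own assumptions, not a transcription of the paper: the function names, the choice of soft-thresholding as the shrinkage operator, and the parameters `sigma` and `gamma` are all hypothetical, and the Hastings-Metropolis acceptance step is left out.

```python
import math
import random


def soft_threshold(u, gamma):
    """Shrink u toward zero by gamma; return exactly zero when |u| <= gamma."""
    return math.copysign(max(abs(u) - gamma, 0.0), u)


def shrinkage_thresholding_proposal(x, grad_log_smooth, sigma, gamma, rng):
    """Propose a sparse move from x (hypothetical helper, proposal step only).

    grad_log_smooth(x) returns the gradient of the log of the differentiable
    part of the target; sigma is the Langevin step size and gamma the
    threshold (both are illustrative tuning parameters).
    """
    grad = grad_log_smooth(x)
    proposal = []
    for xi, gi in zip(x, grad):
        # (i) Langevin move: gradient drift plus Gaussian noise.
        langevin = xi + 0.5 * sigma ** 2 * gi + sigma * rng.gauss(0.0, 1.0)
        # (ii) Thresholding: small components are set exactly to zero.
        proposal.append(soft_threshold(langevin, gamma))
    return proposal
```

Since the thresholded proposal puts positive mass on exact zeros, its density mixes point masses with a continuous part, which is why the acceptance ratio needs more care than in plain MALA and is not attempted here.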

(I will definitely get a look at the paper over the coming days!)

MCMC at ICMS (3)

Posted in pictures, Statistics, Travel, University life on April 26, 2012 by xi'an

The intense pace of the first two days of our workshop on MCMC at ICMS had apparently taken a heavy toll on the participants as a part of the audience was missing this morning! Although not as a consequence of the haggis of the previous night at the conference dinner, nor even as a result of the above pace. In fact, the missing participants had opted ahead of time for leaving the workshop early, which is understandable given everyone’s busy schedule, esp. for those attending both Bristol and Edinburgh workshops, even if it slightly impacted the atmosphere of the final day. (Except for Mark Girolami who most unfortunately suffered such a tooth infection that he had to seek urgent medical assistance yesterday afternoon. Best wishes to Mark for a prompt recovery, say I with a dental appointment tomorrow…!)

The plenary talk of the day was delivered by Heikki Haario, who provided us with a survey of the (adaptive) MCMC advances he and his collaborators had made in the analysis of complex and immensely high-dimensional weather models. This group of Finnish researchers, who started from inverse problem analysis rather than from MCMC, have had a major impact on the design and validation of adaptive MCMC algorithms, especially in the late 1990’s. (Heikki also was a co-organizer of the Adap’ski workshops, workshops that may be continued, stay tuned!) The next talk, by Marko Laine, was also about adaptive MCMC algorithms, with the difference that the application was climate modelling. It contained interesting directions about early stopping (“early rejection”, as opposed to “delayed rejection”) of diverging proposals (gaining 80% in computing time!) and about parallel adaptation. Still in the same theme, Gersende Fort explained the adaptive version of the equi-energy sampler she and co-authors had recently developed. Although she had briefly presented this paper in Banff a month ago, I found the talk quite informative about the implementation of the method and at the perfect technical level (for me!).

In [what I now perceive as] another recurrent theme of the workshop, namely the recourse to Gaussian structures like Gaussian processes (see, e.g., Ian Murray’s talk yesterday), Andrew Stuart gave us a light introduction to random walk Metropolis-Hastings algorithms on Hilbert spaces. In particular, he related to Ian Murray’s talk of yesterday as to the definition of a “new” random walk (due to Radford Neal) that makes a proposal

y=\sqrt{1-\beta^2}x_{t-1}+\beta\zeta\quad 0<\beta<1,\zeta\sim\varphi(|\zeta|)

that still preserves the acceptance probability of the original (“old”) random walk proposal. The final talks of the morning were Krys Latuszynski’s and Nick Whiteley’s very pedagogical presentations of the convergence properties of manifold MALA and of particle filters for hidden Markov models. In both cases, the speakers avoided the overly technical details and provided clear intuition about the presented results, a great feat after those three intense days of talks! (Having attended Nick’s talk in Paris two weeks ago helped of course.)
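As an illustration of the autoregressive proposal displayed above, here is a minimal sketch under two assumptions of mine: φ is taken to be the standard Gaussian density, and the target is a likelihood times that Gaussian reference measure. In that setting the proposal is reversible with respect to the Gaussian part, so the acceptance ratio reduces to a likelihood ratio. The function and argument names are hypothetical.

```python
import math
import random


def autoregressive_rw_sampler(log_likelihood, dim, beta, n_iter, seed=0):
    """Random walk with proposal y = sqrt(1 - beta^2) x + beta * zeta, zeta ~ N(0, I).

    Assumes the target is proportional to exp(log_likelihood(x)) times a
    standard Gaussian density (an assumption of this sketch).
    Returns the chain and the empirical acceptance rate.
    """
    rng = random.Random(seed)
    x = [0.0] * dim
    ll_x = log_likelihood(x)
    shrink = math.sqrt(1.0 - beta ** 2)
    chain = []
    accepted = 0
    for _ in range(n_iter):
        y = [shrink * xi + beta * rng.gauss(0.0, 1.0) for xi in x]
        ll_y = log_likelihood(y)
        # The Gaussian terms cancel between target and proposal,
        # leaving only the likelihood ratio in the acceptance test.
        if math.log(1.0 - rng.random()) < ll_y - ll_x:
            x, ll_x = y, ll_y
            accepted += 1
        chain.append(list(x))
    return chain, accepted / n_iter
```

For instance, with `log_likelihood = lambda x: -0.5 * (x[0] - 1.0) ** 2` in dimension one, the target is a Gaussian with mean 1/2 and variance 1/2, which gives a quick sanity check on the output of the chain.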

Unfortunately, due to very limited flight options (after one week of traveling around the UK) and being slightly worried at the idea of missing my flight (!), I had to leave the meeting along with all my French colleagues right after Jean-Michel Marin’s talk on (hidden) Potts driven mixtures, explaining the computational difficulties in deriving marginal likelihoods. I thus missed the final talk of the workshop by Gareth Tribello, as well as delivering my final remarks at the lunch break.

Overall, when reflecting on those two Monte Carlo workshops, I feel I preferred the pace of the Bristol workshop, because it allowed for more interactions between the participants by scheduling fewer talks… This being said, the organization at ICMS was superb (as usual!) and the talks were uniformly very good so it also was a very profitable meeting, of a different kind! As written earlier, among other things, it induced (in me) some reflections on a possible new research topic with friends there. Looking forward to visiting Scotland again, of course!

Riemann, &tc.

Posted in Statistics, University life on November 4, 2010 by xi'an

The discussions of members of my group at CREST [on the Read Paper by Girolami and Calderhead] have been collated and deposited on arXiv. This follows a set of discussions by Luke Bornn, Julien Cornebise and Gareth Peters posted a few days ago.

Pre-ordinary meeting

Posted in R, Statistics, Travel, University life on October 8, 2010 by xi'an

Those are the slides for the (basic) introduction of the paper by Mark Girolami and Ben Calderhead at the RSS next week. Not to be confused with my comments on the paper.

Riemann, Langevin & Hamilton

Posted in Statistics, University life on September 27, 2010 by xi'an

In preparation for the Read Paper session next month at the RSS, our research group at CREST has collectively read the Girolami and Calderhead paper on Riemann manifold Langevin and Hamiltonian Monte Carlo methods and I hope we will again produce a joint arXiv preprint out of our comments. (The above picture is reproduced from Radford Neal’s talk at JSM 1999 in Baltimore, talk that I remember attending…) Although this only represents my preliminary/early impression on the paper, I have trouble with the Physics connection. Because it involves continuous time events that are not transcribed directly into the simulation process.

Overall, trying to take advantage of second order properties of the target—just like the Langevin improvement takes advantage of the first order—is a natural idea which, when implementable, can obviously speed up convergence. This is the Langevin part, which may use a fixed metric M or a local metric defining a Riemann manifold, G(θ). So far, so good, assuming the derivation of an observed or expected information G(θ) is feasible up to some approximation level. The Hamiltonian part that confuses me introduces a dynamic on level sets of

\mathscr{H}(\theta,\mathbf{p})=-\mathcal{L}(\theta)+\frac{1}{2}\log\{(2\pi)^D|\mathbf{G}(\theta)|\}+\frac{1}{2}\mathbf{p}^\text{T}\mathbf{G}(\theta)^{-1}\mathbf{p}

where p is an auxiliary vector of dimension D. Namely,

\dot{\theta} = \dfrac{\partial \mathscr{H}}{\partial \mathbf{p}}(\theta,\mathbf{p})\,,\qquad\dot{\mathbf{p}}=-\dfrac{\partial \mathscr{H}}{\partial \theta}(\theta,\mathbf{p})\,.

While I understand the purpose of the auxiliary vector, namely to speed up the exploration of the posterior surface by taking advantage of the additional energy provided by p, I fail to understand why the fact that the discretised (Euler) approximation to Hamilton’s equations is not available in closed form is such an issue…. The fact that the (deterministic?) leapfrog integrator is not exact should not matter since this can be corrected by a Metropolis-Hastings step.
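To illustrate the last point, here is a minimal sketch of a leapfrog integrator followed by a Metropolis-Hastings correction. It deliberately swaps the position-dependent metric G(θ) for the identity matrix (a simplifying assumption of mine), in which case the Hamiltonian is minus the log-target plus half the squared norm of p and the leapfrog steps are explicit. The function names and tuning parameters are hypothetical.

```python
import math
import random


def leapfrog(theta, p, grad_log_target, step, n_steps):
    """Explicit leapfrog integration of Hamilton's equations (identity metric)."""
    theta = list(theta)
    grad = grad_log_target(theta)
    # Initial half step on the momentum.
    p = [pi + 0.5 * step * gi for pi, gi in zip(p, grad)]
    for i in range(n_steps):
        # Full step on the position.
        theta = [ti + step * pi for ti, pi in zip(theta, p)]
        grad = grad_log_target(theta)
        # Full momentum step, except for a half step at the very end.
        scale = step if i < n_steps - 1 else 0.5 * step
        p = [pi + scale * gi for pi, gi in zip(p, grad)]
    return theta, p


def hmc(log_target, grad_log_target, theta0, step, n_steps, n_iter, seed=0):
    """Hamiltonian Monte Carlo with an accept/reject correction of the integrator."""
    rng = random.Random(seed)
    theta = list(theta0)
    chain = []
    for _ in range(n_iter):
        p = [rng.gauss(0.0, 1.0) for _ in theta]
        h_old = -log_target(theta) + 0.5 * sum(pi * pi for pi in p)
        theta_new, p_new = leapfrog(theta, p, grad_log_target, step, n_steps)
        h_new = -log_target(theta_new) + 0.5 * sum(pi * pi for pi in p_new)
        # The discretisation error in H is corrected by this acceptance step.
        if math.log(1.0 - rng.random()) < h_old - h_new:
            theta = theta_new
        chain.append(list(theta))
    return chain
```

The leapfrog map is deterministic, volume preserving and reversible (running it again after flipping the sign of the momentum returns to the starting point), which is what makes the acceptance probability depend only on the change in the Hamiltonian.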

While the logistic example is mostly a toy problem (where importance sampling works extremely well, as shown in our survey with Jean-Michel Marin), the stochastic volatility is more challenging and the fact that the Hamiltonian scheme applies to the missing data (volatility) as well as to the three parameters of the model is quite interesting. I however wonder at the appeal of this involved scheme when considering that the full conditional of the volatility can be simulated exactly
