## Langevin on a wrong bend

Posted in Books, Statistics with tags , , , , , , , on October 19, 2017 by xi'an

Arnak Dalayan and Avetik Karagulyan (CREST) arXived a paper the other week on a focussed study of the Langevin algorithm [not MALA] when the gradient of the target is incorrect. With the following improvements [quoting non-verbatim from the paper]:

1. a varying-step Langevin that reduces the number of iterations for a given Wasserstein precision, compared with recent results by e.g. Alan Durmus and Éric Moulines;
2. an extension of convergence results for error-prone evaluations of the gradient of the target (i.e., the gradient is replaced with a noisy version, under some moment assumptions that do not include unbiasedness);
3. a new second-order sampling algorithm termed LMCO’, with improved convergence properties.

What is particularly interesting to me in this setting is the use in all these papers of a discretised Langevin diffusion (a.k.a., random walk with a drift induced by the gradient of the log-target) without the original Metropolis correction. The results rely on an assumption of [strong?] log-concavity of the target, with “user-friendly” bounds on the Wasserstein distance depending on the constants appearing in this log-concavity constraint. And so does the adaptive step. (In the case of the noisy version, the bias and variance of the noise also matter. As pointed out by the authors, there is still applicability to scaling MCMC for large samples. Beyond pseudo-marginal situations.)

“…this, at first sight very disappointing behavior of the LMC algorithm is, in fact, continuously connected to the exponential convergence of the gradient descent.”

The paper concludes with an interesting mise en parallèle of Langevin algorithms and of gradient descent algorithms, since the convergence rates are the same.

## accelerating Metropolis-Hastings algorithms by delayed acceptance

Posted in Books, Statistics, University life with tags , , , , , , , , on March 5, 2015 by xi'an

Marco Banterle, Clara Grazian, Anthony Lee, and myself just arXived our paper “Accelerating Metropolis-Hastings algorithms by delayed acceptance“, which is an major revision and upgrade of our “Delayed acceptance with prefetching” paper of last June. Paper that we submitted at the last minute to NIPS, but which did not get accepted. The difference with this earlier version is the inclusion of convergence results, in particular that, while the original Metropolis-Hastings algorithm dominates the delayed version in Peskun ordering, the later can improve upon the original for an appropriate choice of the early stage acceptance step. We thus included a new section on optimising the design of the delayed step, by picking the optimal scaling à la Roberts, Gelman and Gilks (1997) in the first step and by proposing a ranking of the factors in the Metropolis-Hastings acceptance ratio that speeds up the algorithm.  The algorithm thus got adaptive. Compared with the earlier version, we have not pursued the second thread of prefetching as much, simply mentioning that prefetching and delayed acceptance could be merged. We have also included a section on the alternative suggested by Philip Nutzman on the ‘Og of using a growing ratio rather than individual terms, the advantage being the probability of acceptance stabilising when the number of terms grows, with the drawback being that expensive terms are not always computed last. In addition to our logistic and mixture examples, we also study in this version the MALA algorithm, since we can postpone computing the ratio of the proposals till the second step. The gain observed in one experiment is of the order of a ten-fold higher efficiency. By comparison, and in answer to one comment on Andrew’s blog, we did not cover the HMC algorithm, since the preliminary acceptance step would require the construction of a proxy to the acceptance ratio, in order to avoid computing a costly number of derivatives in the discretised Hamiltonian integration.

## high-dimensional stochastic simulation and optimisation in image processing [day #3]

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , on August 31, 2014 by xi'an

Last and maybe most exciting day of the “High-dimensional Stochastic Simulation and Optimisation in Image Processing” in Bristol as it was exclusively about simulation (MCMC) methods. Except my own talk on ABC. And Peter Green’s on consistency of Bayesian inference in non-regular models. The talks today were indeed about using convex optimisation devices to speed up MCMC algorithms with tools that were entirely new to me, like the Moreau transform discussed by Marcelo Pereyra. Or using auxiliary variables à la RJMCMC to bypass expensive Choleski decompositions. Or optimisation steps from one dual space to the original space for the same reason. Or using pseudo-gradients on partly differentiable functions in the talk by Sylvain Lecorff on a paper commented earlier in the ‘Og. I particularly liked the notion of Moreau regularisation that leads to more efficient Langevin algorithms when the target is not regular enough. Actually, the discretised diffusion itself may be geometrically ergodic without the corrective step of the Metropolis-Hastings acceptance. This obviously begs the question of an extension to Hamiltonian Monte Carlo. And to multimodal targets, possibly requiring as many normalisation factors as there are modes. So, in fine, a highly informative workshop, with the perfect size and the perfect crowd (which happened to be predominantly French, albeit from a community I did not have the opportunity to practice previously). Massive kudos to Marcello for putting this workshop together, esp. on a week where family major happy events should have kept him at home!

As the workshop ended up in mid-afternoon, I had plenty of time for a long run with Florence Forbes down to the Avon river and back up among the deers of Ashton Court, avoiding most of the rain, all of the mountain bikes on a bike trail that sounded like trail running practice, and building enough of an appetite for the South Indian cooking of the nearby Thali Café. Brilliant!

## high-dimensional stochastic simulation and optimisation in image processing [day #2]

Posted in pictures, Statistics, Travel, Uncategorized, University life, Wines with tags , , , , , , on August 30, 2014 by xi'an

After a nice morning run down Leigh Woods and on the muddy banks of the Avon river, I attended a morning session on hyperspectral image non-linear modelling. Topic about which I knew nothing beforehand. Hyperspectral images are 3-D images made of several wavelengths to improve their classification as a mixture of several elements. The non-linearity is due to the multiple reflections from the ground as well as imperfections in the data collection. I found this new setting of clear interest, from using mixtures to exploring Gaussian processes and Hamiltonian Monte Carlo techniques on constrained spaces… Not to mention the “debate” about using Bayesian inference versus optimisation. It was overall a day of discovery as I am unaware of the image processing community (being the outlier in this workshop!) and of their techniques. The problems mostly qualify as partly linear high-dimension inverse problems, with rather standard if sometimes hybrid MCMC solutions. (The day ended even more nicely with another long run in the fields of Ashton Court and a conference diner by the river…)

## high-dimensional stochastic simulation and optimisation in image processing [day #1]

Posted in pictures, Statistics, Travel, Uncategorized, University life, Wines with tags , , , , , , , , , , , on August 29, 2014 by xi'an

Even though I flew through Birmingham (and had to endure the fundamental randomness of trains in Britain), I managed to reach the “High-dimensional Stochastic Simulation and Optimisation in Image Processing” conference location (in Goldney Hall Orangery) in due time to attend the (second) talk by Christophe Andrieu. He started with an explanation of the notion of controlled Markov chain, which reminded me of our early and famous-if-unpublished paper on controlled MCMC. (The label “controlled” was inspired by Peter Green who pointed out to us the different meanings of controlled in French [meaning checked or monitored] and in English . We use it here in the English sense, obviously.) The main focus of the talk was on the stability of controlled Markov chains. With of course connections with out controlled MCMC of old, for instance the case of the coerced acceptance probability. Which happened to be not that stable! With the central tool being Lyapounov functions. (Making me wonder whether or not it would make sense to envision the meta-problem of adaptively estimating the adequate Lyapounov function from the MCMC outcome.)

As I had difficulties following the details of the convex optimisation talks in the afternoon, I eloped to work on my own and returned to the posters & wine session, where the small number of posters allowed for the proper amount of interaction with the speakers! Talking about the relevance of variational Bayes approximations and of possible tools to assess it, about the use of new metrics for MALA and of possible extensions to Hamiltonian Monte Carlo, about Bayesian modellings of fMRI and of possible applications of ABC in this framework. (No memorable wine to make the ‘Og!) Then a quick if reasonably hot curry and it was already bed-time after a rather long and well-filled day!z

## MCqMC 2014 [day #3]

Posted in pictures, Running, Statistics, Travel, University life, Wines with tags , , , , , , , , , , , , , , , , on April 10, 2014 by xi'an

As the second day at MCqMC 2014, was mostly on multi-level Monte Carlo and quasi-Monte Carlo methods, I did not attend many talks but had a long run in the countryside (even saw a pheasant and a heron), worked at “home” on pressing recruiting evaluations and had a long working session with Pierre Jacob. Plus an evening out sampling (just) a few Belgian beers in the shade of the city hall…

Today was more in my ballpark as there were MCMC talks the whole day! The plenary talk was not about MCMC as Erich Novak presented a survey on the many available results bounding the complexity of approximating an integral based on a fixed number of evaluations of the integrand, some involving the dimension (and its curse), some not, some as fast as √n and some not as fast, all this depending on the regularity and the size of the classes of integrands considered. In some cases, the solution was importance sampling, in other cases, quasi-Monte Carlo, and yet other cases were still unsolved. Then Yves Atchadé gave a new perspective on computing the asymptotic variance in the central limit theorem on Markov chains when truncating the autocovariance, Matti Vihola talked about theoretical orderings of Markov chains that transmuted into the very practical consequence that using more simulations in a pseudo-marginal likelihood approximation improved acceptance rate and asymptotic variances (and this applies to aBC-MCMC as well), Radu Craiu proposed a novel processing of adaptive MCMC by treating various approximations to the true target as food for a multiple-try Metropolis algorithm, and Luca Martino had a go at resuscitating the ARMS algorithm of Gilks and Wild (used for a while in BUGS), although the talk did not dissipate all of my misgivings about the multidimensional version! I had more difficulties following the “Warwick session” which was made of four talks by current or former students from Warwick, although I appreciated the complexity of the results in infinite dimensional settings and novel approximations to diffusion based Metropolis algorithms. No further session this afternoon as the “social” activity was to visit the nearby Stella Artois brewery! This activity made us very social, for certain, even though there was hardly a soul around in this massively automated factory. (Maybe an ‘Og post to come one of those days…)

## MCMC at ICMS (3)

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , on April 26, 2012 by xi'an

The intense pace of the two first days of our workshop on MCMC at ICMS had apparently taken an heavy toll on the participants as a part of the audience was missing this morning! Although not as a consequence of the haggis of the previous night at the conference dinner, nor even as a result of the above pace. In fact, the missing participants had opted ahead of time for leaving the workshop early, which is understandable given everyone’s busy schedule, esp. for those attending both Bristol and Edinburgh workshops, however slightly impacting the atmosphere of the final day. (Except for Mark Girolami who most unfortunately suffered such a teeth infection that he had to seek urgent medical assistance yesterday afternoon. Best wishes to Mark for a prompt recovery, say I with a dental appointment tomorrow…!)

$y=\sqrt{1-\beta^2}x_{t-1}+\beta\zeta\quad 0<\beta<1,\zeta\sim\varphi(|\zeta|)$