## irreversible Markov chains

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , , on November 20, 2018 by xi'an

Werner Krauth (ENS, Paris) was in Dauphine today to present his papers on irreversible Markov chains at the probability seminar. He went back to the 1953 Metropolis et al. paper. And mentioned a 1962 paper I had never heard of by Alder and Wainwright demonstrating phase transition can occur, via simulation. The whole talk was about simulating the stationary distribution of a large number of hard spheres on a one-dimensional ring, which made it hard for me to understand. (Maybe the triathlon before did not help.) And even to realise a part was about PDMPs… His slides included this interesting entry on factorised MCMC which reminded me of delayed acceptance and thinning and prefetching. Plus a notion of lifted Metropolis that could have applications in a general setting, if it differs from delayed rejection.

## more multiple proposal MCMC

Posted in Books, Statistics with tags , , , , , , , on July 26, 2018 by xi'an

Luo and Tjelmeland just arXived a paper on a new version of multiple-try Metropolis Hastings, the addendum being in defining the additional proposed copies via a dependence graph like (a) above, with one version from the target and all others from operational and conditional proposal kernels. Respecting the dependence graph, as in (b). As I did, you may then wonder where both the graph and the conditional do come from. Which reminds me of the pseudo-posteriors of Carlin and Chib (1995), even though this is not terribly connected. Green (1995).) (But not disconnected either since the authors mention But, given the graph, following a Gibbs scheme, one of the 17 nodes is chosen as generated from the target, using the proper conditional on that index [which is purely artificial from the point of view of the original simulation problem!]. As noted above, the graph is an issue, but since it is artificial, it can be devised to either allow for quasi-independence between the proposed values or on the opposite to induce long range dependence, which corresponds to conducting multiple MCMC steps before reaching the end nodes, a feature that is very appealing in my opinion. And reminds me of prefetching. (As I am listening to Nicolas Chopin’s lecture in Warwick at the moment, there also seems to be a connection with pMCMC.) Still, I remain unclear as to the devising of the graph of dependent proposals, as its depth should be somehow connected with the mixing properties of the original proposal. Gains in convergence may thus come at a high cost at the construction stage.

## patterns of scalable Bayesian inference

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , on February 24, 2016 by xi'an

Elaine Angelino, Matthew Johnson and Ryan Adams just arXived a massive survey of 118 pages on scalable Bayesian inference, which could have been entitled Bayes for Big Data, as this monograph covers state-of-the-art computational approaches to large and complex data structures. I did not read each and every line of it, but I have already recommended it to my PhD students. Some of its material unsurprisingly draws from the recent survey by Rémi Bardenet et al. (2015) I discussed a while ago. It also relates rather frequently to the somewhat parallel ICML paper of Korattikara et al. (2014). And to the firefly Monte Carlo procedure also discussed previously here.

Chapter 2 provides some standard background on computational techniques, Chapter 3 covers MCMC with data subsets, Chapter 4 gives some entries on MCMC with parallel and distributed architectures, Chapter 5 focus on variational solutions, and Chapter 6 is about open questions and challenges.

“Insisting on zero asymptotic bias from Monte Carlo estimates of expectations may leave us swamped in errors from high variance or transient bias.”

One central theme of the paper is the need for approximate solutions, MCMC being perceived as the exact solution. (Somewhat wrongly in the sense that the product of an MCMC is at best an empirical version of the true posterior, hence endowed with a residual and incompressible variation for a given computing budget.) While Chapter 3 stresses the issue of assessing the distance to the true posterior, it does not dwell at all on computing times and budget, which is arguably a much harder problem. Chapter 4 seems to be more aware of this issue since arguing that “a way to use parallel computing resources is to run multiple sequential MCMC algorithms at once [but that this] does not reduce the transient bias in MCMC estimates of posterior expectations” (p.54). The alternatives are to use either prefetching (which was the central theme of Elaine Angelino’s thesis), asynchronous Gibbs with the new to me (?) Hogwild Gibbs algorithms (connected in Terenin et al.’s recent paper, not quoted in the paper), some versions of consensus Monte Carlo covered in earlier posts, the missing links being in my humble opinion an assessment of the worth of those solutions (in the spirit of “here’s the solution, what was the problem again?”) and once again the computing time issue. Chapter 5 briefly discusses some recent developments in variational mean field approximations, which is farther from my interests and (limited) competence, but which appears as a particular class of approximate models and thus could (and should?) relate to likelihood-free methods. Chapter 6 about the current challenges of the field is presumably the most interesting in this monograph in that it produces open questions and suggests directions for future research. For instance, opposing the long term MCMC error with the short term transient part. Or the issue of comparing different implementations in a practical and timely perspective.

## accelerating Metropolis-Hastings algorithms by delayed acceptance

Posted in Books, Statistics, University life with tags , , , , , , , , on March 5, 2015 by xi'an

Marco Banterle, Clara Grazian, Anthony Lee, and myself just arXived our paper “Accelerating Metropolis-Hastings algorithms by delayed acceptance“, which is an major revision and upgrade of our “Delayed acceptance with prefetching” paper of last June. Paper that we submitted at the last minute to NIPS, but which did not get accepted. The difference with this earlier version is the inclusion of convergence results, in particular that, while the original Metropolis-Hastings algorithm dominates the delayed version in Peskun ordering, the later can improve upon the original for an appropriate choice of the early stage acceptance step. We thus included a new section on optimising the design of the delayed step, by picking the optimal scaling à la Roberts, Gelman and Gilks (1997) in the first step and by proposing a ranking of the factors in the Metropolis-Hastings acceptance ratio that speeds up the algorithm.  The algorithm thus got adaptive. Compared with the earlier version, we have not pursued the second thread of prefetching as much, simply mentioning that prefetching and delayed acceptance could be merged. We have also included a section on the alternative suggested by Philip Nutzman on the ‘Og of using a growing ratio rather than individual terms, the advantage being the probability of acceptance stabilising when the number of terms grows, with the drawback being that expensive terms are not always computed last. In addition to our logistic and mixture examples, we also study in this version the MALA algorithm, since we can postpone computing the ratio of the proposals till the second step. The gain observed in one experiment is of the order of a ten-fold higher efficiency. By comparison, and in answer to one comment on Andrew’s blog, we did not cover the HMC algorithm, since the preliminary acceptance step would require the construction of a proxy to the acceptance ratio, in order to avoid computing a costly number of derivatives in the discretised Hamiltonian integration.

## NIPS workshops (Dec. 12-13, 2014, Montréal)

Posted in Kids, Statistics, Travel, University life with tags , , , , , , , , on August 25, 2014 by xi'an

Following a proposal put forward by Ted Meeds, Max Welling,  Richard Wilkinson, Neil Lawrence and myself, our ABC in Montréal workshop has been accepted by the NIPS 2014 committee and will thus take place on either Friday, Dec. 11, or Saturday, Dec. 12, at the end of the main NIPS meeting (Dec. 8-10). (Despite the title, this workshop is not part of the ABC in … series I started five years ago. It will only last a single day with a few invited talks and no poster. And no free wine & cheese party.) On top of this workshop, our colleagues Vikash K Mansinghka, Daniel M Roy, Josh Tenenbaum, Thomas Dietterich, and Stuart J Russell have also been successful in their bid for the 3rd NIPS Workshop on Probabilistic Programming which will presumably be held on the opposite day to ours, as Vikash is speaking at our workshop, while I am speaking in this workshop. I am yet undecided as to whether or not to attend the main conference, given that I am already travelling a lot this semester and have to teach two courses, incl. a large undergraduate statistics inference course… Obviously, I will try to attend if our joint paper is accepted by the editorial board! Even though Marco will then be the speaker.