Archive for diffusions

MCMskv #4 [house with a vision]

Posted in Statistics on January 9, 2016 by xi'an

Last day at MCMskv! Not yet exhausted by this exciting conference, but this was the toughest day, with one more session and a tutorial by Art Owen on quasi-Monte Carlo. (Not even mentioning the night activities that I skipped. Or the ski break that I did not even consider.) Krys Łatuszyński started with a plenary on exact methods for discretised diffusions, with a foray into Bernoulli factory problems. Then came a neat session on adaptive MCMC methods, which contained a talk by Chris Sherlock on delayed acceptance, where the approximation to the target was built by knn trees. (The adaptation operated through the construction of the tree, by including additional evaluations of the target density. Another paper that has been sitting in my to-read list for too long a while: the exploitation of the observed values of π towards improving an MCMC sampler has always been “obvious” to me, even though I could not see any practical way of doing so.)
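As a rough illustration of the delayed-acceptance mechanism, here is a minimal Python sketch of the two-stage Metropolis-Hastings step, with a k-NN surrogate averaging past target evaluations standing in for the knn trees of the talk. The Gaussian target, proposal scale, and value of k are my own toy choices, not Chris Sherlock's actual settings, and the sketch glosses over the diminishing-adaptation conditions a valid adaptive version would have to satisfy.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # stand-in for an expensive log-density (a standard Gaussian here)
    return -0.5 * float(np.sum(x ** 2))

# cache of points where the expensive target has already been evaluated
cache_x, cache_f = [], []

def exact_log_target(x):
    # evaluate the true target and grow the cache (the "adaptation")
    f = log_target(x)
    cache_x.append(np.array(x))
    cache_f.append(f)
    return f

def surrogate_log_target(x, k=5):
    # cheap surrogate: average log-density over the k nearest cached points
    if len(cache_x) < k:
        return exact_log_target(x)
    d = np.linalg.norm(np.asarray(cache_x) - x, axis=1)
    return float(np.mean(np.asarray(cache_f)[np.argsort(d)[:k]]))

def delayed_acceptance_mh(x0, n_iter=2000, step=1.0):
    x = np.asarray(x0, dtype=float)
    lf_x = exact_log_target(x)
    chain = []
    for _ in range(n_iter):
        y = x + step * rng.standard_normal(x.shape)
        ls_x, ls_y = surrogate_log_target(x), surrogate_log_target(y)
        # stage 1: cheap screening with the surrogate only
        if np.log(rng.uniform()) < ls_y - ls_x:
            # stage 2: exact correction, preserving the true target
            lf_y = exact_log_target(y)
            if np.log(rng.uniform()) < (lf_y - lf_x) - (ls_y - ls_x):
                x, lf_x = y, lf_y
        chain.append(x.copy())
    return np.array(chain)

chain = delayed_acceptance_mh(np.zeros(2))
```

Rejections at the first stage cost no exact target evaluation at all, which is where the computational gain of delayed acceptance comes from.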

It was wonderful that Art Owen accepted to deliver a tutorial at MCMskv on quasi-random Monte Carlo. A great tutorial, with a neat coverage of the issues most related to Monte Carlo integration. Since quasi-random sequences have trouble with accept/reject methods, a not-even-half-baked idea that came to me during Art’s tutorial was that the increased computing power granted by qMC could lead to a generic integration of the Metropolis-Hastings step in a Rao-Blackwellised manner. Art mentioned he was hoping that in the near future one could switch between pseudo- and quasi-random numbers in an almost automated manner when running standard platforms like R. This would indeed be great, especially since quasi-random sequences seem to be available at the same cost as their pseudo-random counterparts. During the following qMC session, Art discussed the construction of optimal sequences on sets other than hypercubes (with the surprising feature that projecting optimal sequences from the hypercube does not work). Mathieu Gerber presented the quasi-random simulated annealing algorithm he developed with Luke Bornn, which I briefly discussed a while ago. Or so I thought, as I cannot trace a post on that paper! While the fact that annealing also works with quasi-random sequences is not astounding, the gain over random sequences shown on two examples is clear. The session also had a talk by Lester Mackey, who relies on Stein’s discrepancy to measure the value of an approximation to the true target. This was quite novel, with a surprising connection to Chris Oates’ talk and the use of score-based control variates, if used in a dual approach.
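To make the pseudo- versus quasi-random switch concrete, here is a minimal sketch estimating the same integral with i.i.d. uniforms and with a scrambled Sobol’ sequence via scipy.stats.qmc; the integrand, dimension, and sample size are illustrative choices of mine.

```python
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(1)

f = lambda u: np.exp(np.sum(u, axis=1))   # test integrand on [0,1]^d
d, m = 4, 12                              # dimension and 2^m sample points
exact = (np.e - 1) ** d                   # closed-form value of the integral

u_pseudo = rng.uniform(size=(2 ** m, d))                        # plain MC
u_quasi = qmc.Sobol(d, scramble=True, seed=1).random_base2(m)   # qMC

print("pseudo-random error:", abs(f(u_pseudo).mean() - exact))
print("quasi-random  error:", abs(f(u_quasi).mean() - exact))
```

On smooth integrands like this one, the scrambled Sobol’ estimate is markedly more accurate for the same number of points, which is the whole appeal of an automated switch.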

Another great session was the noisy MCMC one organised by Paul Jenkins (Warwick), with again a coherent presentation of views on the quality, or lack thereof, of noisy (or inexact) versions, with an update from Richard Everitt on inexact MCMC, Felipe Medina Aguayo (Warwick) on sufficient conditions for noisy versions to converge (and counterexamples), Jere Koskela (Warwick) on a pseudo-likelihood approach to the highly complex Kingman’s coalescent model in population genetics (of ABC fame!), and Rémi Bardenet on the tall data approximation techniques discussed in a recent post. Having seen or read most of those results previously did not diminish the appeal of the session.
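For the record, here is a hedged sketch of the pseudo-marginal construction that underlies much of this session: the target enters the Metropolis-Hastings ratio only through an unbiased, noisy estimate, which is recycled rather than refreshed at the current value. The toy target and the lognormal noise are mine; refreshing the estimate at every iteration would instead produce the inexact “noisy” chains whose convergence conditions were discussed in the session.

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_log_target(theta, noise_sd=0.5):
    # unbiased estimate of pi(theta): pi(theta) times a mean-one lognormal
    # weight, returned on the log scale
    log_pi = -0.5 * theta ** 2   # toy target: standard Gaussian
    return log_pi + noise_sd * rng.standard_normal() - 0.5 * noise_sd ** 2

def pseudo_marginal_mh(n_iter=10000, step=1.0):
    theta, log_est = 0.0, noisy_log_target(0.0)
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal()
        log_est_prop = noisy_log_target(prop)
        # the current estimate is deliberately NOT refreshed: this keeps
        # the marginal chain exactly targeting pi despite the noise
        if np.log(rng.uniform()) < log_est_prop - log_est:
            theta, log_est = prop, log_est_prop
        chain.append(theta)
    return np.array(chain)

chain = pseudo_marginal_mh()
```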

MCMC at ICMS (2)

Posted in Kids, pictures, Statistics, Travel, University life on April 25, 2012 by xi'an

The second day of our workshop on computational statistics at the ICMS started with a terrific talk by Xiao-Li Meng. Although this talk related to his Inception talk in Paris last summer, and to the JCGS discussion paper, he brought new geometric aspects to the phenomenon (managing a zero correlation, and hence i.i.d.-ness, in the simulation of a Gaussian random effect posterior distribution). While I was reflecting on the difficulty of extending the perspective beyond normal models, he introduced a probit example where exact null correlation cannot be found, but where an adaptive scheme allows the range of correlation coefficients to be explored. This somehow made me think of a possible version of this approach from a tempering perspective, where different data augmentation schemes would be merged into an “optimal” geometric mixture, rather than via interweaving.
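As a reminder of the mechanism at stake, here is a toy Python rendition of the interweaving idea on the simplest Gaussian random-effect model, alternating between the centred and non-centred augmentations within each sweep; the model and numerical values are stand-ins of mine, not Xiao-Li’s actual example.

```python
import numpy as np

rng = np.random.default_rng(3)
y, V, n_iter = 1.5, 10.0, 5000   # datum, random-effect variance, sweeps

def interweaving_gibbs():
    # model: y | x ~ N(x, 1), x | theta ~ N(theta, V), flat prior on theta
    theta = 0.0
    draws = np.empty(n_iter)
    for t in range(n_iter):
        # centred step: x | theta, y
        prec = 1.0 + 1.0 / V
        x = rng.normal((y + theta / V) / prec, np.sqrt(1.0 / prec))
        # centred update: theta | x ~ N(x, V)
        theta = rng.normal(x, np.sqrt(V))
        # interweave via the non-centred xt = x - theta, under which
        # theta | xt, y ~ N(y - xt, 1)
        xt = x - theta
        theta = rng.normal(y - xt, 1.0)
        draws[t] = theta
    return draws

draws = interweaving_gibbs()
# sanity check against the exact posterior theta | y ~ N(y, 1 + V)
print(draws.mean(), draws.var(), "vs", y, 1 + V)
```

In this Gaussian toy, the interwoven sweep produces exactly independent draws from the N(y, 1+V) posterior, the zero-correlation phenomenon mentioned above, while each augmentation on its own mixes slowly for small (centred) or large (non-centred) values of V.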

As an aside, Xiao-Li mentioned the ideas of Bayesian sufficiency and Bayesian ancillarity in the construction of his data augmentation schemes. He then concluded that sufficiency is identical in the classical and Bayesian approaches, while ancillarity can be defined in several ways. I have already posted on that, but it seems to me that sufficiency is a weaker notion in the Bayesian perspective, in the sense that all that matters is that the posterior is the same given the observation y and given the observed statistic, rather than uniformly over all possible values of the random variable Y, as in the classical sense. As for ancillarity, it is also natural to consider that an ancillary statistic does not bring information on the parameter, i.e. that the prior and the posterior distributions are the same given the observed ancillary statistic. Going further, to define ancillarity as posterior independence between “true” parameters and auxiliary variables, as Xiao-Li suggested, does not seem very sound, as it leads to the paradoxes Basu liked so much!
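In symbols, the two Bayesian notions discussed above can be written as follows (my own formalisation of the argument):

```latex
% T is Bayes-sufficient when conditioning on it leaves the posterior unchanged:
\pi(\theta \mid y) = \pi(\theta \mid T(y)) \quad \text{for (almost) every } y,
% while A is Bayes-ancillary when conditioning on it leaves the prior unchanged:
\pi(\theta \mid A(y)) = \pi(\theta).
```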

Today, the overlap with the previous meetings in Bristol and in Banff was again limited: Arnaud Doucet rewrote his talk to be less technical, which means I got the idea much more clearly than last week. The idea of having a sequence of pseudo-parameters with the same pseudo-prior seems to open a wide range of possible adaptive schemes. Faming Liang also gave a talk fairly similar to the one he presented in Banff. And David van Dyk as well, which led me to think anew about collapsed Gibbs samplers, in connection with ABC and a project I just started here in Edinburgh.

Otherwise, the intense schedule of the day saw us through eleven talks. Daniele Imparato called for distributions (in the physics or Laurent Schwartz meaning of the term!) to decrease the variance of Monte Carlo estimations, an approach I hope to look into further, as Schwartz’s book is the first math book I ever bought!, an investment I once tried to capitalise on by writing a paper mixing James-Stein estimation and distributions for generalised integration by parts, a paper that was repeatedly rejected until I gave up! Jim Griffin showed us improvements brought to the exploration of large numbers of potential covariates in linear and generalised linear models. Natesh Pillai tried to drag us through several of his papers on covariance matrix estimation, although I fear he lost me along the way! Let me perversely blame the schedule (rather than an early rise to run around Arthur’s Seat!) for falling asleep during Alex Beskos’ talk on Hamiltonian MCMC for diffusions, even though I was looking forward to this talk. (Apologies to Alex!) Then Simon Byrne gave us a quick tour of differential geometry in connection with orthogonalisation for Hamiltonian MCMC. Which brought me back very briefly to the early days when I was still considering starting a PhD in differential geometry, and then even more briefly played with the idea of mixing differential geometry and statistics à la Shun’ichi Amari… Ian Murray and Simo Sarkka completed the day with, respectively, a cartoonesque talk on latent Gaussians that connected well with Xiao-Li’s, and a talk on Gaussian approximations to diffusions with unknown parameters, which kept within the main theme of the conference, namely inference on partly observed diffusions.
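Staying within the conference theme, here is a hedged sketch of the Gaussian (Euler-Maruyama) approximation behind likelihood inference for a discretely observed diffusion, on an Ornstein-Uhlenbeck toy model of my own choosing rather than the construction presented in the talk.

```python
import numpy as np

rng = np.random.default_rng(4)
theta_true, sigma, dt, n = 1.0, 0.5, 0.1, 2000

# simulate discrete observations of dX = -theta X dt + sigma dW (Euler scheme)
x = np.empty(n)
x[0] = 0.0
for t in range(n - 1):
    x[t + 1] = x[t] - theta_true * x[t] * dt \
               + sigma * np.sqrt(dt) * rng.standard_normal()

def euler_log_lik(theta):
    # Gaussian transition approximation:
    # X_{t+1} | X_t ~ N(X_t - theta X_t dt, sigma^2 dt)
    mean = x[:-1] - theta * x[:-1] * dt
    return np.sum(-0.5 * (x[1:] - mean) ** 2 / (sigma ** 2 * dt))

# crude grid maximisation of the resulting pseudo-likelihood
grid = np.linspace(0.1, 3.0, 300)
print("theta-hat:", grid[np.argmax([euler_log_lik(g) for g in grid])])
```

The same Gaussian transition approximation is what lets Kalman-type filters handle partly observed diffusions, at the cost of a discretisation bias that shrinks with dt.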

As written above, this was too intense a day, with hardly any free time to discuss the talks or the ongoing projects, which makes me prefer the pace adopted in Bristol or in Banff. (Having to meet a local student on leave from Dauphine for a year here did not help, of course!)