Archive for ICMS
The intense pace of the first two days of our workshop on MCMC at ICMS had apparently taken a heavy toll on the participants, as part of the audience was missing this morning! Not as a consequence of the haggis at the previous night’s conference dinner, nor even of the above pace: the missing participants had opted ahead of time to leave the workshop early, which is understandable given everyone’s busy schedule, esp. for those attending both the Bristol and Edinburgh workshops, although it slightly dampened the atmosphere of the final day. (Except for Mark Girolami, who most unfortunately suffered such a tooth infection that he had to seek urgent medical assistance yesterday afternoon. Best wishes to Mark for a prompt recovery, say I, with a dental appointment tomorrow…!)
The plenary talk of the day was delivered by Heikki Haario, who provided us with a survey of the (adaptive) MCMC advances he and his collaborators have made in the analysis of complex and immensely high-dimensional weather models. This group of Finnish researchers, who started from inverse-problem analysis rather than from MCMC, has had a major impact on the design and validation of adaptive MCMC algorithms, especially since the late 1990s. (Heikki was also a co-organizer of the Adap’ski workshops, workshops that may be continued, stay tuned!) The next talk, by Marko Laine, was also about adaptive MCMC algorithms, with the difference that the application was climate modelling. It contained interesting directions about the early stopping (“early rejection”, as opposed to “delayed rejection”) of diverging proposals (saving 80% in computing time!) and about parallel adaptation. Still in the same theme, Gersende Fort explained the adaptive version of the equi-energy sampler she and her co-authors had recently developed. Although she had briefly presented this paper in Banff a month ago, I found the talk quite informative about the implementation of the method and at the perfect technical level (for me!).
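As I understand the “early rejection” idea, when the log-target is a sum of non-positive contributions (e.g., minus squared residuals over a long series of observations), the uniform variate of the Metropolis test can be drawn first and the running sum abandoned as soon as it falls below the acceptance threshold. A minimal toy sketch of my own, not the authors’ code:

```python
import math
import random

def early_rejection_step(x, log_terms, scale, rng):
    """One random-walk Metropolis step with early rejection.

    log_terms(x) must return non-positive log-density contributions
    (e.g. -0.5 * residual**2 terms), so partial sums only decrease
    and the test can be settled before the full sum is computed.
    """
    x_new = x + rng.gauss(0.0, scale)
    log_u = math.log(rng.random())          # draw the uniform *first*
    threshold = log_u + sum(log_terms(x))   # current log-density, could be cached
    partial = 0.0
    for term in log_terms(x_new):
        partial += term
        if partial < threshold:             # already below: no need to finish
            return x                        # early rejection
    return x_new                            # full sum cleared the threshold
```

Since the partial sums only decrease, stopping early gives exactly the same accept/reject decision as the full evaluation, hence no bias, only saved computations.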
In [what I now perceive as] another recurrent theme of the workshop, namely the recourse to Gaussian structures like Gaussian processes (see, e.g., Ian Murray’s talk yesterday), Andrew Stuart gave us a light introduction to random walk Metropolis-Hastings algorithms on Hilbert spaces. In particular, he connected with Ian Murray’s talk of yesterday in defining a “new” random walk (due to Radford Neal) whose proposal still preserves the acceptance probability of the original (“old”) random walk proposal. The final talks of the morning were Krys Latuszynski’s and Nick Whiteley’s very pedagogical presentations of the convergence properties of manifold MALA and of particle filters for hidden Markov models. In both cases, the speakers avoided overly technical details and provided clear intuition for the presented results, a great feat after those three intense days of talks! (Having attended Nick’s talk in Paris two weeks ago helped, of course.)
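If I got it right, the “new” random walk in question is the preconditioned Crank–Nicolson proposal: for a target absolutely continuous with respect to a Gaussian reference measure N(0,C), with density proportional to exp(−Φ(x)), it reads

```latex
x' = \sqrt{1-\beta^2}\,x + \beta\,\xi, \qquad \xi \sim N(0,C),
\qquad
\alpha(x,x') = \min\bigl\{1,\ \exp\bigl(\Phi(x)-\Phi(x')\bigr)\bigr\},
```

and since this proposal leaves N(0,C) invariant, the Gaussian part cancels from the acceptance ratio, which therefore remains well-defined (and dimension-free) on the Hilbert space, unlike with the plain random walk x' = x + βξ.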
Unfortunately, due to very limited flight options (after one week of traveling around the UK), and also being slightly worried at the idea of missing my flight, I had to leave the meeting along with all my French colleagues right after Jean-Michel Marin’s talk on (hidden) Potts-driven mixtures, which explained the computational difficulties in deriving marginal likelihoods. I thus missed the final talk of the workshop, by Gareth Tribello, and instead delivered my final remarks at the lunch break.
Overall, when reflecting on those two Monte Carlo workshops, I feel I preferred the pace of the Bristol workshop, because it allowed for more interactions between the participants by scheduling fewer talks… This being said, the organization at ICMS was superb (as usual!) and the talks were uniformly very good, so it also was a very profitable meeting, of a different kind! As written earlier, among other things, it induced (in me) some reflections on a possible new research topic with friends there. Looking forward to visiting Scotland again, of course!
The second day of our workshop on computational statistics at the ICMS started with a terrific talk by Xiao-Li Meng. Although this talk related to his Inception talk in Paris last summer, and to the JCGS discussion paper, he brought new geometric aspects to the phenomenon (managing a zero correlation, and hence i.i.d.-ness, in the simulation of a Gaussian random effect posterior distribution). While I was reflecting on the difficulty of extending the perspective beyond normal models, he introduced a probit example where exact null correlation cannot be found but an adaptive scheme allows one to explore the range of correlation coefficients. This somehow made me think of a possible version of this approach in a tempering perspective, where different data augmentation schemes would be merged into an “optimal” geometric mixture, rather than via interweaving.
As an aside, Xiao-Li mentioned the ideas of Bayesian sufficiency and Bayesian ancillarity in the construction of his data augmentation schemes. He then concluded that sufficiency is identical in the classical and Bayesian approaches, while ancillarity can be defined in several ways. I have already posted on that, but it seems to me that sufficiency is a weaker notion in the Bayesian perspective, in the sense that all that matters is that the posterior is the same given the observation y and given the observed statistic, rather than uniformly over all possible values of the random variable Y as in the classical sense. As for ancillarity, it is also natural to consider that an ancillary statistic brings no information on the parameter, i.e. that the prior and the posterior distributions are the same given the observed ancillary statistic. Going further and defining ancillarity as posterior independence between “true” parameters and auxiliary variables, as Xiao-Li suggested, does not seem very sound, as it leads to the paradoxes Basu liked so much!
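To fix ideas on the distinction I am drawing, with T(y) a statistic and a(y) a candidate ancillary statistic:

```latex
% classical sufficiency, uniform over all realisations of Y:
p(y \mid T(y), \theta)\ \text{does not depend on}\ \theta \quad \text{for all } y;
% Bayesian sufficiency, only at the observed y:
\pi(\theta \mid y) = \pi(\theta \mid T(y));
% Bayesian ancillarity, in the sense suggested above:
\pi(\theta \mid a(y)) = \pi(\theta).
```

The middle requirement only involves the observed y, which is why it is weaker than its classical counterpart.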
Today, the overlap with the previous meetings in Bristol and in Banff was again limited: Arnaud Doucet rewrote his talk to be less technical, which means I got the idea much more clearly than last week. The idea of having a sequence of pseudo-parameters with the same pseudo-prior seems to open a wide range of possible adaptive schemes. Faming Liang also gave a talk fairly similar to the one he presented in Banff. And David van Dyk as well, which led me to think anew about collapsed Gibbs samplers in connection with ABC and a project I just started here in Edinburgh.
Otherwise, the intense schedule of the day saw us through eleven talks. Daniele Impartato called for distributions (in the physics, or Laurent Schwartz, meaning of the term!) to decrease the variance of Monte Carlo estimations, an approach I hope to look into further, as Schwartz’s book is the first math book I ever bought!, an investment I once tried to capitalize on by writing a paper mixing James-Stein estimation and distributions for generalised integration by parts, a paper that was repeatedly rejected until I gave up! Jim Griffin showed us improvements brought to the exploration of large numbers of potential covariates in linear and generalised linear models. Natesh Pillai tried to drag us through several of his papers on covariance matrix estimation, although I fear he lost me along the way! Let me perversely blame the schedule (rather than an early rise to run around Arthur’s Seat!) for falling asleep during Alex Beskos’ talk on Hamiltonian MCMC for diffusions, even though I was looking forward to this talk. (Apologies to Alex!) Then Simon Byrne gave us a quick tour of differential geometry in connection with orthogonalization for Hamiltonian MCMC. Which brought me back, very briefly, to the early time when I was still considering starting a PhD in differential geometry, and then, even more briefly, played with the idea of mixing differential geometry and statistics à la Shun’ichi Amari… Ian Murray and Simo Sarkka completed the day with a cartoonesque talk on latent Gaussians that connected well with Xiao-Li’s, and a talk on Gaussian approximations to diffusions with unknown parameters, which kept within the main theme of the conference, namely inference on partly observed diffusions.
As written above, this was too intense a day, with hardly any free time to discuss the talks or the ongoing projects, which makes me prefer the pace adopted in Bristol or in Banff. (Having to meet a local student on leave from Dauphine for a year here did not help, of course!)
So this is the second meeting on computational statistics in a row for me and several other participants! Now in Edinburgh, in the terrific location of the ICMS, near the University of Edinburgh. The overlap with the previous meeting in Bristol is actually very limited: yesterday I only saw one talk by Nial Friel I had already heard in Bristol (plus one from Jim Hobert he had delivered in Banff!). And of course most of the participants were not in Bristol, so they got the most from these talks. The day went on quite smoothly and quickly, despite the tight schedule, and we managed to keep to this schedule within the five-minute confidence band… Gareth Roberts gave the first talk of the day, an overview of convergence speeds for Gibbs samplers that insisted on the importance of the decomposition of the model into hierarchical components, in connection with one of Xiao-Li Meng’s favourite themes of getting different convergence behaviours for different conditional decompositions. Christophe Andrieu and Jim Hobert kept to the same theme of convergence properties of MCMC samplers, Christophe developing recent work on using two Lyapunov control functions to assess adaptive MCMC. The second theme of the day was connected with normalising constants, with Yves Atchadé expanding on path sampling to construct confidence evaluations and Nial Friel comparing auxiliary variable techniques with ABC approximations. (The path sampling equality is a magical mystery to me: magical because the equality is true, mystery because the implementation depends very much on calibration choices that are both delicate and influential. Yves addressed the impact of the discretisation on the error.) Nicolas Chopin also considered approximation impacts on long-memory process estimation, where I think ABC could come in as a calibration (something we have to discuss at CREST when we are back).
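For the record, the path sampling identity in question, in the Gelman and Meng formulation with a path of unnormalised densities q_t connecting q_0 and q_1, with normalising constants z(t), is

```latex
\log \frac{z(1)}{z(0)}
  = \int_0^1 \mathbb{E}_{x \sim p_t}\!\left[\frac{\partial}{\partial t}\,\log q_t(x)\right]\mathrm{d}t,
\qquad p_t(x) = q_t(x)/z(t),
```

and the delicate calibration choices are then the path (q_t) itself, the discretisation grid in t, and the number of draws per grid point, all of which impact the error.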
Omiros Papaspiliopoulos gave his talk on the same paper Gareth presented in Banff and Bristol, but from his own perspective, which made the presentation quite worthwhile. Darren Wilkinson and Andrew Golightly talked about the complexity of conducting inference for biochemical Markov processes related to SDEs, again evaluating the impact of approximations. Andrew covered in particular a delayed rejection, or rather delayed acceptance, method where a substitute avoids computing the complex target by first rejecting the most unlikely values (with the drawback of having two acceptance steps). Maria de Iorio introduced us to metabonomics (“the latest of the omics”!) with models that relate to spectral analysis (and thus reminded me of some astronomy models) and to wavelets (for the background noise), the estimation procedure seemingly related to source separation techniques found in signal processing (?). And Dan Lawson ended the first day’s session with a fairly original presentation of the Dirichlet process in population genetics: this was the first talk I ever saw with the picture of a vomiting monster (see below for the Warhammer monster)…! During the poster session, Ian Murray provided us with a quick explanation of his work on expanding MCMC validation beyond proper random generators, a paper I wanted to discuss here. And I certainly will, now that it has been pre-processed for me!
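The two-stage idea Andrew described sounds like a delayed acceptance scheme à la Christen and Fox: a cheap surrogate of the target screens the proposal first, and the expensive target is only evaluated (with a correcting second acceptance step) for proposals that survive the screening. A rough sketch, with names of my own choosing:

```python
import math
import random

def delayed_acceptance_step(x, log_target, log_cheap, scale, rng):
    """One delayed-acceptance Metropolis step: the expensive log_target
    is only computed when the cheap surrogate accepts first."""
    x_new = x + rng.gauss(0.0, scale)
    # Stage 1: screening with the cheap surrogate only.
    if math.log(rng.random()) >= log_cheap(x_new) - log_cheap(x):
        return x  # rejected cheaply, expensive target never touched
    # Stage 2: correction by the ratio of target and surrogate ratios.
    log_alpha = (log_target(x_new) - log_target(x)) \
              - (log_cheap(x_new) - log_cheap(x))
    if math.log(rng.random()) < log_alpha:
        return x_new
    return x
```

The drawback mentioned in the talk is visible here: two uniform draws, hence two chances to reject, so the overall acceptance rate is lower than for plain Metropolis, traded against the saved target evaluations.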
Last day of a great workshop! I filled more pages of my black notebook (“bloc”) than in the past month!!! This morning started with a Hamiltonian session, Paul Fearnhead presenting recent developments in this area. I liked his coverage very much, esp. because it stayed away from the physics analogies that always put me off. The idea of moving away from the quadratic form had always seemed natural to me and provides an interesting range for investigations. (I think I rediscovered the topic during the talks, rephrasing almost the same questions as for Girolami’s and Calderhead’s Read Paper!) One thing that still intrigues me is the temporal dimension of the Hamiltonian representation. Indeed, it is “free” in the sense that the simulation problem does not depend on the time the pair (x,p) is moved along the equipotential curve. (In practice, there is a cost in running this move because it needs to be discretised.) But there is no clear target function to set the time “right”. The only scale I can think of is when the pair comes back to its starting point. Which is less silly than it sounds, because the discretisation means that all intermediate points can be used, as suggested by Paul via a multiple-try scheme. Mark then presented an application of Hamiltonian ideas and schemes to biochemical dynamics, with the supplementary trick of linearisation. Christian Lorenz Müller gave an ambitious grand tour of gradient-free optimisation techniques that sounded appealing from a simulation perspective (but would require a few more hours to apprehend!). Geoff Nicholls presented ongoing research on approximating Metropolis-Hastings acceptance probabilities in a more general perspective than à la Andrieu-Robert, i.e. accepting some amount of bias, an idea he had explained to me when I visited Oxford. And Pierre Jacob concluded the meeting in the right tone with a pot-pourri of his papers on Wang-Landau.
(Once again a talk I had already heard but that helped me make more sense of a complex notion…)
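To make the “free time” remark above concrete, with the standard (quadratic) kinetic energy the Hamiltonian dynamics read

```latex
H(x,p) = U(x) + \tfrac{1}{2}\,p^\top M^{-1} p, \qquad U(x) = -\log \pi(x),
\qquad
\frac{\mathrm{d}x}{\mathrm{d}t} = M^{-1}p, \qquad \frac{\mathrm{d}p}{\mathrm{d}t} = -\nabla U(x),
```

and H is conserved along the exact flow, so any integration time T yields a valid measure-preserving move: nothing in the stationarity argument privileges one T over another, which is exactly why T has to be set by other considerations.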
Overall, and talk-by-talk, a truly exceptional meeting. Which also sets the bar quite high for us to compete at the ICMS meeting on advances in MCMC next Monday! Esp. when a portion of the audience in Bristol will appear in Edinburgh as well! In the meanwhile, I have to rewrite my talk for the seminar in Glasgow tomorrow in order to remove the overlap with my talk there last year… (I note that I have just managed to fly to Scotland with no lost bag, a true achievement!)