Chris Oates, Theodore Papamarkou, and Mark Girolami (all from the University of Warwick) just arXived a paper on a new form of thermodynamic integration for computing marginal likelihoods. (I had actually discussed this paper with the authors on a few occasions when visiting Warwick.) Thermodynamic integration is also known as path sampling (Gelman and Meng, 1998). In the current paper, the path goes from the prior to the posterior through a sequence of intermediary distributions obtained by raising the likelihood to a power. While path sampling is quite an efficient method, the authors propose to improve it by resorting to control variates, in order to decrease the variance. The control variate is taken from Mira et al. (2013), namely a one-dimensional temperature-dependent transform of the score function. (Strictly speaking, this is an asymptotic control variate in that the mean is only asymptotically zero.) This control variate is then incorporated within the expectation inside the path sampling integral. Its arbitrary elements are then calibrated against the variance of the path sampling integral, except for the temperature ladder, where the authors use a standard geometric rate, as the approach does not account for Monte Carlo and quadrature errors. (The degree of the polynomials used in the control variates is also arbitrarily set.) Interestingly, the paper mixes a lot of recent advances, from the zero variance notion of Mira et al. (2013) to the manifold Metropolis-adjusted Langevin algorithm of Girolami and Calderhead (2011), and uses pMCMC (Jasra et al., 2007) as a base method. The examples processed in the paper are regression (where the controlled version truly has a zero variance!) and logistic regression (with the benchmark Pima Indian dataset), with a counter-example of a PDE interestingly proposed in the discussion section. I quite agree with the authors that the method is difficult to envision in complex enough models.
I also did not see any mention therein of the extra computing time involved in using this control variate idea.
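To make the above more concrete, here is a toy sketch of mine (not the authors' code) of thermodynamic integration with and without control variates. Everything in it is an assumption chosen for illustration: a Normal mean model with known variance and a Normal prior, so that each power posterior can be simulated exactly rather than by MCMC; a power-law temperature ladder t_i = (i/N)^5 with trapezoidal quadrature; and two control variates built from the score of the power posterior, which is my reading of what a degree-two polynomial gives in the zero variance construction of Mira et al. (2013). All function names are hypothetical.

```python
import math
import random


def log_lik(theta, y, sigma):
    """Log-likelihood of y_i ~ N(theta, sigma^2), sigma known."""
    sse = sum((yi - theta) ** 2 for yi in y)
    return -0.5 * len(y) * math.log(2 * math.pi * sigma**2) - 0.5 * sse / sigma**2


def power_posterior(t, y, sigma, tau):
    """Mean and variance of p_t(theta), proportional to p(y|theta)^t N(theta; 0, tau^2)."""
    prec = t * len(y) / sigma**2 + 1.0 / tau**2
    return (t * sum(y) / sigma**2) / prec, 1.0 / prec


def cov(a, b):
    """Empirical covariance of two equal-length samples."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (z - mb) for x, z in zip(a, b)) / len(a)


def expected_log_lik(t, y, sigma, tau, m, rng, controlled=False):
    """Monte Carlo estimate of E_t[log p(y|theta)] under the power posterior."""
    mean, var = power_posterior(t, y, sigma, tau)
    # Assumption: exact Gaussian draws stand in for an MCMC sampler.
    thetas = [rng.gauss(mean, math.sqrt(var)) for _ in range(m)]
    g = [log_lik(th, y, sigma) for th in thetas]
    if not controlled:
        return sum(g) / m
    # Score of the power posterior; c1 and c2 both have mean zero under p_t.
    u = [-(th - mean) / var for th in thetas]
    c1 = u
    c2 = [1.0 + th * ui for th, ui in zip(thetas, u)]
    # Regression coefficients from the 2x2 normal equations.
    s11, s12, s22 = cov(c1, c1), cov(c1, c2), cov(c2, c2)
    r1, r2 = cov(c1, g), cov(c2, g)
    det = s11 * s22 - s12 * s12
    b1 = (r1 * s22 - r2 * s12) / det
    b2 = (r2 * s11 - r1 * s12) / det
    return sum(gi - b1 * a - b2 * b for gi, a, b in zip(g, c1, c2)) / m


def thermodynamic_integration(y, sigma, tau, n_temps=50, m=500, seed=1, controlled=False):
    """Trapezoidal estimate of log p(y) as the integral over t of E_t[log p(y|theta)]."""
    rng = random.Random(seed)
    temps = [(i / n_temps) ** 5 for i in range(n_temps + 1)]  # assumed ladder
    g = [expected_log_lik(t, y, sigma, tau, m, rng, controlled) for t in temps]
    return sum(0.5 * (temps[i + 1] - temps[i]) * (g[i + 1] + g[i]) for i in range(n_temps))


def exact_log_evidence(y, sigma, tau):
    """Closed-form log marginal likelihood of the conjugate Normal model."""
    n, s, ss = len(y), sum(y), sum(yi * yi for yi in y)
    log_det = (n - 1) * math.log(sigma**2) + math.log(sigma**2 + n * tau**2)
    quad = (ss - tau**2 * s**2 / (sigma**2 + n * tau**2)) / sigma**2
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


data_rng = random.Random(0)
y = [data_rng.gauss(1.0, 1.0) for _ in range(20)]
print("exact     :", exact_log_evidence(y, 1.0, 1.0))
print("plain TI  :", thermodynamic_integration(y, 1.0, 1.0))
print("controlled:", thermodynamic_integration(y, 1.0, 1.0, controlled=True))
```

In this Gaussian toy case the log-likelihood is a quadratic in the parameter, hence lies exactly in the span of the constant and the two control variates, so the controlled estimate at each temperature carries no Monte Carlo error and only the quadrature error of the ladder remains. Outside such conjugate settings the coefficients are estimated from the same draws and the variance reduction is only partial.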
First day at AISTATS 2014! After three days of Icelandic vacation spent driving (a lot) and hiking (too little) around South and West Iceland, I joined close to 300 attendees for this edition of the AISTATS conference series. I was quite happy to be there, if only because I had missed the conference last year (in Phoenix) and did not want this to become a tradition… Also, the mix of statistics, artificial intelligence and machine learning that characterises this conference is quite exciting, if challenging at times. What I most appreciated in this discovery of the conference is the central importance of the poster session, most talks being actually introductions to or oral presentations of posters! I find this feature terrific enough (is there such a notion as “terrific enough”?!) to be worth adopting in future conferences I am involved in. I just wish I had managed to tour the whole collection of posters today… The (first and) plenary lecture was delivered by Peter Bühlmann, who spoke about a compelling if unusual (for me) version of causal inference. This was followed by sessions on Gaussian processes, graphical models, and mixed data sources. One highlight talk was the one by Marc Deisenroth, who showed impressively fast robotic learning based on Gaussian processes. At the end of this full day, I also attended an Amazon mixer where I learned about Amazon’s entry on the local market, where it seems the company is getting a better picture of the current and future state of the U.S. economy than governmental services, thanks to a very fine analysis of the sales and entries on Amazon’s entry. Then it was time to bike “home” on my rental bike, in the setting sun…
- Dan Simpson replied to my comments of last Tuesday about his PC construction;
- Arnaud Doucet spelled out some issues about his adaptive subsampling paper;
- Amandine Schreck clarified why I had missed some points in her Bayesian variable selection paper;
- Randal Douc defended the efficiency of using the Carlin and Chib (1995) method for mixture simulation.
Thanks to them for taking the time to answer my musings…
Today, I am leaving Paris for an eight-day stay in Iceland! This is quite exciting, for many reasons: first, I missed AISTATS 2013 last year as I was still in the hospital; second, I am giving a short tutorial on ABC methods which will be more like a long (two hours) talk; third, it gives me the fantastic opportunity to visit Iceland for a few days, a place that was top of my wish list of countries to visit. The weather forecast is rather bleak but I am carrying enough waterproof layers to withstand a wee bit of snow and rain… The conference proper starts next Tuesday, April 22, with the tutorials taking place next Friday, April 25. This leaves me three completely free days for exploring the area near Reykjavik.
Daniel Simpson gave a seminar at CREST yesterday on his recently arXived paper, “Penalising model component complexity: A principled, practical approach to constructing priors”, written with Thiago Martins, Andrea Riebler, Håvard Rue, and Sigrunn Sørbye. A paper that he should also have presented in Banff last month, had he not lost his passport in København airport… I have already commented at length on this exciting paper (hopefully to become a discussion paper in a top journal!), so I am just pointing out two things that came to my mind during the energetic talk delivered by Dan to our group. The first thing is that those penalised complexity (PC) priors of theirs rely on some choices in the ordering of the relevance, complexity, nuisance level, &tc. of the parameters, just like reference priors. While Dan already wrote a paper on Russian roulette, there is also a Russian doll principle at work behind (or within) PC priors. Each shell of the Russian doll corresponds to a further level of complexity whose order needs to be decided by the modeller… This is not very realistic in a hierarchical model with several types of parameters having only local meaning.
My second point is that the construction of those “politically correct” (PC) priors reflects another Russian doll structure, namely one of embedded models, hence would and should lead to a natural multiple testing methodology. Except that Dan rejected this notion during his talk, by being opposed to testing per se. (A good topic for one of my summer projects, if nothing more, then!)