Archive for Leuven
As mentioned earlier, this was my very first MCqMC conference and I really enjoyed it, even though (or because) there were many topics that did not fall within my areas of interest. (By comparison, WSC is a serie of conferences too remote from those areas for my taste, as I realised in Berlin where we hardly attended any talk and hardly anyone attended my session!) Here I appreciated the exposure to different mathematical visions on Monte Carlo, without being swamped by applications as at WSC… Obviously, our own Bayesian computational community was much less represented than at, say, MCMSki! Nonetheless, I learned a lot during this conference for instance from Peter Glynn‘s fantastic talk, and I came back home with new problems and useful references [as well as a two-hour delay in the train ride from Brussels]. I also obviously enjoyed the college-town atmosphere of Leuven, the many historical landmarks and the easily-found running routes out of the town. I am thus quite eager to attend the next MCqMC 2016 meeting (in Stanford, an added bonus!) and even vaguely toying with the idea of organising MCqMC 2018 in Monaco (depending on the return for ISBA 2016 and ISBA 2018). In any case, thanks to the scientific committee for the invitation to give a plenary lecture in Leuven and to the local committee for a perfect organisation of the meeting.
“At equilibrium, we thus should not expect gains of several orders of magnitude.”
As was signaled to me several times during the MCqMC conference in Leuven, Rémi Bardenet, Arnaud Doucet and Chris Holmes (all from Oxford) just wrote a short paper for the proceedings of ICML on a way to speed up Metropolis-Hastings by reducing the number of terms one computes in the likelihood ratio involved in the acceptance probability, i.e.
The observations appearing in this likelihood ratio are a random subsample from the original sample. Even though this leads to an unbiased estimator of the true log-likelihood sum, this approach is not justified on a pseudo-marginal basis à la Andrieu-Roberts (2009). (Writing this in the train back to Paris, I am not convinced this approach is in fact applicable to this proposal as the likelihood itself is not estimated in an unbiased manner…)
In the paper, the quality of the approximation is evaluated by Hoeffding’s like inequalities, which serves as the basis for a stopping rule on the number of terms eventually evaluated in the random subsample. In fine, the method uses a sequential procedure to determine if enough terms are used to take the decision and the probability to take the same decision as with the whole sample is bounded from below. The sequential nature of the algorithm requires to either recompute the vector of likelihood terms for the previous value of the parameter or to store all of them for deriving the partial ratios. While the authors adress the issue of self-evaluating whether or not this complication is worth the effort, I wonder (from my train seat) why they focus so much on recovering the same decision as with the complete likelihood ratio and the same uniform. It would suffice to get the same distribution for the decision (an alternative that is easier to propose than to create of course). I also (idly) wonder if a Gibbs version would be manageable, i.e. by changing only some terms in the likelihood ratio at each iteration, in which case the method could be exact… (I found the above quote quite relevant as, in an alternative technique we are constructing with Marco Banterle, the speedup is particularly visible in the warmup stage.) Hence another direction in this recent flow of papers attempting to speed up MCMC methods against the incoming tsunami of “Big Data” problems.