Archive for University of Toronto

likelihood inflating sampling algorithm

Posted in Books, Statistics, University life with tags , , , , , , , , on May 24, 2016 by xi'an

My friends from Toronto Radu Craiu and Jeff Rosenthal have arXived a paper along with Reihaneh Entezari on MCMC scaling for large datasets, in the spirit of Scott et al.’s (2013) consensus Monte Carlo. They devised an likelihood inflated algorithm that brings a novel perspective to the problem of large datasets. This question relates to earlier approaches like consensus Monte Carlo, but also kernel and Weierstrass subsampling, already discussed on this blog, as well as current research I am conducting with my PhD student Changye Wu. The approach by Entezari et al. is somewhat similar to consensus Monte Carlo and the other solutions in that they consider an inflated (i.e., one taken to the right power) likelihood based on a subsample, with the full sample being recovered by importance sampling. Somewhat unsurprisingly this approach leads to a less dispersed estimator than consensus Monte Carlo (Theorem 1). And the paper only draws a comparison with that sub-sampling method, rather than covering other approaches to the problem, maybe because this is the most natural connection, one approach being the k-th power of the other approach.

“…we will show that [importance sampling] is unnecessary in many instances…” (p.6)

An obvious question that stems from the approach is the call for importance sampling, since the numerator of the importance sampler involves the full likelihood which is unavailable in most instances when sub-sampled MCMC is required. I may have missed the part of the paper where the above statement is discussed, but the only realistic example discussed therein is the Bayesian regression tree (BART) of Chipman et al. (1998). Which indeed constitutes a challenging if one-dimensional example, but also one that requires delicate tuning that leads to cancelling importance weights but which may prove delicate to extrapolate to other models.

Sampling latent states for high-dimensional non-linear state space models with the embedded HMM method

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , on March 17, 2016 by xi'an

IMG_19390Previously, I posted a comment on a paper by Alex Shestopaloff and Radford Neal, after my visit to Toronto two years ago, using a particular version of ensemble Monte Carlo. A new paper by the same authors was recently arXived, as an refinement of the embedded HMM paper of Neal (2003), in that the authors propose a new and more efficient way to generate from the (artificial) embedded hidden Markov sampler that is central to their technique of propagating a set of pool states. The method exploits both forward and backward representations of HMMs in an alternating manner. And propagates the pool states from one observation time to the next. The paper also exploits latent Gaussian structures to make autoregressive proposals, as well as flip proposals from x to -x [which seem to only make sense when 0 is a central value for the target, i.e. when the observables y only depend on |x|]. All those modifications bring the proposal quite close to (backward) particle Gibbs, the difference being in using Metropolis rather than importance steps. And in an improvement brought by the embedded HMM approach, even though it is always delicate to generalise those comparisons when some amount of calibration is required by both algorithms under comparison. (Especially delicate when it is rather remote from my area of expertise!) Anyway, I am still intrigued [in a positive way] by the embedded HMM idea as it remains mysterious that a finite length HMM simulation can improve the convergence performances that much. And wonder at a potential connection with an earlier paper of Anthony Lee and Krys Latuszynski using a random number of auxiliary variables. Presumably a wrong impression from a superficial memory…

Measuring statistical evidence using relative belief [book review]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , , , , , , on July 22, 2015 by xi'an

“It is necessary to be vigilant to ensure that attempts to be mathematically general do not lead us to introduce absurdities into discussions of inference.” (p.8)

This new book by Michael Evans (Toronto) summarises his views on statistical evidence (expanded in a large number of papers), which are a quite unique mix of Bayesian  principles and less-Bayesian methodologies. I am quite glad I could receive a version of the book before it was published by CRC Press, thanks to Rob Carver (and Keith O’Rourke for warning me about it). [Warning: this is a rather long review and post, so readers may chose to opt out now!]

“The Bayes factor does not behave appropriately as a measure of belief, but it does behave appropriately as a measure of evidence.” (p.87)

Continue reading

MCMC for non-linear state space models using ensembles of latent sequences

Posted in Statistics with tags , , , , on November 6, 2013 by xi'an

IMG_19390While visiting U of T, I had the opportunity to discuss the above paper MCMC for non-linear state space models using ensembles of latent sequences with both authors, Alex Shestopalo ff and Radford Neal, paper that I had completely missed during my hospital break. The paper borrows from the central idea of Neal (2003), which “is to temporarily reduce the state space of the model, which may be countably in finite or continuous, to a finite collection of randomly generated `pool’ states at each time point.” Several copies of the hidden Markov chain are generated from a proposal and weighted according to the posterior. This makes for a finite state space on which a forward-backward algorithm can be run, thus producing a latent sequence that is more likely than the ones originally produced from the proposal, as it borrows at different times from different chains.  (I alas had no lasting memory of this early paper of Radford’s. But this is essentially ensemble Monte Carlo.)

What Radford patiently explained to me in Toronto was why this method did not have the same drawback as an importance sampling method, as the weights were local rather than global and hence did not degenerate as the length of the chain/HMM increased. Which I find a pretty good argument! The trick of being able to rely on forward-backward simulation is also very appealing. This of course does not mean that the method is always converging quickly, as the proposal matters. A novelty of the current paper is the inclusion of parameter simulation steps as well, steps that are part of the ensemble Monte Carlo process (rather than a standard Gibbs implementation). There also is a delayed acceptance (as opposed to delayed rejection) step where a subset of the chain is used to check for early (and cheaper) rejection.

The paper is a bit short on describing the way pool states can be generated (see Section 7), but it seems local (time-wise) perturbations of the current state are considered. I wonder if an intermediate particle step could produce a more efficient proposal… It also seems possible to consider a different number of pool states at more uncertain or more sticky times, with a potential for adaptivity depending on the acceptance rate.

For the evaluation of the method, the authors consider the Ricker population dynamics model found in Wood (2003) and Fearnhead and Prangle (2012), where semi-automated ABC is used. In the experiment described therein, there are only 100 latent states, which is enough to hinder MCMC. The ensemble method does much better. While there is no comparison with ABC, I would presume this method, relying on a more precise knowledge of the probabilistic model, should do better. Maybe a good topic for a Master project?

Philosophenweg (im Toronto)

Posted in pictures, Running, Travel with tags , , , , , , , on November 2, 2013 by xi'an

IMG_1996Apart from walking the philosopher’s way (or path) in Toronto, after Heidelberg and Kyoto, I had a very enjoyable time (if not weather) for my very first time visit to the University of Toronto, from discussions with several friends of old, to a well-attended seminar, to a grilled octopus lunch in an faculty club and a (French) dinner au Paradis, to seeing a variety of costumes parading the streets for Halloween, with different varieties at six p.m. and two a.m., and to a discovery run in the early and windy Friday morning in the streets of Toronto.

seminars at CMU and University of Toronto

Posted in Statistics, Travel, University life with tags , , , , , , , , , , , , on October 29, 2013 by xi'an

IMG_1864Here are the slides for my seminar talks at Carnegie Mellon University (Pittsburgh) and the University of Toronto, tomorrow and the day after, respectively:


Viva in Toronto (not really!)

Posted in Statistics, Travel, University life with tags , , , on July 22, 2011 by xi'an

This was the second viva of the week, for the thesis of Madeleine Thompson, but as it was in Toronto, I took part in it by a phone connection. This was rather ineffective as the connection was rather poor and I could not follow most of the questions… I had previously read (and commented) two papers,  Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method, and Graphical Comparison of MCMC Performance,  co-written by Madeleine so I was well-aware of a part of the contents of the thesis, which I read in toto a few weeks ago. It was an interesting thesis with diversified threads in the various chapter, but I found frustrating to be unable to fully take part in the thesis debate… In retrospect, I should have flown to Toronto from Manchester yesterday or abstained from taking part in the viva!