Archive for University of Toronto

distributions for parameters [seminar]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on January 22, 2018 by xi'an
Next Thursday, January 25, Nancy Reid will give a seminar in Paris-Dauphine on distributions for parameters that covers different statistical paradigms and bring a new light on the foundations of statistics. (Coffee is at 10am in the Maths department common room and the talk is at 10:15 in room A, second floor.)

Nancy Reid is University Professor of Statistical Sciences and the Canada Research Chair in Statistical Theory and Applications at the University of Toronto and internationally acclaimed statistician, as well as a 2014 Fellow of the Royal Society of Canada. In 2015, she received the Order of Canada, was elected a foreign associate of the National Academy of Sciences in 2016 and has been awarded many other prestigious statistical and science honours, including the Committee of Presidents of Statistical Societies (COPSS) Award in 1992.

Nancy Reid’s research focuses on finding more accurate and efficient methods to deduce and conclude facts from complex data sets to ultimately help scientists find specific solutions to specific problems.

There is currently some renewed interest in developing distributions for parameters, often without relying on prior probability measures. Several approaches have been proposed and discussed in the literature and in a series of “Bayes, fiducial, and frequentist” workshops and meeting sessions. Confidence distributions, generalized fiducial inference, inferential models, belief functions, are some of the terms associated with these approaches.  I will survey some of this work, with particular emphasis on common elements and calibration properties. I will try to situate the discussion in the context of the current explosion of interest in big data and data science. 

likelihood inflating sampling algorithm

Posted in Books, Statistics, University life with tags , , , , , , , , on May 24, 2016 by xi'an

My friends from Toronto Radu Craiu and Jeff Rosenthal have arXived a paper along with Reihaneh Entezari on MCMC scaling for large datasets, in the spirit of Scott et al.’s (2013) consensus Monte Carlo. They devised an likelihood inflated algorithm that brings a novel perspective to the problem of large datasets. This question relates to earlier approaches like consensus Monte Carlo, but also kernel and Weierstrass subsampling, already discussed on this blog, as well as current research I am conducting with my PhD student Changye Wu. The approach by Entezari et al. is somewhat similar to consensus Monte Carlo and the other solutions in that they consider an inflated (i.e., one taken to the right power) likelihood based on a subsample, with the full sample being recovered by importance sampling. Somewhat unsurprisingly this approach leads to a less dispersed estimator than consensus Monte Carlo (Theorem 1). And the paper only draws a comparison with that sub-sampling method, rather than covering other approaches to the problem, maybe because this is the most natural connection, one approach being the k-th power of the other approach.

“…we will show that [importance sampling] is unnecessary in many instances…” (p.6)

An obvious question that stems from the approach is the call for importance sampling, since the numerator of the importance sampler involves the full likelihood which is unavailable in most instances when sub-sampled MCMC is required. I may have missed the part of the paper where the above statement is discussed, but the only realistic example discussed therein is the Bayesian regression tree (BART) of Chipman et al. (1998). Which indeed constitutes a challenging if one-dimensional example, but also one that requires delicate tuning that leads to cancelling importance weights but which may prove delicate to extrapolate to other models.

Sampling latent states for high-dimensional non-linear state space models with the embedded HMM method

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , on March 17, 2016 by xi'an

IMG_19390Previously, I posted a comment on a paper by Alex Shestopaloff and Radford Neal, after my visit to Toronto two years ago, using a particular version of ensemble Monte Carlo. A new paper by the same authors was recently arXived, as an refinement of the embedded HMM paper of Neal (2003), in that the authors propose a new and more efficient way to generate from the (artificial) embedded hidden Markov sampler that is central to their technique of propagating a set of pool states. The method exploits both forward and backward representations of HMMs in an alternating manner. And propagates the pool states from one observation time to the next. The paper also exploits latent Gaussian structures to make autoregressive proposals, as well as flip proposals from x to -x [which seem to only make sense when 0 is a central value for the target, i.e. when the observables y only depend on |x|]. All those modifications bring the proposal quite close to (backward) particle Gibbs, the difference being in using Metropolis rather than importance steps. And in an improvement brought by the embedded HMM approach, even though it is always delicate to generalise those comparisons when some amount of calibration is required by both algorithms under comparison. (Especially delicate when it is rather remote from my area of expertise!) Anyway, I am still intrigued [in a positive way] by the embedded HMM idea as it remains mysterious that a finite length HMM simulation can improve the convergence performances that much. And wonder at a potential connection with an earlier paper of Anthony Lee and Krys Latuszynski using a random number of auxiliary variables. Presumably a wrong impression from a superficial memory…

Measuring statistical evidence using relative belief [book review]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , , , , , , on July 22, 2015 by xi'an

“It is necessary to be vigilant to ensure that attempts to be mathematically general do not lead us to introduce absurdities into discussions of inference.” (p.8)

This new book by Michael Evans (Toronto) summarises his views on statistical evidence (expanded in a large number of papers), which are a quite unique mix of Bayesian  principles and less-Bayesian methodologies. I am quite glad I could receive a version of the book before it was published by CRC Press, thanks to Rob Carver (and Keith O’Rourke for warning me about it). [Warning: this is a rather long review and post, so readers may chose to opt out now!]

“The Bayes factor does not behave appropriately as a measure of belief, but it does behave appropriately as a measure of evidence.” (p.87)

Continue reading

MCMC for non-linear state space models using ensembles of latent sequences

Posted in Statistics with tags , , , , on November 6, 2013 by xi'an

IMG_19390While visiting U of T, I had the opportunity to discuss the above paper MCMC for non-linear state space models using ensembles of latent sequences with both authors, Alex Shestopalo ff and Radford Neal, paper that I had completely missed during my hospital break. The paper borrows from the central idea of Neal (2003), which “is to temporarily reduce the state space of the model, which may be countably in finite or continuous, to a finite collection of randomly generated `pool’ states at each time point.” Several copies of the hidden Markov chain are generated from a proposal and weighted according to the posterior. This makes for a finite state space on which a forward-backward algorithm can be run, thus producing a latent sequence that is more likely than the ones originally produced from the proposal, as it borrows at different times from different chains.  (I alas had no lasting memory of this early paper of Radford’s. But this is essentially ensemble Monte Carlo.)

What Radford patiently explained to me in Toronto was why this method did not have the same drawback as an importance sampling method, as the weights were local rather than global and hence did not degenerate as the length of the chain/HMM increased. Which I find a pretty good argument! The trick of being able to rely on forward-backward simulation is also very appealing. This of course does not mean that the method is always converging quickly, as the proposal matters. A novelty of the current paper is the inclusion of parameter simulation steps as well, steps that are part of the ensemble Monte Carlo process (rather than a standard Gibbs implementation). There also is a delayed acceptance (as opposed to delayed rejection) step where a subset of the chain is used to check for early (and cheaper) rejection.

The paper is a bit short on describing the way pool states can be generated (see Section 7), but it seems local (time-wise) perturbations of the current state are considered. I wonder if an intermediate particle step could produce a more efficient proposal… It also seems possible to consider a different number of pool states at more uncertain or more sticky times, with a potential for adaptivity depending on the acceptance rate.

For the evaluation of the method, the authors consider the Ricker population dynamics model found in Wood (2003) and Fearnhead and Prangle (2012), where semi-automated ABC is used. In the experiment described therein, there are only 100 latent states, which is enough to hinder MCMC. The ensemble method does much better. While there is no comparison with ABC, I would presume this method, relying on a more precise knowledge of the probabilistic model, should do better. Maybe a good topic for a Master project?

Philosophenweg (im Toronto)

Posted in pictures, Running, Travel with tags , , , , , , , on November 2, 2013 by xi'an

IMG_1996Apart from walking the philosopher’s way (or path) in Toronto, after Heidelberg and Kyoto, I had a very enjoyable time (if not weather) for my very first time visit to the University of Toronto, from discussions with several friends of old, to a well-attended seminar, to a grilled octopus lunch in an faculty club and a (French) dinner au Paradis, to seeing a variety of costumes parading the streets for Halloween, with different varieties at six p.m. and two a.m., and to a discovery run in the early and windy Friday morning in the streets of Toronto.

seminars at CMU and University of Toronto

Posted in Statistics, Travel, University life with tags , , , , , , , , , , , , on October 29, 2013 by xi'an

IMG_1864Here are the slides for my seminar talks at Carnegie Mellon University (Pittsburgh) and the University of Toronto, tomorrow and the day after, respectively: