## one bridge further

Posted in Books, R, Statistics, University life on June 30, 2020 by xi'an

Jackie Wong, Jon Forster (Warwick) and Peter Smith have just published a paper in Statistics & Computing on bridge sampling bias and improvement by splitting.

“… known to be asymptotically unbiased, bridge sampling technique produces biased estimates in practical usage for small to moderate sample sizes (…) the estimator yields positive bias that worsens with increasing distance between the two distributions. The second type of bias arises when the approximation density is determined from the posterior samples using the method of moments, resulting in a systematic underestimation of the normalizing constant.”

Recall that bridge sampling is based on a double trick with two samples x and y from two (unnormalised) densities f and g that are interchanged in a ratio

$m \sum_{i=1}^n g(x_i)\omega(x_i) \Big/ n \sum_{i=1}^m f(y_i)\omega(y_i)$

of unbiased estimators of the inverse normalising constants (up to a common multiplicative term). Hence biased, and the more so the less similar these two densities are. Special cases for ω include importance sampling [unbiased] and reciprocal importance sampling. Since the optimal version of the bridge weight ω is the inverse of the mixture of f and g, it makes me wonder about the performance of using both samples top and bottom, since, as an aggregated sample, they also come from the mixture, as in the Owen & Zhou (2000) multiple importance sampler. However, a quick try with a positive Normal versus an Exponential with rate 2 does not show an improvement in using both samples top and bottom (even when using the perfectly normalised versions)

morc=(sum(f(y)/(nx*dnorm(y)+ny*dexp(y,2)))+
sum(f(x)/(nx*dnorm(x)+ny*dexp(x,2))))/(
sum(g(x)/(nx*dnorm(x)+ny*dexp(x,2)))+
sum(g(y)/(nx*dnorm(y)+ny*dexp(y,2))))


at least in terms of bias… Surprisingly (!) the bias almost vanishes for very different sample sizes, either in favour of f or in favour of g. This may be a form of genuine defensive sampling, who knows?! At the very least, this ensures a finite variance for all weights. (The splitting approach introduced in the paper is a natural solution to create independence between the first sample and the second density. This reminded me of our two parallel chains in AMIS.)
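For readers who want to play with this, here is a minimal R sketch of the plain bridge estimator displayed above, on the same positive Normal versus Exponential(2) pair. Everything named here is an assumption of mine rather than the code used for the experiment: the function names `bridge`, `geom` and `mixt`, the sample sizes, the number of replications, and the choice of writing the normalised half-Normal density as `2 * dnorm(z)`. With x simulated from f and y from g, the ratio estimates the ratio of normalising constants c_g/c_f, here (1/2)/√(π/2).

```r
# Unnormalised densities (illustrative choice): positive Normal and Exp(2)
f <- function(x) exp(-x^2 / 2) * (x > 0)   # normalising constant sqrt(pi/2)
g <- function(x) exp(-2 * x) * (x > 0)     # normalising constant 1/2

# Bridge estimator of c_g / c_f, with x ~ f, y ~ g and a bridge weight omega
bridge <- function(x, y, omega) {
  (length(y) * sum(g(x) * omega(x))) / (length(x) * sum(f(y) * omega(y)))
}

# Two bridge weights: geometric bridge, and inverse of the normalised mixture
geom <- function(z) 1 / sqrt(f(z) * g(z))
mixt <- function(nx, ny) function(z) 1 / (nx * 2 * dnorm(z) + ny * dexp(z, 2))

set.seed(101)
nx <- 50; ny <- 50; nrep <- 1e3
truth <- (1 / 2) / sqrt(pi / 2)

# Replicate the experiment to get a Monte Carlo evaluation of the bias
est <- replicate(nrep, {
  x <- abs(rnorm(nx))   # sample from the positive Normal
  y <- rexp(ny, 2)      # sample from the Exponential with rate 2
  c(geometric = bridge(x, y, geom), mixture = bridge(x, y, mixt(nx, ny)))
})
bias <- rowMeans(est) - truth
```

Printing `bias` (or a boxplot of `t(est)`) for several values of `nx` and `ny` is one way to look at how the bias moves with the sample sizes. The mixture weight here cheats by using the true normalising constants, which in practice would require the usual iterative scheme.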

## snapshots of Oxford Statistics

Posted in Kids, pictures, Statistics, Travel, University life, Wines on February 29, 2016 by xi'an

Following the opening of the new Department of Statistics building in Oxford [which somewhat ironically is the former Department of Mathematics!], a professional photographer was commissioned for a photo coverage of this move. A move which is incidentally fantastic for the cohesion and work quality of the department, when compared with the former configuration in two disconnected buildings on South Parks Road. Not to mention the vis-à-vis with the Eagle and Child.

As the photographer happened to be there the very day I was teaching my Bayesian module for the OxWaSP PhD students, I ended up in some of the photographs (with no clear memory of this photographer, who was most unintrusive). With the Racoon River Brewing Co. tee-shirt I brought back from Des Moines. And which I was wearing in a very indirect allusion to the US primaries the night before!

## importance sampling with multiple MCMC sequences

Posted in Mountains, pictures, Statistics, Travel, University life on October 2, 2015 by xi'an

Vivek Roy, Aixian Tan and James Flegal arXived a new paper, Estimating standard errors for importance sampling estimators with multiple Markov chains, where they obtain a central limit theorem and hence standard error estimates when using several MCMC chains to simulate from a mixture distribution as an importance sampling function. Just before I boarded my plane from Amsterdam to Calgary, which gave me the opportunity to read it completely (along with half a dozen other papers, since it is a long flight!). I first thought it was connected to our AMIS algorithm (on whose convergence Vivek spent a few frustrating weeks when he visited me at the end of his PhD), because of the mixture structure. This is actually altogether different, in that the mixture is made of unnormalised densities that are complex enough to act as an importance sampler and that, due to this complexity, can only be simulated via separate MCMC algorithms. Behind this characterisation lurks the challenging problem of estimating multiple normalising constants. The paper adopts the resolution by reverse logistic regression advocated in Charlie Geyer’s famous 1994 unpublished technical report. Besides the technical difficulties in establishing a CLT in this convoluted setup, the notion of mixing importance sampling and different Markov chains is quite appealing, especially in the domain of “tall” data and of splitting the likelihood into several or even many bits, since the mixture contains most of the information provided by the true posterior and can be corrected by an importance sampling step. In this very setting, I also think more adaptive schemes could be found to determine (estimate?!) the optimal weights of the mixture components.
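To fix ideas, here is the generic shape of such an estimator, in a notation of my own that need not match the paper's: k unnormalised components $f_1,\ldots,f_k$ with unknown constants $c_1,\ldots,c_k$, chain j producing $x_{j1},\ldots,x_{jn_j}$ with stationary density $f_j/c_j$, a target π known up to a constant, and $n=n_1+\cdots+n_k$. The self-normalised importance sampling estimate of the expectation of h under π would then read

$\sum_{j=1}^k \sum_{i=1}^{n_j} h(x_{ji})\,\dfrac{\pi(x_{ji})}{\hat q(x_{ji})} \Big/ \sum_{j=1}^k \sum_{i=1}^{n_j} \dfrac{\pi(x_{ji})}{\hat q(x_{ji})} \qquad\text{with}\qquad \hat q(x) = \sum_{l=1}^k \dfrac{n_l}{n}\,\dfrac{f_l(x)}{\hat c_l}$

where the $\hat c_l$ are estimates of the normalising constants, only needed up to a common factor since this factor cancels in the ratio. This is where a method such as reverse logistic regression comes in.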

## Evolution, Oregon wine

Posted in Travel, Wines on January 20, 2013 by xi'an

## Rachel’s #1 sunrise in Des Moines

Posted in Kids, pictures, Running, Travel on November 4, 2012 by xi'an