Vanilla Rao-Blackwellisation updated

The vanilla Rao-Blackwellisation paper with Randal Douc has been updated, arXived, and resubmitted to the Annals, to include a more detailed study of the variance improvement brought by the Rao-Blackwellisation. In particular, we considered a probit modelling of the Pima Indian diabetes study, which is a standard benchmark that I also used in other papers and in the book Introducing Monte Carlo Methods with R with George Casella,. We also included the reference to the paper by Malefaki and Iliopoulos about Lemma 1, discussed in a previous post. The assessment of the variance improvement is examined in terms of the empirical variances of the individual terms

h(z_j) \sum_{t=0}^\infty \prod_{m=1}^{t-1}\{1- \alpha(z_j,y_{jm})\}=h(z_j) \xi_j


h(z_j) \sum_{t=0}^\infty \prod_{m=1}^{t-1}\mathbb{I}\{u_{jm}\ge \alpha(z_j,y_{jm})\}=h(z_j) \mathfrak{n}_j

involved in the two versions of the estimators, as explained in the original post (with a corrected typo due to the missing product in both terms!) and the variance ratio may be as small as two for some examples, including the Pima Indian diabetes. The improvement in the variances of \xi_j against \mathfrak{n}_j is most clearly seen in the boxplot below, in a exponential toy example where the acceptance rate is 1/13:Boxplot of the variances of coefficients

A phenomenon about boxplots that I cannot truly explain is that while the variance of the Rao-Blackwellised estimate of \mathbb{E}[X] is 80% of the variance of the original estimator, the improvement is not that visible on the boxplot below of three samples of estimates (the middle one is our Rao-Blackwellised candidate, whose name vanished for being too large):Boxplot of three estimates of E[X]

Now a 20% decrease in the variance means a mere 10% decrease in the standard deviation so this could agree with the difference in the upper bars of the boxplot… Since this is an example where the acceptance probability can be computed as well, I also added the optimal importance sampling estimator on the above boxplot and this confirms the impression that, while the variance decrease may be significant, this is not so clearly visible on the above boxplot.

We also took advantage of the fact that the quantity

\alpha(z_j,y_{j0}) \sum_{t=0}^\infty \prod_{m=1}^{t-1}\{1- \alpha(z_j,y_{jm})\}

where y_{j0} is an additional simulation from the proposal q(z_j,y) is a universal control variate for Metropolis-Hastings algorithms since its expectation is one. This means that in practice we can compute the regression coefficient \hat\beta of h(z_j)\xi_j on \alpha(z_j,y_{j0})\xi_j and replace the average of the h(z_j)\xi_j‘s with the controled version

\frac{1}{N} \sum_j h(z_j)\xi_j - \hat\beta(\alpha(z_j,y_{j0})\xi_j-1)

with, again, improvement in the empirical variances that can reach a factor of 2.

One Response to “Vanilla Rao-Blackwellisation updated”

  1. […] the weights, the overall effect may be difficult to spot in practical applications. In the current version of the paper, we manage 20% reduction in the variance of those weights, but obviously this does not […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.