Archive for conditional density

recycling Gibbs auxiliaries

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , , , on December 6, 2016 by xi'an

wreck of the S.S. Dicky, Caloundra beach, Qld, Australia, Aug. 19, 2012Luca Martino, Victor Elvira and Gustau Camps-Valls have arXived a paper on recycling for Gibbs sampling. The argument therein is to take advantage of all simulations induced by MCMC simulation for one full conditional, towards improving estimation if not convergence. The context is thus one when Metropolis-within-Gibbs operates, with several (M) iterations of the corresponding Metropolis being run instead of only one (which is still valid from a theoretical perspective). While there are arguments in augmenting those iterations, as recalled in the paper, I am not a big fan of running a fixed number of M of iterations as this does not approximate better the simulation from the exact full conditional and even if this approximation was perfect, the goal remains simulating from the joint distribution. As such, multiplying the number of Metropolis iterations does not necessarily impact the convergence rate, only brings it closer to the standard Gibbs rate. Moreover, the improvement does varies with the chosen component, meaning that the different full conditionals have different characteristics that produce various levels of variance reduction:

  • if the targeted expectation only depends on one component of the Markov chain, multiplying the number of simulations for the other components has no clear impact, except in increasing time;
  • if the corresponding full conditional is very concentrated, repeating simulations should produce quasi-repetitions, and no gain.

The only advantage in computing time that I can see at this stage is when constructing the MCMC sampler for the full proposal is much more costly than repeating MCMC iterations, which are then almost free and contribute to the reduction of the variance of the estimator.

This analysis of MCMC-withing-Gibbs strategies reminds me of a recent X validated question, which was about the proper degree of splitting simulations from a marginal and from a corresponding conditional in the chain rule, the optimal balance being in my opinion dependent on the relative variances of the conditional expectations.

A last point is that recycling in the context of simulation and Monte Carlo methodology makes me immediately think of Rao-Blackwellisation, which is surprisingly absent from the current paperRao-Blackwellisation was introduced in the MCMC literature and to the MCMC community in the first papers of Alan Gelfand and Adrian Smith, in 1990. While this is not always producing a major gain in Monte Carlo variability, it remains a generic way of recycling auxiliary variables as shown, e.g., in the recycling paper we wrote with George Casella in 1996, one of my favourite papers.

conditional sampling

Posted in R, Statistics with tags , , , , on September 5, 2016 by xi'an

An interesting question about stratified sampling came up on X validated last week, namely how to optimise a Monte Carlo estimate based on two subsequent simulations, one, X, from a marginal and one or several Y from the corresponding conditional given X, when the costs of producing those two simulations significantly differ. When looking at the variance of the Monte Carlo estimate, this variance can be minimised in the number of simulations from the marginal under a computing budget. However, when the costs of producing x, y given or (x,y) are about the same, it does not pay to replicate simulations from y given x or x given y, because this increases the randomness of the estimator and induces positive correlation between some terms in the sum. Assuming that the conditional variances are always computable, we could envision an extension to the question where for each new value of a simulated x (or y), a variable number of conditionally simulated y (or x) could be produced. Or even when those conditional variances are unknown but can be replaced with empirical versions.

The illustration comes from a bivariate normal model with correlation, for a rather arbitrary function , but the pattern remains the same, namely that iid simulations of the pair (X,Y invariably leads to a smaller variance of the estimator compared with simulation with a 1:10 (10 Y’s for one X) or 10:1 ratio between x’s and y’s. Depending on the function and the relative variances, the 1:10 or 10:1 schemes may have a similar variability.


targ=function(x,y){ log(x^2*y^4)+x^2*cos(x^2)/y*sin(y^2)}

for (i in 1:N){

MCMC on zero measure sets

Posted in R, Statistics with tags , , , , , , , on March 24, 2014 by xi'an

zeromesSimulating a bivariate normal under the constraint (or conditional to the fact) that x²-y²=1 (a non-linear zero measure curve in the 2-dimensional Euclidean space) is not that easy: if running a random walk along that curve (by running a random walk on y and deducing x as x²=y²+1 and accepting with a Metropolis-Hastings ratio based on the bivariate normal density), the outcome differs from the target predicted by a change of variable and the proper derivation of the conditional. The above graph resulting from the R code below illustrates the discrepancy!


for (t in 2:T){
  if (ace){

If instead we add the proper Jacobian as in


the fit is there. My open question is how to make this derivation generic, i.e. without requiring the (dreaded) computation of the (dreadful) Jacobian.


%d bloggers like this: