A slice of infinity

Peng Yu sent me an email about the conditions for convergence of a Gibbs sampler:

The following statement mentions convergence. But I’m not familiar what the regularity condition is.

“But it is necessary to have a finite probability of moving away from the current state at all times in order to satisfy the regularity conditions on which the whole MCMC theory depends.”

Slice sampler is discussed in your book Monte Carlo Statistical Methods. I think that the “regularity condition” may have been discussed in your book. If so, would you please let me know where it is? Thanks and look forward to hearing from you!

The quote is from Martyn Plummer and deals with a stopping rule in JAGS implementation of the slice sampler. (The correct wording should be “strictly positive probability” rather than “finite probability”, I think.) However, this has nothing to do with a “regularity condition” on the irreducibility of a Markov chain: if a slice sampler is implemented for an unbounded density target, say a Beta(1/2,1/2), there is no irreducibility condition connected with the infiniteness of the density. In theory, (a) the chain never visits the “state” where the density is infinite (if only because we are dealing with a continuous state space) and (b) after visiting a value x with a large density f(x), the slice sampler allows for a move away from it since the slice involves a uniform simulation over (0,f(x)). Deeper properties of the slice sampler (like geometric ergodicity) are explored in, e.g., this JRSS B paper by Gareth Roberts and Jeff Rosenthal and this one in the Annals of Statistics by Radford Neal. In practice, the problem is caused by values of f(x) that cannot be computed and hence produce an error message like

Singularity in likelihood found by Slicer.

If those singularities can be localised, a neighbourhood excluding them should be introduced. (More easily said than done, obviously!)

Here is an example of a slice sampler with the Beta(1/2,1/2) distribution:

#graphics
dote=function(x,y) points(x,y,col="gold",pch=19,cex=.4)
mote=function(x,y,z,w) lines(c(x,z),c(y,w),col="gold",lwd=.5)
cst=dbeta(.5,.5,.5)*.5 #normalising constant
#inverting f(x)=d, 2nd degree equation
hitden=function(d) .5+.5*sqrt(1-4*( cst/ max(d,dbeta(.5,.5,.5)))^2)*c(-1,1)
#output
curve(dbeta(x,.5,.5),0,1,ylab="density",lwd=2,col="steelblue",n=1001)
x=runif(1);u=runif(1)*dbeta(x,.5,.5);dote(x,u)
for (t in 1:100){ #100 slice steps
   bo=hitden(u)
   nx=sample(c(runif(1,0,bo[1]),runif(1,bo[2],1)),1)
   nu=runif(1)*dbeta(nx,.5,.5)
   mote(x,u,nx,nu)
   x=nx;u=nu;dote(x,u)
   }

which clearly explores the whole area under the Beta(1/2,1/2) density. Even when started at a large density value like f(.999999), it eventually leaves the vicinity of this highly improbable value.

One Response to “A slice of infinity”

  1. Oh dear. If my comments on the JAGS help forum are going to be publicly peer reviewed then I really am in trouble. Of course I meant that there has to be a non-zero (rather than finite) probability of moving away from the current state.

    Let’s be clear. This is a numerical problem caused by overflow in the log-density calculations. JAGS is perfectly capable of slice sampling from the beta(0.5, 0.5) distribution, even when starting from an extreme value. The problem seems to arise when using a beta or Dirichlet distribution in a hierarchical model. In some cases, the sum of the contributions to the log-density seems to overflow, yielding an apparent infinite density.

    So what can we do in such cases? Earlier versions of JAGS just stayed put. Sometimes the chain can recover at the next iteration, because the values of the other parameters have changed. This is the case for the “classic bugs” LITTERS example. But there could be models in which the slice sampler gets permanently stuck. The user would know nothing about it unless they were monitoring that node and the output would potentially be invalid.

    Truncating beta distributions away from the boundary seems to work, but is clearly a sticking plaster solution. In the above-mentioned LITTERS example, I use

    p[i,j] ~ dbeta(a[i], b[i]) T(,0.9999)

    More constructive suggestions are welcome because this problem is a lot more common than I would like.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 598 other followers