## A slice of infinity

**P**eng Yu sent me an email about the conditions for convergence of a Gibbs sampler:

The following statement mentions convergence. But I’m not familiar what the regularity condition is.

“But it is necessary to have a finite probability of moving away from the current state at all times in order to satisfy the regularity conditions on which the whole MCMC theory depends.”

Slice sampler is discussed in your book

. I think that the “regularity condition” may have been discussed in your book. If so, would you please let me know where it is? Thanks and look forward to hearing from you!Monte Carlo Statistical Methods

**T**he quote is from Martyn Plummer and deals with a stopping rule in ** JAGS** implementation of the slice sampler. (The correct wording should be “strictly positive probability” rather than “finite probability”, I think.) However, this has nothing to do with a “regularity condition” on the irreducibility of a Markov chain: if a slice sampler is implemented for an unbounded density target, say a

*Beta(1/2,1/2)*, there is no irreducibility condition connected with the infiniteness of the density. In theory, (a) the chain never visits the “state” where the density is infinite (if only because we are dealing with a continuous state space) and (b) after visiting a value

*x*with a large density

*f(x)*, the slice sampler allows for a move away from it since the slice involves a uniform simulation over

*(0,f(x))*. Deeper properties of the slice sampler (like geometric ergodicity) are explored in, e.g., this JRSS B paper by Gareth Roberts and Jeff Rosenthal and this one in the Annals of Statistics by Radford Neal. In practice, the problem is caused by values of

*f(x)*that cannot be computed and hence produce an error message like

Singularity in likelihood found by Slicer.

If those singularities can be localised, a neighbourhood excluding them should be introduced. (More easily said than done, obviously!)

Here is an example of a slice sampler with the *Beta(1/2,1/2)* distribution:

#graphics dote=function(x,y) points(x,y,col="gold",pch=19,cex=.4) mote=function(x,y,z,w) lines(c(x,z),c(y,w),col="gold",lwd=.5) cst=dbeta(.5,.5,.5)*.5 #normalising constant #inverting f(x)=d, 2nd degree equation hitden=function(d) .5+.5*sqrt(1-4*( cst/ max(d,dbeta(.5,.5,.5)))^2)*c(-1,1) #output curve(dbeta(x,.5,.5),0,1,ylab="density",lwd=2,col="steelblue",n=1001) x=runif(1);u=runif(1)*dbeta(x,.5,.5);dote(x,u) for (t in 1:100){ #100 slice steps bo=hitden(u) nx=sample(c(runif(1,0,bo[1]),runif(1,bo[2],1)),1) nu=runif(1)*dbeta(nx,.5,.5) mote(x,u,nx,nu) x=nx;u=nu;dote(x,u) }

which clearly explores the whole area under the *Beta(1/2,1/2)* density. Even when started at a large density value like *f(.999999)*, it eventually leaves the vicinity of this highly improbable value.

July 28, 2011 at 11:22 am

Oh dear. If my comments on the JAGS help forum are going to be publicly peer reviewed then I really am in trouble. Of course I meant that there has to be a non-zero (rather than finite) probability of moving away from the current state.

Let’s be clear. This is a numerical problem caused by overflow in the log-density calculations. JAGS is perfectly capable of slice sampling from the beta(0.5, 0.5) distribution, even when starting from an extreme value. The problem seems to arise when using a beta or Dirichlet distribution in a hierarchical model. In some cases, the sum of the contributions to the log-density seems to overflow, yielding an apparent infinite density.

So what can we do in such cases? Earlier versions of JAGS just stayed put. Sometimes the chain can recover at the next iteration, because the values of the other parameters have changed. This is the case for the “classic bugs” LITTERS example. But there could be models in which the slice sampler gets permanently stuck. The user would know nothing about it unless they were monitoring that node and the output would potentially be invalid.

Truncating beta distributions away from the boundary seems to work, but is clearly a sticking plaster solution. In the above-mentioned LITTERS example, I use

p[i,j] ~ dbeta(a[i], b[i]) T(,0.9999)

More constructive suggestions are welcome because this problem is a lot more common than I would like.