## making a random walk geometrically ergodic

Posted in R, Statistics with tags , , , , , , , , , on March 2, 2013 by xi'an

While a random walk Metropolis-Hastings algorithm cannot be uniformly ergodic in a general setting (Mengersen and Tweedie, AoS, 1996), because it needs more energy to leave far away starting points, it can be geometrically ergodic depending on the target (and the proposal). In a recent Annals of Statistics paper, Leif Johnson and Charlie Geyer designed a trick to turn a random walk Metropolis-Hastings algorithm into a geometrically ergodic random walk Metropolis-Hastings algorithm by virtue of an isotropic transform (under the provision that the original target density has a moment generating function). This theoretical result is complemented by an R package called mcmc. (I have not tested it so far, having read the paper in the métro.) The examples included in the paper are however fairly academic and I wonder how the method performs in practice, on truly complex models, in particular because the change of variables relies on (a) an origin and (b) changing the curvature of space uniformly in all dimensions. Nonetheless, the idea is attractive and reminds me of a project of ours with Randal Douc,  started thanks to the ‘Og and still under completion.

## Random dive MH

Posted in R, Statistics with tags , , , , on September 2, 2010 by xi'an

A new Metropolis-Hastings algorithm that I would call “universal” was posted by Somak Dutta yesterday on arXiv. Multiplicative random walk Metropolis-Hastings on the real line contains a different Metropolis-Hastings algorithm called the random dive. The proposed new value x’ given the current value x is defined by

$\begin{cases} x\epsilon &\text{with probability } 1/2\\ x/\epsilon &\text{with probability } 1/2\end{cases}$

when $\epsilon$ is a random variable on $(-1,1)$. Thus, at each iteration, the current value is either shrunk or expanded by a random amount. When I read the paper in the Paris metro, while appreciating that this algorithm could be geometrically ergodic as opposed to the classical random walk algorithm, I was not convinced by the practical impact of the method, as I thought that the special role of zero in the proposal was not always reflected in the target. Especially when considering that the proposal is parameter-free. However, after running the following R program on a target centred away from zero, I found the proposal quite efficient.

f=function(x){.5*dnorm(x,mean=14)+.5*dnorm(x,mean=35)}
Nsim=10^5
x=rep(5,Nsim)
for (t in 2:Nsim){
coef=runif(1,min=-1)^sample(c(-1,1),1)
prop=x[t-1]*coef
prob=abs(coef)*f(prop)/f(x[t-1])
x[t]=x[t-1]
if (runif(1)<prob) x[t]=prop
}
hist(x,pro=T,nclass=113,col=”wheat2″)

Obviously, it is difficult to believe that this extension will keep working similarly well when the dimension increases but this is an interesting way of creating a heavy tail proposal.

## A repulsive random walk

Posted in R, Statistics with tags , , , , on May 28, 2010 by xi'an

Matt Asher posted an R experiment on R-bloggers yesterday simulating the random walk

$x_{t+1} = x_t + \varepsilon_t / x_t$

which has the property of avoiding zero by quickly switching to a large value as soon as $x_t$ is small. He was then wondering about the “convergence” of the random walk given that it moves very little once $x_t$ is large enough. The values he found for various horizons t seemed to indicate a stable regime.

I reran the same experiment as Matt in a Monte Carlo perspective, using the R program

resu=matrix(0,ncol=100,nrow=25)
sampl=rnorm(100)
for (i in 1:25){
for (t in 2^(i-1):2^i) sampl=sampl+rnorm(100)/sampl
resu[i,]=sampl
}
boxplot(as.data.frame(t(abs(resu))),name=as.character(1:25),col="wheat3")

The outcome of this R code plotted above shows that the range and the average of the 100 replications is increasing with t. This behaviour indicates a transient behaviour of the Markov chain, which almost surely goes to infinity and never comes back (because at infinity the variance is zero). Another indication for transience is shown by the fact that $x_t$ comes back to the interval (-1,1) with probability $\Phi(-|x_t|)$, a probability which goes to zero with $x_t$. As suggested to me by Randal Douc, this transience can be established rigorously by considering

$x_{t+1}^2 = x_t^2 + 2\epsilon_t + \epsilon_t^2/x_t^2 > x_t^2 + 2\epsilon_t>2\sum_{i=1}^t \epsilon_t$

which is thus bounded from below by a null recurrent process, which almost surely goes to infinity. Therefore the above Markov chain cannot have a stationary distribution or even a stationary measure: it almost surely goes to (plus or minus) infinity.

## Random mowing

Posted in Kids, Travel with tags , , , on April 27, 2010 by xi'an

My brother-in-law Christophe has a mowing robot (called Mara!) that is completely autonomous. It starts from its plug and goes along straight lines until it hits an obstacle or the boundaries defined by an underground cable. Then it turns at random and starts again. The concept is quite interesting but I am surprised that it operates on a purely local random walk principle, without any learning about the topology of the terrain. This means that zones with more obstacles are visited less often. (To compensate for that, the mower starts moving in circles each time it hits denser regions of grass.) And that the mower spends a lot of time looking for its plug. An algorithmic determination of the terrain would be possible, obviously, using for instance a Metropolis-Hastings scheme.

## t-walk on the banana side

Posted in R, Statistics with tags , , , , on March 15, 2010 by xi'an

Following my remarks on the t-walk algorithm in the recent A General Purpose Sampling Algorithm for Continuous Distributions, published by Christen and Fox in Bayesian Analysis that acts like a general purpose MCMC algorithm, Darren Wraith tested it on the generic (10 dimension) banana target we used in the cosmology paper. Here is an output from his comparison R program:

The use of the t-walk algorithm (left, with the same number of particles) in this very special case thus produces a wider variability  on the estimated means than the use of adaptive MCMC (center) and our (tuned) PMC algorithm (right).