Diffusive nested sampling

A paper on a new version of nested sampling, written by three physicists, was posted on arXiv last week. Its purpose is the same as MultiNest's, namely to overcome the exploration difficulties caused by multimodality. The core idea is to let the likelihood climb that characterises nested sampling backtrack, by allowing for decreases in the likelihood at each iteration. The method acts as a natural tempering, defining levels of likelihood exceedance, each covering a theoretical fraction e^{-1} of the previous level's coverage, so that the nested sampling sequence may leave isolated but attractive modes. It can also be seen as a special case of stratified sampling, where the strata are defined in terms of the likelihood levels.

However, since the move from one level j to another level k seems to be defined in terms of the respective numbers of visits to both levels, n(j) and n(k), from the start of the algorithm, I wonder how this non-Markovian feature impacts the validity of the resulting Metropolis-Hastings move. (Wang and Landau, 2001, is mentioned, but this does not seem enough of an argument.) Furthermore, since the strata are estimated during the simulation, the additional estimation of the probability of each stratum may jeopardise the variance reduction offered by a traditional stratification (see MCSM, Exercise 4.22). Overall, this is an interesting paper, in that it straightens out the connection of nested sampling with importance sampling (see my post-MaxEnt09 post), but the fact that it relies entirely on an MCMC scheme makes me wonder (as for classical nested sampling) why a comparison with other importance or MCMC approximations of the evidence \mathfrak{Z} is not conducted.
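
To make the level construction concrete, here is a minimal sketch of the mechanism on a made-up one-dimensional problem; the target, the proposal scale, and the level schedule are all illustrative assumptions, and the visit-count weighting of the level moves discussed above is deliberately left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not from the paper): uniform prior on (0, 1)
# and a sharply peaked Gaussian log-likelihood.
def log_likelihood(theta):
    return -0.5 * ((theta - 0.5) / 0.01) ** 2

levels = [-np.inf]       # level 0 covers the whole prior
MAX_LEVELS = 10
BATCH = 1000             # likelihood values collected before adding a level

theta = rng.uniform(0.0, 1.0)
j = 0                    # index of the particle's current level
stored = []

for step in range(20000):
    # Metropolis move in theta, restricted to exceed the current threshold
    # (the uniform prior makes the acceptance ratio trivial).
    prop = theta + 0.02 * rng.normal()
    if 0.0 < prop < 1.0 and log_likelihood(prop) > levels[j]:
        theta = prop
    stored.append(log_likelihood(theta))

    # Move between adjacent levels: this is where the paper also weights by
    # the visit counts n(j) and n(k), omitted here for simplicity.
    k = j + rng.choice([-1, 1])
    if 0 <= k < len(levels) and log_likelihood(theta) > levels[k]:
        j = k

    # New threshold at the 1 - e^{-1} quantile of the likelihoods seen above
    # the top level, so each level covers ~e^{-1} of its predecessor's mass.
    if len(stored) >= BATCH and len(levels) < MAX_LEVELS:
        top = np.array(stored)
        top = top[top > levels[-1]]
        if top.size > 0:
            levels.append(float(np.quantile(top, 1.0 - np.exp(-1))))
            stored = []

print("likelihood thresholds:", levels)
```

Once the levels are in place, the particle explores the corresponding mixture of constrained priors, which is precisely what lets the sampler backtrack out of an isolated mode.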

2 Responses to “Diffusive nested sampling”

  1. Oops, accidentally pressed submit too early!

    I guess you could use our mixture as an importance sampler if you wanted; the generic estimator this corresponds to is recalled after these comments. The Diffusive Nested Sampling procedure constructs the sampler for you though, which is the hard part.

    “but the fact that it completely relies on an MCMC scheme makes me wonder (as for classical nested sampling) why the comparison with other importance or MCMC approximations of the evidence \mathfrak{Z} is not conducted.”

    The reason for relying on an MCMC scheme is that it’s easy and very general. There are methods that will work more efficiently on specific problems, I’m 100% sure of that – but what I need in my areas of study are methods that can solve pretty tough problems without needing much tailoring. As long as you can come up with Metropolis-Hastings proposals, you can use Diffusive Nested Sampling. I like that. :-)

  2. Hi Christian,

    Thanks for blogging about our work! I have a few comments. Firstly, the paper has been through a few revisions since then (soon there'll be a v3 on arXiv, as it might be accepted for publication).

    “I wonder how this non-Markovian feature impacts the validity of the resulting Metropolis-Hastings move.”

    Our non-Markovian feature decreases in strength towards zero, so the eventual convergence is fine. It satisfies the “diminishing adaptation condition”; see Rosenthal, J. S., Optimal Proposal Distributions and Adaptive MCMC, 2010. This is intuitively quite obvious: imagine running Metropolis on some target distribution and jiggling the target distribution randomly at each step. If the jiggling eventually goes to ~zero, then everything is fine (a toy illustration of this is sketched after these comments).

    “it straightens out the connection of nested sampling with importance sampling”

    I guess you could use our mixture as an importance sampler if you wanted.
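
On the remark in the first comment about using the mixture as an importance sampler: writing q for the (normalised) mixture of constrained priors, the generic importance sampling estimate of the evidence it would feed into is

\hat{\mathfrak{Z}} = \frac{1}{N}\sum_{i=1}^{N} \frac{L(\theta_i)\,\pi(\theta_i)}{q(\theta_i)}, \qquad \theta_i \sim q,

a standard identity rather than the exact estimator of the paper; and since diffusive nested sampling produces correlated MCMC output from q rather than iid draws, the estimate is justified as an ergodic average.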
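
As for the diminishing adaptation condition invoked in the second comment, here is a toy illustration: a generic adaptive random-walk Metropolis sampler (not the scheme of the paper) whose proposal scale keeps being jiggled, but by O(1/t) amounts that vanish as the iteration count t grows, which is the regime covered by the condition.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_target(x):
    return -0.5 * x ** 2          # standard normal target, for illustration

x = 0.0
scale = 1.0                        # adapted random-walk proposal scale
samples = []

for t in range(1, 50001):
    # Plain Metropolis step with the current proposal scale.
    prop = x + scale * rng.normal()
    accepted = np.log(rng.uniform()) < log_target(prop) - log_target(x)
    if accepted:
        x = prop
    samples.append(x)

    # Diminishing adaptation: nudge the scale towards a 0.234 acceptance
    # rate with an O(1/t) step, so the kernel changes less and less and
    # ergodicity is preserved.
    scale *= np.exp((float(accepted) - 0.234) / t)

print("final scale:", scale, "posterior mean estimate:", np.mean(samples[10000:]))
```

Freezing the adaptation (dropping the last line after some burn-in) recovers an exactly Markovian sampler, the cruder but safer alternative.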
