## dynamic nested sampling for stars

Posted in Books, pictures, Statistics, Travel on April 12, 2019 by xi'an

Following earlier nested sampling packages like MultiNest, Joshua Speagle has written a new package called dynesty that manages dynamic nested sampling, primarily intended for astronomical applications, which is the field where nested sampling is the most popular. One of the first remarks in the paper is that nested sampling can be more easily implemented by using a Uniform reparameterisation of the prior, that is, a reparameterisation that turns the prior into a Uniform over the unit hypercube. Which means in the end that the prior distribution can be generated from a fixed vector of uniforms and known transforms. Maybe not such an issue given that this is the prior after all. The author considers this makes sampling under the likelihood constraint a much simpler problem, but it all depends in the end on the concentration of the likelihood within the unit hypercube, and on the ability to reach the higher likelihood slices. I did not see any special trick when looking at the documentation, but reflected on the fundamental connection between nested sampling and this ability. As in the original proposal by John Skilling (2006), the slice volumes are “estimated” by simulated Beta order statistics, with no connection with the actual sequence of simulations or the problem at hand. We did point out our incomprehension of such a scheme in our Biometrika paper with Nicolas Chopin. As in earlier versions, the algorithm attempts to visualise the slices by different bounding techniques, before proceeding to explore the bounded regions by several exploration algorithms, including HMC.
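To make the Uniform reparameterisation concrete, here is a minimal Python sketch of a map from the unit hypercube to the parameter space, built from inverse cdfs. The three priors (a Normal, an Exponential and a Uniform) and the name `prior_transform` are assumptions chosen for illustration, not taken from the paper; dynesty does expect a user-supplied function of this kind as its prior transform.

```python
import math
from statistics import NormalDist

# Hypothetical priors, chosen only for illustration:
#   mu    ~ Normal(0, 10^2)
#   scale ~ Exponential(rate = 2)
#   theta ~ Uniform(-5, 5)
MU_PRIOR = NormalDist(mu=0.0, sigma=10.0)


def prior_transform(u):
    """Map a point u of the open unit cube (0, 1)^3 to (mu, scale, theta).

    Each coordinate goes through the inverse cdf of its own prior, so a
    Uniform draw over the cube becomes a draw from the (independent) prior.
    """
    mu = MU_PRIOR.inv_cdf(u[0])             # Normal quantile function
    scale = -math.log(1.0 - u[1]) / 2.0     # Exponential(2) quantile function
    theta = -5.0 + 10.0 * u[2]              # Uniform(-5, 5) quantile function
    return [mu, scale, theta]
```

Note that the transform requires each prior quantile function to be available (or cheaply approximated), and that dependent priors would need conditional quantiles rather than this coordinate-wise version.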

“As with any sampling method, we strongly advocate that Nested Sampling should not be viewed as being strictly “better” or “worse” than MCMC, but rather as a tool that can be more or less useful in certain problems. There is no “One True Method to Rule Them All”, even though it can be tempting to look for one.”

When introducing the dynamic version, the author lists three drawbacks of the static (original) version. One is the reliance on this transform of a Uniform vector over a hypercube. Another one is that the overall runtime is highly sensitive to the choice of the prior. (If simulating from the prior rather than an importance function, as suggested in our paper.) A third one is that nested sampling is impervious to the final goal, evidence approximation versus posterior simulation, i.e., uses a constant rate of prior integration. The dynamic version simply modifies the number of points simulated in each slice, according to the (relative) increase in evidence provided by the current slice, estimated through iterations. This makes nested sampling a sort of inverted Wang-Landau since it sharpens the difference between slices. (The dynamic aspects for estimating the volumes of the slices and the stopping rule may hinder convergence in unclear ways, which is not discussed in the paper.) Among the many examples produced in the paper is a 200-dimensional Normal target, which is an interesting object for posterior simulation in that most of the posterior mass rests on a ring away from the maximum of the likelihood. But this does not seem to merit a mention in the discussion. Another example of heterogeneous regression favourably compares dynesty with MCMC in terms of ESS (but fails to include an HMC version).
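The ring phenomenon for the 200-dimensional Normal target is easy to check by simulation. The sketch below assumes a standard Normal N(0, I) in dimension 200 (the exact target used in the paper may differ) and summarises the distance of the draws from the mode; the function name `radius_summary` is only illustrative.

```python
import math
import random


def radius_summary(dim=200, n_draws=2000, seed=0):
    """Simulate N(0, I_dim) draws and summarise their Euclidean norms.

    Returns (mean radius, standard deviation of the radius, smallest radius).
    """
    rng = random.Random(seed)
    radii = [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)))
        for _ in range(n_draws)
    ]
    mean = sum(radii) / n_draws
    sd = math.sqrt(sum((r - mean) ** 2 for r in radii) / (n_draws - 1))
    return mean, sd, min(radii)


mean_r, sd_r, min_r = radius_summary()
print(f"mean radius {mean_r:.2f}, sd {sd_r:.2f}, smallest radius {min_r:.2f}")
```

The radius follows a chi distribution with 200 degrees of freedom, hence concentrates around √200 ≈ 14.1 with a spread well below one, so that essentially no draw falls anywhere near the mode at the origin.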

[Breaking News: Although I wrote this post before the exciting first image of the black hole in M87 was made public and hence before I was aware of it, the associated ApJL paper mentions relying on dynesty for comparing several physical models of the phenomenon by nested sampling.]

## MCMSki [day 2]

Posted in Mountains, pictures, Statistics, University life on January 8, 2014 by xi'an

I was still feeling poorly this morning with my brain in a kind of flu-induced haze so could not concentrate for a whole talk, which is a shame as I missed most of the contents of the astrostatistics session put together by David van Dyk… Especially the talk by Roberto Trotta I was definitely looking forward to. And the defence of nested sampling strategies for marginal likelihood approximations. Even though I spotted posterior distributions for WMAP and Planck data on the ΛCDM that reminded me of our own work in this area… Apologies thus to all speakers for dozing in and out, it was certainly not due to a lack of interest!

Sebastian Seehars mentioned emcee (for ensemble Monte Carlo), with corresponding software nicknamed “the MCMC hammer”, and their own CosmoHammer software. I read the paper by Goodman and Weare (2010) this afternoon during the ski break (if not on a ski lift!). Actually, I do not understand why an MCMC sampler should be affine invariant: a good adaptive MCMC sampler should anyway pick up the right scale of the target distribution. Other than that, the ensemble sampler reminds me very much of the pinball sampler we developed with Kerrie Mengersen (1995 Valencia meeting), where the target is the product of L targets,

$\pi(x_1)\cdots\pi(x_L)$

and a Gibbs-like sampler can be constructed, moving one component (with index k, say) of the L-sample at a time. (Just as in the pinball sampler.) Rather than avoiding all other components (as in the pinball sampler), Goodman and Weare draw a single other component at random (with index j, say) and make a proposal away from it:

$\eta=x_j(t) + \zeta \{x_k(t)-x_j(t)\}$

where ζ is a scale random variable with (log-) symmetry around 1. The authors claim improvement over a single track Metropolis algorithm, but it of course depends on the type of Metropolis algorithm that is chosen… Overall, I think the criticism of the pinball sampler also applies here: using a product of targets can only slow down the convergence. Further, the affine structure of the target support is not a given. Highly constrained settings should not cope well with linear transforms and non-linear reparameterisations would be more efficient…
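For readers who want to see the move in action, here is a minimal Python sketch of the stretch move of Goodman and Weare, using the proposal η above. Several ingredients are assumptions on my part rather than details from the talk: ζ is drawn with density proportional to 1/√ζ on [1/a, a] with a = 2, the acceptance probability includes the factor ζ^(d−1) in dimension d, and the target is a standard bivariate Normal picked purely for illustration. The function names are hypothetical.

```python
import math
import random


def log_target(x):
    # Illustrative target: standard Normal log-density, up to a constant.
    return -0.5 * sum(xi * xi for xi in x)


def stretch_sweep(walkers, log_pi, rng, a=2.0):
    """Update every walker once, in place; return the number of accepted moves."""
    n_walkers, dim = len(walkers), len(walkers[0])
    accepted = 0
    for k in range(n_walkers):
        # Pick a single other walker j != k uniformly at random.
        j = rng.randrange(n_walkers - 1)
        if j >= k:
            j += 1
        # Draw zeta with density proportional to 1/sqrt(z) on [1/a, a].
        zeta = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        # Proposal eta = x_j + zeta * (x_k - x_j).
        eta = [xj + zeta * (xk - xj) for xj, xk in zip(walkers[j], walkers[k])]
        # Log acceptance ratio, including the zeta^(dim - 1) volume factor.
        log_ratio = (dim - 1) * math.log(zeta) + log_pi(eta) - log_pi(walkers[k])
        if math.log(1.0 - rng.random()) < log_ratio:
            walkers[k] = eta
            accepted += 1
    return accepted


def run_ensemble(n_walkers=20, dim=2, n_sweeps=3000, burn_in=500, seed=1):
    """Run the ensemble and return the first coordinate of all post-burn-in walkers."""
    rng = random.Random(seed)
    walkers = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_walkers)]
    draws = []
    for t in range(n_sweeps):
        stretch_sweep(walkers, log_target, rng)
        if t >= burn_in:
            draws.extend(w[0] for w in walkers)
    return draws
```

Calling `run_ensemble()` returns draws whose empirical mean and variance can be compared with 0 and 1. Since each proposal lies on the line through two walkers, an affine transform of the target simply transforms the whole ensemble, which is the invariance property discussed above.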