Archive for Og

unbiased consistent nested sampling via sequential Monte Carlo [a reply]

Posted in pictures, Statistics, Travel on June 13, 2018 by xi'an

Rob Salomone sent me the following reply to my comments of yesterday about their recently arXived paper.

Our main goal in the paper was to show that Nested Sampling (when interpreted a certain way) is really just a member of a larger class of SMC algorithms, and to explore the consequences of that. We should point out that the section regarding calibration applies generally to SMC samplers, and hope that people give those techniques a try regardless of their chosen SMC approach.
Regarding your question about “whether or not it makes more sense to get completely SMC and forego any nested sampling flavour!”, this is an interesting point. After all, if Nested Sampling is just a special form of SMC, why not just use more standard SMC approaches? It seems that Nested Sampling's main advantage is its ability to cope with problems that have “phase transition”-like behaviour, and it is thus robust to a wider range of difficult problems than annealing approaches. Nevertheless, we hope this way of looking at NS (and showing that there may be variations of SMC with certain advantages) leads to improved NS and SMC methods down the line.
Regarding your post, I should clarify a point regarding unbiasedness. The largest likelihood bound is actually set to infinity. Thus, for the fixed version of NS-SMC, one has an unbiased estimator of the “final” band. Choosing a final band prematurely will of course result in very high variance. However, the estimator is unbiased. For example, consider NS-SMC with only one stratum. Then the method reduces to simply using the prior as an importance sampling distribution for the posterior (unbiased, but often high variance).
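[To make this one-stratum case concrete, here is a quick sketch on a toy Gaussian model of my own choosing (standard normal prior, Gaussian likelihood with y=3, where the evidence is known in closed form); an illustration, not the authors' code:]

import numpy as np

# One-stratum NS-SMC reduces to importance sampling with the prior as
# proposal: average the likelihood over prior draws. Unbiased for Z, but
# high variance whenever prior and posterior are far apart.
rng = np.random.default_rng(0)
y, n = 3.0, 10_000
theta = rng.normal(size=n)                            # draws from the prior
like = np.exp(-0.5 * (y - theta) ** 2) / np.sqrt(2 * np.pi)
print(like.mean())                                    # unbiased estimate of Z
print(np.exp(-0.25 * y ** 2) / np.sqrt(4 * np.pi))    # exact Z: N(y; 0, 2)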
Comments related to two specific parts of your post are below (your comments in italicised bold):
“Which never occurred as the number one difficulty there, as the simplest implementation runs a Markov chain from the last removed entry, independently from the remaining entries. Even stationarity is not an issue since I believe that the first occurrence within the level set is distributed from the constrained prior.”
This is an interesting point that we had not considered! In practice, and in many papers that apply Nested Sampling with MCMC, the common approach is to start the MCMC at one of the randomly selected “live points”, so the discussion related to independence was in regard to these common implementations.
Regarding starting the chain from outside of the level set: this is likely not done in practice, as it introduces the additional difficulty of needing to propose a sample inside the required region (Metropolis-Hastings has non-zero probability of returning a sample that is still outside the constrained region for any fixed number of iterations). Forcing the continuation of MCMC until a valid point is proposed will, I believe, subtly violate detailed balance. Of course, the bias of such a modification may be small in practice, but it is an additional awkwardness introduced by the requirement of sample independence!
“And then, in a twist that is not clearly explained in the paper, the focus moves to an improved nested sampler that moves one likelihood value at a time, with a particle step replacing a single particle. (Things get complicated when several particles may take the very same likelihood value, but randomisation helps.) At this stage the algorithm is quite similar to the original nested sampler. Except for the unbiased estimation of the constants, the final constant, and the replacement of exponential weights exp(-t/N) by powers of (N-1)/N”
Thanks for pointing out that this isn't clear; we will try to do better in the next revision! The goal of this part of the paper wasn't necessarily to propose a new version of nested sampling. Our focus here was to demonstrate that NS-SMC is not simply the Nested Sampling idea with an SMC twist, but that the original NS algorithm with MCMC (and restarting the MCMC sampling at one of the “live points” as people do in practice) actually is a special case of SMC (with the weights replaced with a suboptimal choice).
The most curious thing is that, as you note, the estimates of remaining prior mass in the SMC context come out as powers of (N-1)/N and not exp(-t/N). Walter (2017) shows that the former choice is actually superior in terms of bias and variance. It was a nice touch that the superior choice of weights came out naturally in the SMC interpretation!
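[A quick numerical look at the two compressions, with N=100 and a few values of t picked arbitrarily:]

import numpy as np

# Estimated remaining prior mass after t levels with N live points:
# classic NS uses exp(-t/N), the SMC interpretation gives ((N-1)/N)**t.
N = 100
for t in (100, 200, 500):
    print(t, np.exp(-t / N), ((N - 1) / N) ** t)

[Since ((N-1)/N)^t = exp(t log(1-1/N)) lies slightly below exp(-t/N), the SMC weights compress a little faster, with a gap that grows with the number of levels.]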
That said, as the fixed version of NS-SMC is the one with the unbiasedness and consistency properties, this was the version we used in the main statistical examples.

unbiased consistent nested sampling via sequential Monte Carlo

Posted in pictures, Statistics, Travel on June 12, 2018 by xi'an

“Moreover, estimates of the marginal likelihood are unbiased.” (p.2)

Rob Salomone, Leah South, Chris Drovandi and Dirk Kroese (from QUT and UQ, Brisbane) recently arXived a paper that frames nested sampling in such a way that marginal likelihoods can be unbiasedly (and consistently) estimated.

“Why isn’t nested sampling more popular with statisticians?” (p.7)

A most interesting question, especially given its popularity in cosmology and other branches of physics. A first drawback pointed out in the paper is the requirement of independence between the elements of the sample produced at each iteration. Which never occurred as the number one difficulty there, as the simplest implementation runs a Markov chain from the last removed entry, independently from the remaining entries. Even stationarity is not an issue since I believe that the first occurrence within the level set is distributed from the constrained prior.
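For concreteness, here is a minimal sketch of that simplest implementation, on a toy Gaussian model of my own (standard normal prior, Gaussian likelihood); the tuning constants are arbitrary and this is not the authors' code:

import numpy as np

# Metropolis kernel targeting the prior restricted to the level set
# {theta : L(theta) > L_min}, started at the deleted (boundary) particle.
rng = np.random.default_rng(1)
log_prior = lambda th: -0.5 * th ** 2                # N(0,1) prior, unnormalised
log_like = lambda th, y=3.0: -0.5 * (y - th) ** 2    # Gaussian log-likelihood

def constrained_move(theta, log_L_min, n_steps=50, scale=0.5):
    # Random-walk Metropolis for the prior truncated to the level set;
    # proposals falling below the likelihood threshold are simply rejected.
    for _ in range(n_steps):
        prop = theta + scale * rng.normal()
        if log_like(prop) > log_L_min and \
           np.log(rng.uniform()) < log_prior(prop) - log_prior(theta):
            theta = prop
    return theta

# start at the deleted point, i.e., with likelihood exactly at the bound
new_point = constrained_move(theta=1.0, log_L_min=log_like(1.0))
print(new_point, log_like(new_point) > log_like(1.0))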

A second difficulty is the use of quadrature, which turns the integrand into a step function at random slices. Indeed, mixing Monte Carlo with numerical integration makes life much harder, as shown by the early avatars of nested sampling that only accounted for the numerical errors. (And which caused Nicolas and me to write our critical paper in Biometrika.) There are few studies of that kind in the literature, the only one I can think of being [my former PhD student] Anne Philippe's thesis twenty years ago.
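In code, the step-function quadrature reads as below, again on my toy Gaussian model (all settings assumed for illustration), with the deterministic masses X_i = exp(-i/N) and a final truncation term of the kind discussed next:

import numpy as np

# Nested sampling replaces Z = ∫ L dπ by the step-function sum
# Z ≈ Σ_i L_i (X_{i-1} - X_i), with estimated masses X_i = exp(-i/N).
rng = np.random.default_rng(2)
y, N, T = 3.0, 100, 600
log_like = lambda th: -0.5 * (y - th) ** 2 - 0.5 * np.log(2 * np.pi)

live = rng.normal(size=N)                    # N live points from the prior
Z, X_prev = 0.0, 1.0
for i in range(1, T + 1):
    worst = np.argmin(log_like(live))        # lowest-likelihood live point
    threshold = log_like(live[worst])
    X = np.exp(-i / N)                       # estimated remaining prior mass
    Z += np.exp(threshold) * (X_prev - X)    # one step of the quadrature
    X_prev = X
    while True:                              # refresh by rejection from the
        prop = rng.normal()                  # constrained prior (toy only)
        if log_like(prop) > threshold:
            live[worst] = prop
            break
Z += X_prev * np.exp(log_like(live)).mean()  # truncation term at stopping
print(Z, np.exp(-0.25 * y ** 2) / np.sqrt(4 * np.pi))  # versus exact N(3; 0, 2)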

The third issue stands with the difficulty in parallelising the method. Except by jumping k points at once, rather than going one level at a time. While I agree this makes life more complicated, I am also unsure about the severity of that issue as k nested sampling algorithms can be run in parallel and aggregated in the end, from simple averaging to something more elaborate.
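As a sketch of that simple averaging (my own construction, not a scheme from the paper), the toy run above can be wrapped into a function and farmed out to k parallel processes:

import numpy as np
from concurrent.futures import ProcessPoolExecutor

# k independent nested-sampling runs, executed in parallel and combined
# by plain averaging of the k evidence estimates.
def ns_evidence(seed, y=3.0, N=100, T=600):
    rng = np.random.default_rng(seed)
    log_like = lambda th: -0.5 * (y - th) ** 2 - 0.5 * np.log(2 * np.pi)
    live, Z, X_prev = rng.normal(size=N), 0.0, 1.0
    for i in range(1, T + 1):
        worst = np.argmin(log_like(live))
        threshold = log_like(live[worst])
        X = np.exp(-i / N)
        Z += np.exp(threshold) * (X_prev - X)
        X_prev = X
        while True:
            prop = rng.normal()
            if log_like(prop) > threshold:
                live[worst] = prop
                break
    return Z + X_prev * np.exp(log_like(live)).mean()

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        runs = list(pool.map(ns_evidence, range(8)))    # k = 8 runs
    print(np.mean(runs), np.std(runs) / np.sqrt(len(runs)))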

The final blemish is that the nested sampling estimator has a stopping mechanism that induces a truncation error, again maybe a lesser problem given the overall difficulty in assessing the total error.

The paper takes advantage of the ability of SMC to produce unbiased estimates of a sequence of normalising constants (or of the normalising constants of a sequence of targets). For nested sampling, the sequence is made of the prior distribution restricted to an embedded sequence of level sets. With another sequence restricted to bands (likelihood between two likelihood boundaries). If all restricted posteriors of the second kind and their normalising constants are known, the full posterior is known. Apparently up to the main normalising constant, i.e., the marginal likelihood, except that it is also the sum of all normalising constants. Handling this sequence by SMC addresses the four concerns of the four authors, apart from the truncation issue, since the largest likelihood bound needs to be set for running the algorithm.
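A rough sketch of the fixed-threshold version, once more on my toy Gaussian model, with hand-picked likelihood bounds and an exact rejection refresh standing in for the paper's resampling-plus-MCMC moves (all of these are illustrative assumptions, not the paper's examples):

import numpy as np

# Fixed-threshold NS-SMC in miniature: particles follow the prior
# constrained to nested level sets, survival fractions chain into
# prior-mass estimates, and the evidence is the sum of the per-band
# contributions (the last band running up to L = infinity).
rng = np.random.default_rng(3)
y, N = 3.0, 5_000
log_like = lambda th: -0.5 * (y - th) ** 2 - 0.5 * np.log(2 * np.pi)
levels = [-8.0, -4.0, -2.5, np.inf]          # hand-picked l_1 < ... < l_T

theta = rng.normal(size=N)                   # stage 0: draws from the prior
mass, Z = 1.0, 0.0                           # running prior mass and evidence
for l in levels:
    ll = log_like(theta)
    surv = ll > l                            # particles clearing the new level
    band = ~surv                             # particles left in the band
    if band.any():                           # band mass x mean likelihood there
        Z += mass * band.mean() * np.exp(ll[band]).mean()
    if not np.isfinite(l):                   # final band: nothing survives
        break
    mass *= surv.mean()                      # chained prior-mass estimate
    theta = theta[surv]                      # refresh by exact rejection from
    while theta.size < N:                    # the constrained prior (toy only;
        prop = rng.normal(size=N)            # the paper resamples and moves)
        theta = np.concatenate([theta, prop[log_like(prop) > l]])
    theta = theta[:N]
print(Z, np.exp(-0.25 * y ** 2) / np.sqrt(4 * np.pi))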

When the sequence of likelihood bounds is chosen based on the observed likelihoods so far, the method becomes adaptive. Requiring again the choice of a stopping rule that may induce bias if stopping occurs too early. And then, in a twist that is not clearly explained in the paper, the focus moves to an improved nested sampler that moves one likelihood value at a time, with a particle step replacing a single particle. (Things get complicated when several particles may take the very same likelihood value, but randomisation helps.) At this stage the algorithm is quite similar to the original nested sampler. Except for the unbiased estimation of the constants, the final constant, and the replacement of exponential weights exp(-t/N) by powers of (N-1)/N.

The remainder of this long paper (61 pages!) is dedicated to practical implementation, calibration and running a series of comparisons. A nice final touch is the thanks to the ‘Og for its series of posts on nested sampling, which “helped influence this work, and played a large part in inspiring it.”

In conclusion, this paper is certainly a worthy exploration of the nested sampler, providing further arguments towards a consistent version, with first and foremost an (almost?) unbiased resolution. The comparison with a wide range of alternatives remains open, in particular time-wise, if evidence is the sole target of the simulation. For instance, the choice of this sequence of targets in an SMC may be improved by another sequence, since changing one particle at a time does not sound efficient. The complexity of the implementation, and in particular of the simulation from the prior under more and more stringent constraints, needs to be addressed.

surg’Og interest from Serbia

Posted in Statistics on January 15, 2018 by xi'an

Amazonish warning

Posted in Books, Kids, R, Statistics on March 11, 2016 by xi'an

As in previous years, I want to repost a warning to 'Og readers that all http links to Amazon.com [and much more rarely to Amazon.fr] products found on this 'Og are actually susceptible to reward me with an advertising percentage if a purchase is made by the reader in the 24 hours following the entry on Amazon through this link, thanks to the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to amazon.com/fr. Here are some of the most 'Og-unrelated purchases, less exotic than in previous years (maybe because of a lower number of items):

Once again, books I reviewed, positively or negatively, were among the top purchases… Like a dozen Monte Carlo simulation and resampling methods for social science, half a dozen Statistics done wrong, and Think Bayes: Bayesian Statistics Made Simple [which also attracted a lot of visits when I reviewed it], and still a few copies of Naked Statistics (despite a most critical review). Thanks to all readers activating those links and feeding further my book addiction, with the drawback of inducing even more book reviews on the 'Og…

no country for ‘Og snaps?!

Posted in Mountains, pictures, Travel on August 30, 2015 by xi'an

A few days ago, I got an anonymous comment complaining about my tendency to post pictures “no one is interested in” on the 'Og and suggesting I move them to another electronic medium like Twitter or Instagram, so as to avoid readers having to sort through the blog entries for the statistics-related ones, to separate the wheat from the chaff… While my first reaction was (unsurprisingly) one of irritation, a more constructive one is to point out to all (un)interested readers that they can always subscribe by RSS to the Statistics category (and skip the chaff), just as R bloggers only reposts my R-related entries. Now, if more of the 'Og's readers find the presumably increasing flow of pictures a nuisance, just let me know and I will try to curb this avalanche of pixels… Not certain that I succeed, though!

amazonish thanks (& repeated warning)

Posted in Books, Kids, R, Statistics on December 9, 2014 by xi'an

As in previous years, at about this time, I want to (re)warn unaware 'Og readers that all links to Amazon.com and more rarely to Amazon.fr found on this blog are actually susceptible to earn me an advertising percentage if a purchase is made by the reader in the 24 hours following the entry on Amazon through this link, thanks to the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to amazon.com/fr. Unlike last year, I did not benefit as much from the new edition of Andrew's book, and the link he copied from my blog entry… Here are some of the most 'Og-unrelated purchases:

Once again, books I reviewed, positively or negatively, were among the top purchases… Like a dozen Monte Carlo simulation and resampling methods for social science, a few copies of Naked Statistics. And again a few of The Cartoon Introduction to Statistics. (Despite a most critical review.) Thanks to all of you using those links and feeding further my book addiction, with the drawback of inducing even more fantasy book reviews.

amazonly associated thanks (& warnin’)

Posted in Books, Kids, R, Statistics on December 8, 2013 by xi'an

Following a now well-established pattern, let me (re)warn (the few) unwary 'Og readers that the links to Amazon.com and to Amazon.fr found on this blog are actually susceptible to earn me a monetary gain [from 4% to 8% on the sales] if a purchase is made by the reader in the 24 hours following the entry on Amazon through this link, thanks to the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to amazon.com/fr. Unlike the pattern of last year, and of the year before last, the most purchased item through the links happens to be related to a blog post, since it is Andrew's book, with 318 copies of its third edition sold through the 'Og last month! Here are some of the most exotic purchases:

As usual the books I actually reviewed along the past months, positively or negatively, were among the top purchases… Like two dozen copies of The BUGS book. And a dozen of R for dummies. And even a few of The Cartoon Introduction to Statistics. (Despite a most critical review.) Thanks to all of you using those links (for feeding further my book addiction, books that now eventually end up in the math common room in Dauphine or Warwick, once I have read them)!