Archive for uniformity test

nested sampling X check

Posted in Books, Mountains, pictures, Statistics with tags French, nested sampling, pardon my French!, plateau, pseudo-random generators, rank test, uniformity test on September 18, 2020 by xi'an

Andrew Fowlie, Will Handley and Liangliang Su have recently arXived a new paper on checking the convergence of nested sampling by a uniformity test. The argument goes as follows: if the draw from the prior under the likelihood restriction (at the core of the nested sampling principle) is correctly generated, the rank of the realised value of the associated likelihood should be uniformly distributed among the remaining likelihoods. Obviously, the converse does not hold: a perfectly uniform distribution can occur even when the sampler misses a particularly well-hidden mode of the target distribution, or when it systematically stops too early, using for instance a misspecified bound on the likelihood.

One particular setting where uniformity fails is when the likelihood surface plateaus in some region of the parameter space. (As a French speaker, writing plateaus makes me cringe since the plural of plateau is plateaux! Pardon my French!) When reaching the plateau, the algorithm starts accumulating at the limiting value (or else completely ignores the plateau and its prior mass). I actually wonder if the existence of plateaux is not a sufficient reason for invalidating nested sampling, at least in its original version, since it assumes a continuous distribution on the likelihood values…

If no plateau comes to hinder the algorithm, the rank test could be used to calibrate the exploration algorithm, for instance to determine the number of MCMC steps, running T random walks in parallel until the rank test across these runs turns green. The authors of the paper suggest using a Kolmogorov-Smirnov test, which strikes me as not the most appropriate solution, given the discrete nature of the theoretical distribution and the existence of uniformity tests in the pseudo-random generation literature. At a conceptual level, I am also wondering about the sequential use of the test (as opposed to a parallel version at each iteration), since the target distribution changes at every step (and so does the approximate method used to reproduce the prior simulation under the likelihood restriction).
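To fix ideas, here is a minimal sketch of the insertion-rank check on a toy Gaussian target, using an exact rejection-based draw from the constrained prior rather than the MCMC exploration an actual nested sampler would rely on, and a binned chi-squared test in place of Kolmogorov-Smirnov (which sidesteps the discreteness issue mentioned above); the names and settings are mine, not the authors':

```python
# Minimal sketch of the insertion-rank uniformity check on a toy 2-d Gaussian
# likelihood; the rejection step below is an exact constrained-prior draw, so
# the ranks should pass the test. The settings are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_live, n_iter = 50, 400

def log_lik(theta):
    return -0.5 * np.sum(theta**2, axis=-1)

# initial live points drawn from a U(-5,5)^2 prior
live = rng.uniform(-5, 5, size=(n_live, 2))
live_ll = log_lik(live)
insertion_ranks = []

for _ in range(n_iter):
    worst = np.argmin(live_ll)
    threshold = live_ll[worst]
    # draw from the prior restricted to L(theta) > threshold (rejection sampling;
    # a short MCMC explorer would replace this loop in a real implementation)
    while True:
        prop = rng.uniform(-5, 5, size=2)
        prop_ll = log_lik(prop)
        if prop_ll > threshold:
            break
    # rank of the new likelihood among the surviving live points: uniform on
    # {0,...,n_live-1} when the constrained-prior simulation is correct
    rank = int(np.sum(live_ll[np.arange(n_live) != worst] < prop_ll))
    insertion_ranks.append(rank)
    live[worst], live_ll[worst] = prop, prop_ll

# bin the discrete ranks into a modest number of cells and run a chi-squared test
counts, _ = np.histogram(insertion_ranks, bins=10, range=(0, n_live))
chi2, pval = stats.chisquare(counts)
print(f"chi-squared uniformity p-value: {pval:.3f}")
```

Replacing the rejection step with a random walk stopped too early should make the test fail, which is precisely the calibration use suggested above.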
certified randomness, 187m away…

Posted in Statistics with tags Bell inequality, Nature, quantum computers, random number generation, randomness, RNG, total variation, uniformity test on May 3, 2018 by xi'an

As rarely happens with Nature, I just read an article that directly relates to my research interests, about a secure physical random number generator (RNG). By Peter Bierhorst and co-authors, mostly physicists apparently. Security here means that the outcome of the RNG is unpredictable. This very peculiar RNG is based on two correlated photons sent to two measuring stations, separated by at least 187m, which have to display unpredictable outcomes in order to respect the impossibility of faster-than-light communication, otherwise known as Bell inequalities. This is hardly practical though, especially when mentioning that the authors managed to produce 2¹⁰ random bits over 10 minutes, post-processing “the measurement of 55 million photon pairs”. (I however fail to see why the two-arm apparatus would be needed for regular random generation, as it seems relevant solely for the demonstration of randomness.) I also checked the associated supplementary material, which is mostly about proving a total variation bound and constructing a Bell function. What is most puzzling in this paper (and the associated supplementary material) is the (apparent) lack of guarantee of uniformity of the RNG. For instance, a sentence (Supplementary Material, p.11) about a distribution being “within TV distance of uniform” hints at the method being not provably uniform, which makes the whole exercise incomprehensible…
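For what it is worth, here is a back-of-the-envelope illustration (with simulated bits, not the paper's output) of what a total variation distance from uniform means for short blocks of RNG output:

```python
# Back-of-the-envelope illustration (not from the paper): compare the empirical
# distribution of k-bit blocks of an RNG output with the uniform distribution,
# via total variation distance. The bits here are simulated stand-ins.
import numpy as np

rng = np.random.default_rng(1)
bits = rng.integers(0, 2, size=2**10)          # stand-in for the 2^10 certified bits
k = 4                                          # block length
blocks = bits[: len(bits) // k * k].reshape(-1, k)
labels = blocks.dot(1 << np.arange(k)[::-1])   # map each block to an integer in 0..2^k-1

emp = np.bincount(labels, minlength=2**k) / len(labels)
tv = 0.5 * np.abs(emp - 1 / 2**k).sum()        # TV distance = half the L1 distance
print(f"empirical TV distance from uniform over {k}-bit blocks: {tv:.3f}")
```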
checking ABC convergence via coverage
Posted in pictures, Statistics, Travel, University life with tags ABC, arXiv, Bayesian calibration, confidence sets, credible intervals, DIYABC, p-values, uniformity test on January 24, 2013 by xi'an

Dennis Prangle, Michael Blum, G. Popovic and Scott Sisson just arXived a paper on validating ABC via coverage diagnostics. Valid approximation diagnostics for ABC are clearly and badly needed, and this was the last slide of my talk yesterday at the Winter Workshop in Gainesville. When simulation time is not an issue (!), our DIYABC software does implement a limited coverage assessment by computing the type I error, i.e. by simulating data under the null model and evaluating the number of times it is rejected at the 5% level (see sections 2.11.3 and 3.8 in the documentation). The current paper builds on a similar perspective.
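For illustration, here is a minimal sketch of this type of check, with a hypothetical run_abc_test() placeholder standing in for the actual ABC-based test implemented in DIYABC:

```python
# Minimal sketch (not DIYABC code) of the type-I-error check described above:
# simulate data under the null model, run the test, and count how often the
# null is rejected at the 5% level. run_abc_test() is a hypothetical stand-in.
import numpy as np

rng = np.random.default_rng(2)

def run_abc_test(y):
    """Hypothetical placeholder returning a p-value (or posterior-based score)
    for the null model given data y; replace with the actual ABC-based test."""
    return rng.uniform()

n_rep, alpha = 500, 0.05
rejections = 0
for _ in range(n_rep):
    y_null = rng.normal(size=50)        # pseudo-data simulated under the null model
    if run_abc_test(y_null) < alpha:
        rejections += 1

print(f"estimated type I error: {rejections / n_rep:.3f} (nominal {alpha})")
```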
The idea in the paper is that a (Bayesian) credible interval at a given credible level α should have a similar confidence level (at least asymptotically, and even more so for matching priors) and that simulating pseudo-data with a known parameter value allows for a Monte Carlo evaluation of the credible interval's “true” coverage, hence for a calibration of the tolerance. The delicate issue is the generation of those “known” parameters. For instance, if the pair (θ, y) is generated from the joint distribution prior × likelihood, and if the credible region is also based on the true posterior, the average coverage is the nominal one. On the other hand, if the credible interval is based on a poor (ABC) approximation to the posterior, the average coverage should differ from the nominal one. Given that ABC is always wrong, however, this may fail to be a powerful diagnostic. In particular, when using insufficient (summary) statistics, the discrepancy should make testing for uniformity harder, shouldn't it?
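To make the coverage argument concrete, here is a minimal sketch on a toy Normal-Normal example, with a crude ABC rejection step standing in for a full ABC run; the tolerance, summary statistic, and sample sizes are illustrative choices of mine, not taken from the paper:

```python
# Minimal sketch of the coverage diagnostic on a N(theta,1) model with a N(0,1)
# prior: draw "known" parameters from the prior, simulate pseudo-data, run a crude
# ABC rejection step, and check the coverage of the 95% credible interval.
import numpy as np

rng = np.random.default_rng(3)
n_rep, n_obs, n_abc, tol = 200, 20, 5000, 0.1

ranks = []   # ABC-posterior quantile of the "true" parameter value
for _ in range(n_rep):
    theta0 = rng.normal()                        # known parameter drawn from the prior
    y0 = rng.normal(theta0, 1, size=n_obs)       # pseudo-observed data
    s0 = y0.mean()                               # summary statistic
    # ABC rejection: keep prior draws whose simulated summary is within tol of s0
    thetas = rng.normal(size=n_abc)
    ss = rng.normal(thetas, 1 / np.sqrt(n_obs))  # sampling distribution of the mean
    keep = thetas[np.abs(ss - s0) < tol]
    if keep.size:
        ranks.append(np.mean(keep < theta0))

# if the ABC posterior were exact, these ranks would be roughly uniform on (0,1),
# and the equal-tailed 95% credible interval would have close to nominal coverage
ranks = np.array(ranks)
cov_95 = np.mean((ranks > 0.025) & (ranks < 0.975))
print(f"empirical coverage of the 95% credible interval: {cov_95:.3f}")
```

Degrading the summary statistic or inflating the tolerance in this sketch should pull the empirical coverage away from 95%, which is the calibration signal the paper exploits.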