Archive for astrostatistics

dynamic nested sampling for stars

Posted in Books, pictures, Statistics, Travel with tags , , , , , , , , , , , , , , , , , on April 12, 2019 by xi'an

In the sequel of earlier nested sampling packages, like MultiNest, Joshua Speagle has written a new package called dynesty that manages dynamic nested sampling, primarily intended for astronomical applications. Which is the field where nested sampling is the most popular. One of the first remarks in the paper is that nested sampling can be more easily implemented by using a Uniform reparameterisation of the prior, that is, a reparameterisation that turns the prior into a Uniform over the unit hypercube. Which means in fine that the prior distribution can be generated from a fixed vector of uniforms and known transforms. Maybe not such an issue given that this is the prior after all.  The author considers this makes sampling under the likelihood constraint a much simpler problem but it all depends in the end on the concentration of the likelihood within the unit hypercube. And on the ability to reach the higher likelihood slices. I did not see any special trick when looking at the documentation, but reflected on the fundamental connection between nested sampling and this ability. As in the original proposal by John Skilling (2006), the slice volumes are “estimated” by simulated Beta order statistics, with no connection with the actual sequence of simulation or the problem at hand. We did point out our incomprehension for such a scheme in our Biometrika paper with Nicolas Chopin. As in earlier versions, the algorithm attempts at visualising the slices by different bounding techniques, before proceeding to explore the bounded regions by several exploration algorithms, including HMC.

“As with any sampling method, we strongly advocate that Nested Sampling should not be viewed as being strictly“better” or “worse” than MCMC, but rather as a tool that can be more or less useful in certain problems. There is no “One True Method to Rule Them All”, even though it can be tempting to look for one.”

When introducing the dynamic version, the author lists three drawbacks for the static (original) version. One is the reliance on this transform of a Uniform vector over an hypercube. Another one is that the overall runtime is highly sensitive to the choice the prior. (If simulating from the prior rather than an importance function, as suggested in our paper.) A third one is the issue that nested sampling is impervious to the final goal, evidence approximation versus posterior simulation, i.e., uses a constant rate of prior integration. The dynamic version simply modifies the number of point simulated in each slice. According to the (relative) increase in evidence provided by the current slice, estimated through iterations. This makes nested sampling a sort of inversted Wang-Landau since it sharpens the difference between slices. (The dynamic aspects for estimating the volumes of the slices and the stopping rule may hinder convergence in unclear ways, which is not discussed by the paper.) Among the many examples produced in the paper, a 200 dimension Normal target, which is an interesting object for posterior simulation in that most of the posterior mass rests on a ring away from the maximum of the likelihood. But does not seem to merit a mention in the discussion. Another example of heterogeneous regression favourably compares dynesty with MCMC in terms of ESS (but fails to include an HMC version).

[Breaking News: Although I wrote this post before the exciting first image of the black hole in M87 was made public and hence before I was aware of it, the associated AJL paper points out relying on dynesty for comparing several physical models of the phenomenon by nested sampling.]


a book and two chapters on mixtures

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on January 8, 2019 by xi'an

The Handbook of Mixture Analysis is now out! After a few years of planning, contacts, meetings, discussions about notations, interactions with authors, further interactions with late authors, repeating editing towards homogenisation, and a final professional edit last summer, this collection of nineteen chapters involved thirty-five contributors. I am grateful to all participants to this piece of work, especially to Sylvia Früwirth-Schnatter for being a driving force in the project and for achieving a much higher degree of homogeneity in the book than I expected. I would also like to thank Rob Calver and Lara Spieker of CRC Press for their boundless patience through the many missed deadlines and their overall support.

Two chapters which I co-authored are now available as arXived documents:

5. Gilles Celeux, Kaniav Kamary, Gertraud Malsiner-Walli, Jean-Michel Marin, and Christian P. Robert, Computational Solutions for Bayesian Inference in Mixture Models
7. Gilles Celeux, Sylvia Früwirth-Schnatter, and Christian P. Robert, Model Selection for Mixture Models – Perspectives and Strategies

along other chapters

1. Peter Green, Introduction to Finite Mixtures
8. Bettina Grün, Model-based Clustering
12. Isobel Claire Gormley and Sylvia Früwirth-Schnatter, Mixtures of Experts Models
13. Sylvia Kaufmann, Hidden Markov Models in Time Series, with Applications in Economics
14. Elisabeth Gassiat, Mixtures of Nonparametric Components and Hidden Markov Models
19. Michael A. Kuhn and Eric D. Feigelson, Applications in Astronomy

improperties on an astronomical scale

Posted in Books, pictures, Statistics with tags , , , , , , , on December 15, 2017 by xi'an

As pointed out by Peter Coles on his blog, In the Dark, Hyungsuk Tak, Sujit Ghosh, and Justin Ellis just arXived a review of the unsafe use of improper priors in astronomy papers, 24 out of 75 having failed to establish that the corresponding posteriors are well-defined. And they exhibit such an instance (of impropriety) in a MNRAS paper by Pihajoki (2017), which is a complexification of Gelfand et al. (1990), also used by Jim Hobert in his thesis. (Even though the formal argument used to show the impropriety of the posterior in Pihajoki’s paper does not sound right since it considers divergence at a single value of a parameter β.) Besides repeating this warning about an issue that was rather quickly identified in the infancy of MCMC, if not in the very first publications on the Gibbs sampler, the paper seems to argue against using improper priors due to this potential danger, stating that instead proper priors that include all likely values and beyond are to be preferred. Which reminds me of the BUGS feature of using a N(0,10⁹) prior instead of the flat prior, missing the fact that “very large” variances do impact the resulting inference (if only for the issue of model comparison, remember Lindley-Jeffreys!). And are informative in that sense. However, it is obviously a good idea to advise checking for propriety (!) and using such alternatives may come as a safety button, providing a comparison benchmark to spot possible divergences in the resulting inference.

[Astrostat summer school] fogrise [jatp]

Posted in Kids, Mountains, pictures, Running, Statistics, Travel, University life with tags , , , , , , , , , on October 11, 2017 by xi'an

[Astrostat summer school] sunrise [jatp]

Posted in Statistics with tags , , , , , , , , , , , on October 10, 2017 by xi'an

[summer Astrostat school] room with a view [jatp]

Posted in Mountains, pictures, R, Running, Statistics, Travel, University life with tags , , , , , , , , , , on October 9, 2017 by xi'an

I just arrived in Autrans, on the Plateau du Vercors overlooking Grenoble and the view is fabulistic! Trees have started to turn red and yellow, the weather is very mild, and my duties are restricted to teaching ABC to a group of enthusiastic astronomers and cosmologists..! Second advanced course on ABC in the mountains this year, hard to beat (except by a third course). The surroundings are so serene and peaceful that I even conceded to install RStudio for my course! Instead of sticking to my favourite vim editor and line commands.

Bayesian methods in cosmology

Posted in Statistics with tags , , , , , , , , , , , , on January 18, 2017 by xi'an

A rather massive document was arXived a few days ago by Roberto Trotta on Bayesian methods for cosmology, in conjunction with an earlier winter school, the 44th Saas Fee Advanced Course on Astronomy and Astrophysics, “Cosmology with wide-field surveys”. While I never had the opportunity to give a winter school in Saas Fee, I will give next month a course on ABC to statistics graduates in another Swiss dream location, Les Diablerets.  And next Fall a course on ABC again but to astronomers and cosmologists, in Autrans, near Grenoble.

The course document is an 80 pages introduction to probability and statistics, in particular Bayesian inference and Bayesian model choice. Including exercises and references. As such, it is rather standard in that the material could be found as well in textbooks. Statistics textbooks.

When introducing the Bayesian perspective, Roberto Trotta advances several arguments in favour of this approach. The first one is that it is generally easier to follow a Bayesian approach when compared with seeking a non-Bayesian one, while recovering long-term properties. (Although there are inconsistent Bayesian settings.) The second one is that a Bayesian modelling allows to handle naturally nuisance parameters, because there are essentially no nuisance parameters. (Even though preventing small world modelling may lead to difficulties as in the Robbins-Wasserman paradox.) The following two reasons are the incorporation of prior information and the appeal on conditioning on the actual data.

trottaThe document also includes this above and nice illustration of the concentration of measure as the dimension of the parameter increases. (Although one should not over-interpret it. The concentration does not occur in the same way for a normal distribution for instance.) It further spends quite some space on the Bayes factor, its scaling as a natural Occam’s razor,  and the comparison with p-values, before (unsurprisingly) introducing nested sampling. And the Savage-Dickey ratio. The conclusion of this model choice section proposes some open problems, with a rather unorthodox—in the Bayesian sense—line on the justification of priors and the notion of a “correct” prior (yeech!), plus an musing about adopting a loss function, with which I quite agree.