Archive for Series A

handbook of mixture analysis [review]

Posted in Books, R, Statistics on March 19, 2021 by xi'an

“In my opinion, the editors have done an excellent job when selecting the contents of the handbook and putting the different chapters together. For instance, this can be appreciated by the fact that, despite the large number of authors and contributions, all chapters have kept the same notation. Furthermore, in addition to a sound description of the underlying theory and methods, several chapters include information about how to fit the presented models using the R programming language. However, I missed pointers to repositories to download the code and datasets for some of the examples used in the book. To sum up, this is an excellent reference book on mixture models.” Virgilio Gómez-Rubio, JRSS A, 2021

visual effects

Posted in Books, pictures, Statistics on November 2, 2018 by xi'an

As advertised and re-discussed by Dan Simpson on the Statistical Modeling, &tc. blog he shares with Andrew and a few others, the paper Visualization in Bayesian workflow he wrote with Jonah Gabry, Aki Vehtari, Michael Betancourt and Andrew Gelman was one of three discussed at the RSS conference in Cardiff, last month, as a Read Paper for Series A. I had stored the paper when it came out towards reading and discussing it, but, as often, this good intention led to no concrete ending. [Except concrete as in concrete shoes…] Hence a few notes rather than a discussion in Series A.

Exploratory data analysis goes beyond just plotting the data, which should sound reasonable to all modeling readers.

Fake data [not fake news!] can be almost [more!] as valuable as real data for building your model, oh yes!, this is the message I am always trying to convey to my first-year students, when arguing about the connection between models and simulation, as well as a defense of ABC methods. And more globally of the very idea of statistical modelling. While indeed “Bayesian models with proper priors are generative models”, I am not a particular fan of using the prior predictive [or the evidence] to assess the prior as it may end up in a classification of more or less all but terrible priors, meaning that all give very little weight to neighbourhoods of high likelihood values. Still, in a discussion of a TAS paper by Seaman et al. on the role of prior, Kaniav Kamary and I produced prior assessments that were similar to the comparison illustrated in Figure 4. (And this makes me wonder which point we missed in this discussion, according to Dan.) Unhappy am I with the weakly informative prior illustration (and concept) as the amount of fudging and calibrating to move from the immensely vague choice of N(0,100) to the fairly tight choice of N(0,1) or N(1,1) is not provided. The paper reads like these priors were the obvious and first choice of the authors. I completely agree with the warning that “the utility of the prior predictive distribution to evaluate the model does not extend to utility in selecting between models”.
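To make the N(0,100) versus N(0,1) contrast concrete, here is a minimal prior predictive sketch. It is not the model of the paper: the linear regression, the design points, the unit noise scale, and the reading of the second argument as a standard deviation are all assumptions made for illustration, and the function names are hypothetical.

```python
import random
import statistics


def prior_predictive(prior_sd, x_values, n_draws=2000, noise_sd=1.0, seed=0):
    """Simulate fake datasets from y = a + b*x + eps, with a, b ~ N(0, prior_sd).

    Assumption: prior_sd is a standard deviation, not a variance.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        a = rng.gauss(0.0, prior_sd)  # intercept drawn from the prior
        b = rng.gauss(0.0, prior_sd)  # slope drawn from the prior
        draws.append([a + b * x + rng.gauss(0.0, noise_sd) for x in x_values])
    return draws


def spread(draws):
    """Standard deviation of all simulated observations, pooled together."""
    return statistics.pstdev([y for row in draws for y in row])


# Hypothetical design points, chosen only for illustration.
x_values = [0.0, 0.5, 1.0, 1.5, 2.0]

vague = prior_predictive(100.0, x_values)  # the "immensely vague" choice
tight = prior_predictive(1.0, x_values)    # the "fairly tight" choice

print("pooled sd under N(0,100):", round(spread(vague), 1))
print("pooled sd under N(0,1):  ", round(spread(tight), 1))
```

Comparing the two pooled spreads with the plausible range of the actual measurements is one way to see how much of the prior predictive mass falls on impossible data. It says nothing about how the tighter scale was calibrated, which is the complaint above.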

MCMC diagnostics, beyond trace plots, yes again, but this recommendation sounds a wee bit outdated. (As our 1998 reviewww!) Figure 5(b) links different parameters of the model with lines, which does not clearly relate to a better understanding of convergence. Figure 5(a) does not tell much either since the green (divergent) dots stand within the black dots, at least in the projected 2D plot (and how can one reach beyond 2D?) Feels like I need to rtfm..!

“Posterior predictive checks are vital for model evaluation”, and indeed I find Figure 6 much more to my liking and closer to my practice. There could have been a reference to Ratmann et al. for ABC where graphical measures of discrepancy were used in conjunction with ABC output as direct tools for model assessment and comparison. Essentially predicting a zero error with the ABC posterior predictive. And of course “posterior predictive checking makes use of the data twice, once for the fitting and once for the checking.” Which means one should either resort to loo solutions (as mentioned in the paper) or call for calibration of the double-use by re-simulating pseudo-datasets from the posterior predictive. I find the suggestion that “it is a good idea to choose statistics that are orthogonal to the model parameters” somewhat antiquated, in that this sounds like rephrasing the primeval call to ancillary statistics for model assessment (Kiefer, 1975), while pretty hard to implement in modern complex models.
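As a toy version of such a check, the sketch below computes a posterior predictive p-value in a conjugate Normal model. Everything in it is an assumption made for illustration rather than anything taken from the paper: the model y_i ~ N(mu, 1) with mu ~ N(0, 10^2), the two small datasets, the choice of the maximum as test statistic, and the function names.

```python
import random


def posterior_mu(y, prior_sd=10.0, noise_sd=1.0):
    """Conjugate posterior mean and sd of mu for y_i ~ N(mu, noise_sd), mu ~ N(0, prior_sd)."""
    precision = 1.0 / prior_sd**2 + len(y) / noise_sd**2
    mean = (sum(y) / noise_sd**2) / precision
    return mean, precision ** -0.5


def ppc_pvalue(y, stat=max, n_rep=4000, noise_sd=1.0, seed=1):
    """Share of replicated datasets whose statistic is at least the observed one.

    The data are used twice here: once to fit the posterior, once for the check.
    """
    rng = random.Random(seed)
    post_mean, post_sd = posterior_mu(y, noise_sd=noise_sd)
    observed = stat(y)
    exceed = 0
    for _ in range(n_rep):
        mu = rng.gauss(post_mean, post_sd)              # draw mu from the posterior
        y_rep = [rng.gauss(mu, noise_sd) for _ in y]    # replicate a dataset of the same size
        exceed += stat(y_rep) >= observed
    return exceed / n_rep


# Hypothetical datasets: the second one replaces the last point with an outlier.
y_ok = [0.2, -0.5, 1.1, 0.3, -0.9, 0.4, 0.0, 0.6]
y_outlier = [0.2, -0.5, 1.1, 0.3, -0.9, 0.4, 0.0, 6.0]

print("p-value, well-behaved data:", ppc_pvalue(y_ok))
print("p-value, data with outlier:", ppc_pvalue(y_outlier))
```

The maximum is used instead of the sample mean because the mean is the sufficient statistic that the posterior has already matched, so a check based on it would flag almost nothing. The double use of the data is left uncorrected in this sketch; a leave-one-out variant or a recalibration over pseudo-datasets would be needed to address it.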

beyond objectivity, subjectivity, and other ‘bjectivities

Posted in Statistics on April 12, 2017 by xi'an

Here is my discussion of Gelman and Hennig at the Royal Statistical Society, which I am about to deliver!

objective and subjective RSS Read Paper next week

Posted in Books, pictures, Statistics, Travel, University life, Wines on April 5, 2017 by xi'an

Andrew Gelman and Christian Hennig will give a Read Paper presentation next Wednesday, April 12, 5pm, at the Royal Statistical Society, London, on their paper “Beyond subjective and objective in statistics”. Which I hope to attend, or else to write a discussion. Since the discussion (to be published in Series A) is open to everyone, I strongly encourage ‘Og’s readers to take a look at the paper and the “radical” views therein to hopefully contribute to this discussion. Either as a written discussion or as comments on this very post.