## Archive for statistical modelling

## a summer of British conferences!

Posted in pictures, Statistics, Travel, University life with tags BAYSM 2018, Britain, conference, Edinburgh, England, ISBA 2018, iwsm2018, statistical modelling, University of Bristol, Warwick on January 18, 2018 by xi'an

## model misspecification in ABC

Posted in Statistics with tags ABC, all models are wrong, Australia, likelihood-free methods, Melbourne, Mission Beach, model mispecification, Monash University, statistical modelling on August 21, 2017 by xi'an

**W**ith David Frazier and Judith Rousseau, we just arXived a paper studying the impact of a misspecified model on the outcome of an ABC run. This is a question that naturally arises when using ABC, but that has not been directly covered in the literature apart from a recently arXived paper by James Ridgway [that was commented on the ‘Og earlier this month]. On the one hand, ABC can be seen as a robust method in that it focuses on the aspects of the assumed model that are translated by the [insufficient] summary statistics and their expectation. And nothing else. It is thus tolerant of departures from the hypothetical model that [almost] preserve those moments. On the other hand, ABC involves a degree of non-parametric estimation of the intractable likelihood, which may sound even more robust, except that the likelihood is estimated from pseudo-data simulated from the “wrong” model in case of misspecification.

In the paper, we examine how the pseudo-true value of the parameter [that is, the value of the parameter of the misspecified model that comes closest to the generating model in terms of Kullback-Leibler divergence] is asymptotically reached by some ABC algorithms like the ABC accept/reject approach and not by others like the popular linear regression [post-simulation] adjustment, which surprisingly concentrates posterior mass on a completely different pseudo-true value. Exploiting our recent assessment of ABC convergence for well-specified models, we show the above convergence result for a tolerance sequence that decreases to the minimum possible distance [between the true expectation and the misspecified expectation] at a slow enough rate. Or such that the sequence of acceptance probabilities goes to zero at the proper speed. In the case of the regression correction, the pseudo-true value is shifted by a quantity that does not converge to zero, because of the misspecification in the expectation of the summary statistics. This is not immensely surprising, but we hence get a very different picture when compared with the well-specified case, where regression corrections bring improvement to the asymptotic behaviour of the ABC estimators. This discrepancy between two versions of ABC can be exploited to seek misspecification diagnoses, e.g. through the acceptance rate versus the tolerance level, or via a comparison of the ABC approximations to the posterior expectations of quantities of interest, which should diverge at rate √n. In both cases, ABC reference tables/learning bases can be exploited to draw and calibrate a comparison with the well-specified case.
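To make the accept/reject part of this discussion concrete, here is a minimal sketch of ABC under misspecification. It is a toy illustration built on my own assumptions, not the setting or code of the paper: the data come from N(1, 2²) while the assumed model is N(θ, 1), the summaries are the sample mean and variance, the prior on θ is N(0, 5²), and the tolerance is set as the 1% quantile of the simulated distances. The function name `abc_accept_reject` and all numerical settings are hypothetical.

```python
import numpy as np


def abc_accept_reject(observed, n_sims=20_000, quantile=0.01, seed=0):
    """Toy ABC accept/reject for an assumed N(theta, 1) model (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = observed.size
    # Summary statistics of the observed sample: mean and variance.
    s_obs = np.array([observed.mean(), observed.var(ddof=1)])
    # Assumed (hypothetical) prior on theta: N(0, 5^2).
    thetas = rng.normal(0.0, 5.0, size=n_sims)
    # Pseudo-data simulated from the assumed, possibly wrong, model N(theta, 1).
    pseudo = rng.normal(thetas[:, None], 1.0, size=(n_sims, n))
    s_sim = np.column_stack([pseudo.mean(axis=1), pseudo.var(axis=1, ddof=1)])
    # Euclidean distance between simulated and observed summaries.
    dist = np.linalg.norm(s_sim - s_obs, axis=1)
    # Tolerance chosen as a small quantile of the simulated distances.
    eps = np.quantile(dist, quantile)
    keep = dist <= eps
    return thetas[keep], eps


# Data generated with variance 4, whereas the assumed model fixes the variance at 1.
data_rng = np.random.default_rng(42)
observed = data_rng.normal(1.0, 2.0, size=100)

accepted, eps = abc_accept_reject(observed)
print("observed mean:", observed.mean())
print("ABC posterior mean of theta:", accepted.mean())
print("tolerance reached:", eps)
```

In this toy case the variance summary cannot be matched by any value of θ, so the tolerance stays bounded away from zero however many simulations are run, while the accepted values of θ still gather around the sample mean. Plotting the acceptance rate against the tolerance on such a reference table is one crude way to look at the diagnostic mentioned above.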

## beyond objectivity, subjectivity, and other ‘bjectivities

Posted in Statistics with tags Andrew Gelman, Christian Hennig, discussion paper, Errol Street, frequentist inference, London, objectivism, Read paper, Royal Statistical Society, RSS, Series A, statistical modelling, subjective versus objective Bayes, subjectivity on April 12, 2017 by xi'an

**H**ere is my discussion of Gelman and Hennig at the Royal Statistical Society, which I am about to deliver!

## Statistical rethinking [book review]

Posted in Books, Kids, R, Statistics, University life with tags Amazon, Bayes theorem, Bayesian data analysis, Bayesian Essentials with R, book review, CHANCE, code, convergence diagnostics, E.T. Jaynes, generalised linear models, golem, maths, matrix algebra, MCMC algorithms, mixtures of distributions, Monte Carlo Statistical Methods, Prague, R, robots, STAN, statistical modelling, Statistical rethinking on April 6, 2016 by xi'an

Statistical Rethinking: A Bayesian Course with Examples in R and Stan is a new book by Richard McElreath that CRC Press sent me for review in CHANCE. While the book was already discussed on Andrew’s blog three months ago, and [rightly so!] enthusiastically recommended by Rasmus Bååth on Amazon, here are the reasons why I am quite impressed by Statistical Rethinking!

“Make no mistake: you will wreck Prague eventually.” (p.10)

While the book has a lot in common with Bayesian Data Analysis, from being in the same CRC series to adopting a pragmatic and weakly informative approach to Bayesian analysis, to supporting the use of STAN, it also nicely develops its own ecosystem and idiosyncrasies, with a noticeable Jaynesian bent. To start with, I like the highly personal style with clear attempts to make the concepts memorable for students by resorting to external concepts. The best example is the call to the myth of the golem in the first chapter, which McElreath uses as a warning for the use of statistical models (which are almost anagrams of golems!). Golems and models [and robots, another concept invented in Prague!] are man-made devices that strive to accomplish the goal set to them without heeding the consequences of their actions. This first chapter of Statistical Rethinking sets the ground for the rest of the book and gets quite philosophical (albeit in a readable way!) as a result. In particular, there is a most coherent call against hypothesis testing, which by itself justifies the title of the book.

## interesting mis-quote

Posted in Books, pictures, Statistics, Travel, University life with tags Alan Turing, all models are wrong, artificial intelligence, George Box, misquote, Peter Norvig, statistical modelling, The End of Theory, Thomas Bayes on September 25, 2014 by xi'an

**A**t a recent conference on Big Data, one speaker mentioned this quote from Peter Norvig, the director of research at Google:

“All models are wrong, and increasingly you can succeed without them.”

A quote that I found rather shocking, especially when considering the amount of modelling behind Google tools. And coming from someone citing Kernel Methods for Pattern Analysis by Shawe-Taylor and Cristianini as one of his favourite books and Bayesian Data Analysis as another one… Or displaying Bayes [or his alleged portrait] and Turing on his book cover. So I went searching on the Web for more information about this surprising quote. And found the explanation, as given by Peter Norvig himself:

“To set the record straight: That’s a silly statement, I didn’t say it, and I disagree with it.”

Which means that weird quotes have a high probability of being misquotes. And used by others to (obviously) support their own agenda. In the current case, Chris Anderson and his End of Theory paradigm. Briefly and mildly discussed by Andrew a few years ago.