## Archive for Bayesian statistics

## contemporary issues in hypothesis testing

Posted in pictures, Statistics, Travel, University life with tags Andrew Gelman, Bayes factors, Bayesian foundations, Bayesian statistics, Coventry, CRiSM, England, Fall, hypothesis testing, Jim Berger, Joris Mulder, statistical tests, University of Warwick, workshop on May 3, 2016 by xi'an

**N**ext Fall, on 15-16 September, I will take part in a CRiSM workshop on hypothesis testing, held in our department in Warwick. The registration is now open [until Sept 2] with a moderate registration fee of £40 and a call for posters. Jim Berger and Joris Mulder will both deliver a plenary talk there, while Andrew Gelman will alas give a remote talk from New York. (A terrific poster by the way!)

## back from CIRM

Posted in Kids, Mountains, pictures, Running, Statistics, Travel, University life with tags Bayesian statistics, boar, calanques, Cassis, CIRM, Luminy, Marseille, Mont Puget, SMF, summer school, Université Aix Marseille, wikipedia on March 20, 2016 by xi'an

**A**s should be clear from earlier posts, I tremendously enjoyed this past week at CIRM, Marseille, and not only for providing a handy retreat from where I could go running and climbing at least twice a day! The programme (with slides and films soon to be available on the CIRM website) was very well-designed with mini-courses and talks of appropriate length and frequency. Thanks to Nicolas Chopin (ENSAE ParisTech) and Gilles Celeux (Inria Paris) for putting this programme together so efficiently and to the local organisers Thibaut Le Gouic (Ecole Centrale de Marseille), Denys Pommeret (Aix-Marseille Université), and Thomas Willer (Aix-Marseille Université) for handling the practical side of inviting and accommodating close to a hundred participants on this rather secluded campus. I hope we can reproduce the experiment a few years from now. Maybe in 2018 if we manage to squeeze it between BayesComp 2018 [ex-MCMski] and ISBA 2018 in Edinburgh.

One of the bonuses of staying at CIRM is indeed that it is fairly isolated and far from the fury of down-town Marseille, which may sound like a drag, but actually helps with concentration and interactions. Actually, the whole Aix-Marseille University campus of Luminy on which CIRM is located is surprisingly quiet: we were there in the very middle of the teaching semester and saw very few students around (although even fewer boars!). It is a bit of a mystery that a campus built in such a beautiful location with the Mont Puget as its background and the song of cicadas as the only source of “noise” is not better exploited towards attracting more researchers and students. However, remoteness and the lack of efficient public transportation may explain a lot about this low occupation of the campus. As may the poor quality of most buildings on the campus, which must be unbearable during the summer months…

In planning a future Bayesian week at CIRM, I think we could have some sort of poster sessions after dinner (with maybe a cash bar operated by some of the invited students since there is no bar at CIRM or around). Or trail-running under moonlight, trying to avoid tripping over rummaging boars… A sort of Kaggle challenge would be nice but presumably too hard to organise. As a simpler joint activity, we could collectively contribute to some Wikipedia pages related to Bayesian and computational statistics.

## at CIRM [#3]

Posted in Kids, Mountains, pictures, Running, Statistics, Travel, University life with tags ABC, ABC-SMC, Bayesian statistics, CIRM, component of a mixture, cross validated, expectation-propagation, high dimensions, identifiability, Luminy, Marseille, MCMC, Mont Puget, Monte Carlo Statistical Methods, particle filter, particle Gibbs sampler, summer school on March 4, 2016 by xi'an

**S**imon Barthelmé gave his mini-course on EP, with loads of details on the implementation of the method. Focussing on the EP-ABC and MCMC-EP versions today. Leaving open the difficulty of assessing to which limit EP is converging. But mentioning the potential for asynchronous EP (on which I would like to hear more). Ironically using several times a logistic regression example, if not on the Pima Indians benchmark! He also talked about approximate EP solutions that relate to consensus MCMC. With a connection to Mark Beaumont’s talk at NIPS [at the same time as mine!] on the comparison with ABC. While we saw several talks on EP during this week, I am still agnostic about the potential of the approach. It certainly produces a fast proxy to the true posterior and hence can be exploited *ad nauseam* in inference methods based on pseudo-models like indirect inference. In conjunction with other quick and dirty approximations when available. As in ABC, it would be most useful to know how far from the (ideal) posterior distribution the approximation stands. Machine learning approaches presumably allow for an evaluation of the predictive performances, but less so for the modelling accuracy, even with new sampling steps. [But I know nothing, I know!]

Dennis Prangle presented some on-going research on high-dimension [data] ABC. Raising the question of what is the true meaning of dimension in ABC algorithms. Or of sample size. Because the inference relies on the event d(s(y),s(y’))≤ξ or on the likelihood l(θ|x). Both one-dimensional. Mentioning Iain Murray’s talk at NIPS [that I also missed]. Re-expressing as well the perspective that ABC can be seen as a missing or estimated normalising constant problem, as in Bornn et al. (2015), which I discussed earlier. The central idea is to use SMC to simulate a particle cloud evolving as the target tolerance ξ decreases. Which supposes a latent variable structure lurking in the background.
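To make the role of the one-dimensional event d(s(y),s(y’))≤ξ concrete, here is a minimal sketch of plain rejection ABC on a toy model. This is not the SMC scheme from the talk. The Normal model, the N(0,10²) prior, the sample mean as summary statistic and the function name `abc_rejection` are all assumptions made for illustration only.

```python
import numpy as np

def abc_rejection(y_obs, tolerance, n_draws=20000, seed=0):
    """Toy rejection ABC for the mean theta of a N(theta, 1) sample (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = len(y_obs)
    s_obs = y_obs.mean()                       # summary statistic s(y)
    theta = rng.normal(0.0, 10.0, n_draws)     # draws from an assumed N(0, 10^2) prior
    # simulate pseudo-data y' for every theta and reduce each to its summary s(y')
    s_sim = rng.normal(theta[:, None], 1.0, (n_draws, n)).mean(axis=1)
    keep = np.abs(s_sim - s_obs) <= tolerance  # the event d(s(y), s(y')) <= xi
    return theta[keep]

# hypothetical observed data, simulated with true mean 2
y_obs = np.random.default_rng(1).normal(2.0, 1.0, 50)
for xi in (2.0, 0.5, 0.1):
    accepted = abc_rejection(y_obs, xi)
    print(f"xi={xi}: kept {accepted.size} draws, sd of accepted theta = {accepted.std():.3f}")
```

Whatever the size of y, acceptance only depends on a scalar distance, and shrinking ξ trades the number of accepted draws against closeness to the posterior. An SMC version would reuse the accepted particles from one tolerance to seed the next rather than restart from the prior each time.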

Judith Rousseau gave her talk on non-parametric mixtures and the possibility to learn parametrically about the component weights. Starting with a rather “magic” result by Allman et al. (2009) that, with three repeated observations per individual, all terms in a mixture are identifiable. Maybe related to that simpler fact that mixtures of Bernoullis are not identifiable while mixtures of Binomials are identifiable, even when n=2. As “shown” in this plot made for X validated. Actually truly related because Allman et al. (2009) prove identifiability through a finite dimensional model. (I am surprised I missed this most interesting paper!) With the side condition that a mixture of r components, each made of a product of p Bernoullis, is identifiable when p ≥ 2⌈log₂ r⌉ + 1, where log₂ is the base-2 logarithm and ⌈x⌉ the upper rounding. I also find most relevant this distinction between the weights and the remainder of the mixture as weights behave quite differently, hardly parameters in a sense.
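The Bernoulli half of that remark is easy to check numerically: a mixture of Bernoullis is itself a single Bernoulli, so distinct weights and success probabilities can produce exactly the same distribution. A small sketch follows, where the helper name and the parameter values are made up for illustration.

```python
import numpy as np

def bernoulli_mixture_pmf(weights, probs):
    """pmf over {0, 1} of a finite mixture of Bernoulli(p_k) components."""
    weights, probs = np.asarray(weights), np.asarray(probs)
    p_one = float(weights @ probs)   # P(X = 1) = sum_k w_k p_k
    return np.array([1.0 - p_one, p_one])

# two different (hypothetical) parameterisations of a two-component mixture
pmf_a = bernoulli_mixture_pmf([0.5, 0.5], [0.2, 0.6])
pmf_b = bernoulli_mixture_pmf([0.25, 0.75], [0.1, 0.5])
print(pmf_a, pmf_b, np.allclose(pmf_a, pmf_b))
```

Both parameter sets give P(X=1)=0.4, so no amount of single-observation data can tell them apart. Repeated observations per individual are what add the extra equations needed to separate the components.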

## Puget sound [and sights]

Posted in Mountains, pictures, Running, Travel, University life with tags bastide, Bayesian statistics, calanques, CIRM, Luminy, Mont Puget, SMF, view on February 29, 2016 by xi'an

## never mind the big data here’s the big models [workshop]

Posted in Kids, pictures, Statistics, Travel, University life with tags approximate likelihood, Bayesian model comparison, Bayesian statistics, big data, big models, GAMs, gaussian process, latent Gaussian models, likelihood function, misspecified model, model criticism, modelling, point processes, Sex Pistols, spatial statistics, University of Warwick on December 22, 2015 by xi'an

**M**aybe the last occurrence this year of the pastiche of the iconic LP of the Sex Pistols, made by Tamara Polajnar! The last workshop as well of the big data year in Warwick, organised by the Warwick Data Science Institute. I appreciated the different talks this afternoon, but enjoyed particularly Dan Simpson’s and Rob Scheichl’s. The presentation by Dan was so hilarious that I could not resist asking him for permission to post the slides here:

Not only hilarious [and I have certainly missed 67% of the jokes], but quite deep about the meaning(s) of modelling and his views about getting around the most blatant issues. Rob presented a more computational talk on the ways to reach petaflops on current supercomputers, in connection with weather prediction models used (or soon to be used) by the Met Office, for a prediction area of 1 km². Along with significant improvements resulting from multiscale Monte Carlo and quasi-Monte Carlo. Definitely impressive! And a brilliant conclusion to the Year of Big Data (and big models).

## Conditional love [guest post]

Posted in Books, Kids, Statistics, University life with tags Andrei Kolmogorov, axioms of probability, Bayes rule, Bayesian nonparametrics, Bayesian statistics, bootstrap, Bruno de Finetti, Céline Dion, David Draper, Dirichlet process, Edwin Jaynes, exchangeability, extendibility, information, JSM 2015, MCMC, plausibility, Richard Cox, Series B, Stone-Weierstrass, Theory of Probability on August 4, 2015 by xi'an

*[When Dan Simpson told me he was reading Terenin’s and Draper’s latest arXival in a nice Bath pub (and not a nice bath tub!), I asked him for a blog entry and he agreed. Here is his piece, read at your own risk! If you remember to skip the part about Céline Dion, you should enjoy it very much!!!]*

**P**robability has traditionally been described, as per Kolmogorov and his ardent follower Katy Perry, unconditionally. This is, of course, excellent for those of us who really like measure theory, as the maths is identical. Unfortunately mathematical convenience is not necessarily enough and a large part of the applied statistical community is working with Bayesian methods. These are unavoidably conditional and, as such, it is natural to ask if there is a fundamentally conditional basis for probability.

Bruno de Finetti, and later Richard Cox and Edwin Jaynes, considered conditional bases for Bayesian probability that are, unfortunately, incomplete. The critical problem is that they mainly consider finite state spaces and construct finitely additive systems of conditional probability. For a variety of reasons, neither of these restrictions holds much truck in the modern world of statistics.

In a recently arXiv’d paper, Alexander Terenin and David Draper devise a set of axioms that make the Cox-Jaynes system of conditional probability rigorous. Furthermore, they show that the complete set of Kolmogorov axioms (including countable additivity) can be derived as theorems from their axioms by conditioning on the entire sample space.

This is a deep and fundamental paper, which unfortunately means that I most probably do not grasp its complexities (especially as, for some reason, I keep reading it in pubs!). However, I’m going to have a shot at having some thoughts on it, because I feel like it’s the sort of paper one should have thoughts on.