Great news! The RSS is setting a data analysis challenge this year, sponsored by the Young Statisticians Section and Research Section of the Royal Statistical Society. Details are available on the WordPress website of the Challenge. Registration is open and the Challenge goes live on Tuesday 6 May 2014 for an exciting six-week competition. (A wee bit of unfortunate timing for those of us considering submitting a paper to NIPS!) Truly terrific: I have been looking for this kind of event to happen for many years (without finding the momentum to set it rolling…) and hope it will generate a lot of exciting activity and replicas in other societies.
Archive for Royal Statistical Society
The Journal of the Royal Statistical Society, Series B, has a new cover, a new colour and a new co-editor. As can be seen from the above shots, the colour is now a greenish ochre, with a picture of pedestrians on a brick plaza as a background, not much related to statistical methodology as far as I can tell. More importantly, the new co-editor for the coming four years is Piotr Fryzlewicz, professor at the London School of Economics, who will share the burden with Ingrid van Keilegom, professor at UCL (Louvain-la-Neuve), who is now starting her third year… My friend, colleague and successor as Series B editor, Gareth Roberts, is now retiring after four years of hard work towards making Series B one of the top journals in Statistics. Thanks Gareth and best wishes to Ingrid and Piotr!
Here is a quote from Mervyn Stone’s discussion of the DIC paper in Series B
“The paper is rather economical with the ‘truth’. The truth of pt(Y) corresponds fixedly to the conditions of the experimental or observational set-up that ensures independent future replication Yrep or internal independence of y = (y1,…,yn) (not excluding an implicit concomitant x). For pt(Y) ≈ p(Y|θt), θt must parameterize a scientifically plausible family of alternative distributions of Y under those conditions and is therefore a necessary ‘focus’ if the ‘good [true] model’ idea is to be invoked: think of tossing a bent coin. Changing focus is not an option.”
that I found most amusing (and relevant)! Elías Moreno and I wrote our discussions from Newcastle-upon-Tyne for Series B (and arXived them as well, with a wee bit of confusion when I listed the affiliations: I am not [yet] associated with la Universidad de Las Palmas de Gran Canaria..!).
Here is the reply by Chris and Steve about my comments from yesterday:
Thanks to Christian for the comments and feedback on our paper “A General Framework for Updating Belief Distributions”. We agree with Christian that starting with a summary statistic, or statistics, is an anchor for inference or learning, providing direction and guidance for models, avoiding the alternative vague notion of attempting to model a complete data set. The latter idea has dominated Bayesian methodology for decades, but with the advent of large and complex data sets, this is becoming increasingly challenging, if not impossible.
However, in order to work with statistics of interest, we need a framework in which this direct approach can be supported by a learning strategy when the formal use of Bayes’ theorem is not applicable. We achieve this in the paper for a general class of loss functions, which connect observations with a target of interest. A point raised by Christian is how arbitrary these loss functions are. We do not see this at all; for if a target has been properly identified, then the most primitive construct between observations informing about a target and the target would come in the form of a loss function. One should always be able to assess the loss of ascertaining a value of $\theta$ as an action and providing the loss in the presence of observation $x$. The question to be discussed is whether loss functions are objective, as in the case of the median loss,
$l(\theta,x)=|\theta-x|$,
or subjective, as in the case of the choice between loss functions for estimating the location of a distribution: mean, median or mode? Our work is situated in the former position.
Previous work on loss functions, mostly in the classical literature, has devoted a lot of space to working out optimal loss functions for targets of interest. We are not really dealing with novel targets and so we can draw on the classical literature here. The work can be thought of as the Bayesian version of the M-estimator and associated ideas. In this respect we are dealing with two loss functions for updating belief distributions: one for the data, which we have just discussed, and one for the prior information, which, due to coherence principles, must be the Kullback-Leibler divergence. This raises the thorny issue of how to calibrate the two loss functions. We discuss this in the paper.
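As a rough numerical illustration of updating a belief distribution through a loss function rather than a likelihood, here is a minimal sketch. It is not code from the paper. It assumes an update of the exponentiated-loss form, posterior ∝ exp{−w Σ l(θ, x_i)} × prior, with the absolute (median) loss. The function name `loss_based_posterior`, the grid approximation, the N(0, 10²) prior, the simulated data and the weight w = 1 are all illustrative assumptions.

```python
import numpy as np

def loss_based_posterior(x, grid, log_prior, w=1.0):
    """Grid approximation to a belief distribution proportional to
    exp(-w * sum_i |theta - x_i|) * prior(theta)."""
    # Cumulative absolute loss of each candidate theta over all observations.
    total_loss = np.abs(grid[:, None] - x[None, :]).sum(axis=1)
    log_post = -w * total_loss + log_prior(grid)
    log_post -= log_post.max()        # stabilise before exponentiating
    weights = np.exp(log_post)
    return weights / weights.sum()    # probabilities on the grid points

# Hypothetical data: heavy-tailed noise around a median of 2.
rng = np.random.default_rng(0)
x = rng.standard_t(df=3, size=200) + 2.0

grid = np.linspace(-5.0, 10.0, 3001)

def log_prior(theta):
    # Assumed N(0, 10^2) prior on the target, up to an additive constant.
    return -0.5 * (theta / 10.0) ** 2

post = loss_based_posterior(x, grid, log_prior)
post_mode = grid[np.argmax(post)]
print("posterior mode:", post_mode, "sample median:", np.median(x))
```

No sampling model for the whole of x is specified anywhere: the only inputs are the loss targeting the median, the prior on the median, and the weight w.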
To then deal with the statistic problem, mentioned at the start of this discussion, we have found a nice way to proceed by using the loss function . How this loss function, combined with the use of the exponential family, can be used to estimate functionals of the type
is provided in the Walker talk at Bayes 250 in London, titled “The Misspecified Bayesian”, since the “model” is designed to be misspecified, a tool to estimate and learn about I only. The basic idea is to evaluate I by ensuring that we learn about the for which
This is the story of the background. We would now like to pick up in more detail on three important points that you raise in your post:
- The arbitrariness in selecting the loss function.
- The relative weighting of loss-to-data vs. loss-to-prior.
- The selection of the loss in the M-free case.
In the absence of complete knowledge of the data generating mechanism, i.e. outside of M-closed,
- The arbitrariness in selecting the loss function. We believe the statistician should weigh up the relative arbitrariness in selecting a loss function targeting the statistic of interest versus the arbitrariness of selecting a misspecified model, known not to be true, for the complete data-generating mechanism. There is a wealth of literature on how to select optimal loss functions that target specific statistics, e.g. Huber (2009) provides a comprehensive overview of how this should be done. As far as we are aware, there are no formal procedures (that do not rely on loss functions) to select a false sampling distribution for the whole of x; see Key, Pericchi and Smith (1999).
- The relative weighting of loss-to-data vs. loss-to-prior. This is an interesting open problem. Our framework shows that, in the absence of M-closed or the use of self-information loss, the analyst must select this weighting. In our paper we suggest some default procedures. We have nowhere claimed these were “correct”. You raise concerns regarding parameterisation and we agree with you that care is needed, but many of these issues equally hold for existing “Objective” or “Default” Bayes procedures, such as unit-information priors.
- The selection of the loss in M-free. You say “…there is no optimal choice for the substitute to the loss function…”. We disagree. Our approach is to select an established loss function that directly targets the statistic of interest, and elicit prior beliefs directly on the unknown value of this statistic. There is no notion here of a pseudo-likelihood or of where this converges to.
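The role of the weighting can be illustrated with a small numerical sketch. It is not taken from the paper. It assumes an update of the form posterior ∝ exp{−w × cumulative loss} × prior, a normal model with known unit variance, and an N(0, 2²) prior. All names (`weighted_update`, `mean_sd`, `self_info`) and the weight 0.2 are hypothetical choices. With the self-information loss (negative log-likelihood) and w = 1, the update should coincide with the ordinary conjugate Bayes posterior. A smaller w gives the data less say relative to the prior.

```python
import numpy as np

def weighted_update(x, grid, loss, log_prior, w):
    """Grid approximation to exp(-w * sum_i loss(theta, x_i)) * prior(theta)."""
    total_loss = loss(grid[:, None], x[None, :]).sum(axis=1)
    log_post = -w * total_loss + log_prior(grid)
    log_post -= log_post.max()        # stabilise before exponentiating
    p = np.exp(log_post)
    return p / p.sum()

def mean_sd(grid, p):
    """Mean and standard deviation of a discrete distribution on the grid."""
    m = (grid * p).sum()
    return m, np.sqrt(((grid - m) ** 2 * p).sum())

rng = np.random.default_rng(1)
x = rng.normal(1.0, 1.0, size=50)     # hypothetical data
grid = np.linspace(-4.0, 6.0, 4001)

def log_prior(theta):
    # Assumed N(0, 2^2) prior, up to an additive constant.
    return -0.5 * theta ** 2 / 4.0

def self_info(theta, xi):
    # Negative log density of N(xi | theta, 1), up to an additive constant.
    return 0.5 * (theta - xi) ** 2

# Self-information loss with w = 1: compare with the conjugate normal posterior.
m1, s1 = mean_sd(grid, weighted_update(x, grid, self_info, log_prior, w=1.0))
conj_var = 1.0 / (len(x) + 1.0 / 4.0)
conj_mean = conj_var * x.sum()

# Down-weighting the loss-to-data term widens the resulting distribution.
m2, s2 = mean_sd(grid, weighted_update(x, grid, self_info, log_prior, w=0.2))

print("w=1.0:", m1, s1, "conjugate:", conj_mean, np.sqrt(conj_var))
print("w=0.2:", m2, s2)
```

Outside this special case there is no likelihood to pin the weight down, which is why the choice falls on the analyst.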
Thank you again to Christian for his critical observations!
Just a reminder that Bayes 250 at the RSS is taking place in less than three weeks and that it would be a good idea to register now (using a form and not an on-line page, unfortunately)! Here is the official program.
11:00 Registration and tea
11:35 Anthony O’Hagan (Warwickshire) and Dennis Lindley (Somerset) – video recorded interview
12:15 Gareth Roberts (University of Warwick) “Bayes for differential equation models”
12:45–14:00 Lunch and posters
14:00 Sylvia Richardson (MRC Biostatistics Unit) “Biostatistics and Bayes”
14:30 Dennis Prangle (Lancaster University) “Approximate Bayesian Computation”
14:50 Phil Dawid (University of Cambridge) “Putting Bayes to the Test”
16:00 Mike Jordan (UC Berkeley) “Feature Allocations, Probability Functions, and Paintboxes”
16:30 Iain Murray (University of Edinburgh) “Flexible models for density estimation”
16:50 YeeWhye Teh (University of Oxford) “MCMC for Markov and semi-Markov jump processes”
17:20 posters and drinks
09:30 Michael Goldstein (Durham University) “Geometric Bayes”
10:00 Andrew Golightly (Newcastle University) “Auxiliary particle MCMC schemes for partially observed diffusion processes”
10:20 Nicky Best (Imperial College London) “Bayesian space-time models for environmental epidemiology”
11:15 Christophe Andrieu (University of Bristol) “Inference with noisy likelihoods”
11:45 Chris Yau (Imperial College London) “Understanding cancer through Bayesian approaches”
12:05 Stephen Walker (University of Kent) “The Misspecified Bayesian”
13:30 Simon Wilson (Trinity College Dublin) “Linnaeus, Bayes and the number of species problem”
14:00 Ben Calderhead (UCL) “Probabilistic Integration for Differential Equation Models”
14:20 Peter Green (University of Bristol and UT Sydney) “Bayesian graphical model determination”
14:50 Closing Remarks Adrian Smith (University of London)