## Terry Tao on Bayes… and Trump

Posted in Books, Kids, Statistics, University life on June 13, 2016 by xi'an

“From the perspective of Bayesian probability, the grade given to a student can then be viewed as a measurement (in logarithmic scale) of how much the posterior probability that the student’s model was correct has improved over the prior probability.” T. Tao, what’s new, June 1

An interesting and more Bayesian last question from Terry Tao is about what to do when the probabilities themselves are uncertain. More Bayesian because this is where I would introduce a prior model on this uncertainty, in a hierarchical fashion, in order to estimate the true probabilities. (A non-informative prior makes its way into the comments.) Of course, all this leads to a lot of work given the first incentive of asking multiple choice questions…

One may wonder at the link with scary Donald and there is none! But the next post by Terry Tao is entitled “It ought to be common knowledge that Donald Trump is not fit for the presidency of the United States of America”. And unsurprisingly, as an opinion post, it attracted a large number of non-mathematical comments.

## Assessing models when all models are false

Posted in Statistics, University life on November 11, 2010 by xi'an

When I arrived home from Philadelphia, I got the news that John Geweke was giving a seminar at CREST in the early afternoon and thus biked there to attend his talk. The topic was about comparing asset return models, but what interested me most was the notion of comparing models without reference to a true model, a difficulty I have been juggling with for quite a while (at least since the goodness-of-fit paper with Judith Rousseau we never resubmitted!) and for which I still have not found a satisfactory (Bayesian) solution.

Because there is no true model, Durham and Geweke use the daily (possibly Bayesian) predictive

$p(y_t|Y^o_{t-1},X^o_{t-1})$

as their basis for model assessment and rely on a log scoring rule

$\sum_{s=1}^{t-1} \log p_s(y^o_s|Y^o_{s-1},X^o_{s-1})$

to compare models. (The ‘o’ in the superscript denotes the observed values.) As reported in the paper, this is a proper (or honest) scoring rule. If n models are in competition, a weighted (model) predictive average

$\sum_{i=1}^n \,\omega_{s-1;i} p^i_s(y_s|Y^o_{s-1},X^o_{s-1})$

can be considered and the paper examines the impact of picking the optimal weight vector $(\omega_{t-1;1},\ldots,\omega_{t-1;n})$ under the log scoring rule, i.e.

$\arg\max_{\mathbf{\omega}_{t-1}} \sum_{s=1}^{t-1} \log \left\{ \sum_{i=1}^n \,\omega_{t-1;i}\, p^i_s(y^o_s|Y^o_{s-1},X^o_{s-1}) \right\}$

The weight vector at time t-1 is therefore optimising the backward sequence of predictions of the observed values up to time t-1. The interesting empirical result from this study is that, even from a Bayesian perspective, the weights never degenerate, unless one of the models is correct (which is rarely the case!). Thus, even after a very long series of observations, the weights of the different models remain away from zero (while the Bayesian posterior probability of a single model goes to one). Even though I am not yet at the point of adopting this solution (in particular because it seems to be using the data twice, once through the posterior/predictive and once through the score), I find the approach quite intriguing and hope I can study it further. Maybe a comparison with a Bayesian non-parametric evaluation would make sense…
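To make the contrast between pool weights and posterior model probabilities concrete, here is a minimal sketch, and in no way the paper's own implementation. Everything in it is an assumption made for illustration: the data are simulated from a N(0,4) distribution, the two competing "models" are fixed N(0,1) and N(0,9) predictive densities (so neither is true), the posterior probabilities use equal prior weights, and the weights are found by a simple EM-style fixed-point update that I picked for convenience. The function names `optimal_pool_weights` and `posterior_model_probs` are hypothetical.

```python
import math
import random

def normal_pdf(y, sd):
    """Density of a N(0, sd^2) distribution at y."""
    return math.exp(-0.5 * (y / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def pooled_log_score(weights, dens):
    """Log score of the pool: sum_s log sum_i w_i p^i_s, with dens[i][s] = p^i_s(y_s^o)."""
    return sum(
        math.log(sum(w * d[s] for w, d in zip(weights, dens)))
        for s in range(len(dens[0]))
    )

def optimal_pool_weights(dens, n_iter=200):
    """Maximise the pooled log score over the simplex by EM-style updates.

    The objective is concave in the weights and each update cannot decrease it
    (this is the usual EM step for mixture weights with fixed components).
    """
    n_models, n_obs = len(dens), len(dens[0])
    w = [1.0 / n_models] * n_models  # start from equal weights
    for _ in range(n_iter):
        new_w = [0.0] * n_models
        for s in range(n_obs):
            mix = sum(w[i] * dens[i][s] for i in range(n_models))
            for i in range(n_models):
                new_w[i] += w[i] * dens[i][s] / mix
        w = [x / n_obs for x in new_w]
    return w

def posterior_model_probs(log_scores):
    """Posterior model probabilities under equal prior weights (stable softmax).

    For models without free parameters, the log marginal likelihood is the
    sum of the log one-step-ahead predictives, i.e. the log score.
    """
    top = max(log_scores)
    unnorm = [math.exp(v - top) for v in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Simulated "observed" series: the truth N(0, 2^2) is neither of the two models.
random.seed(0)
y_obs = [random.gauss(0.0, 2.0) for _ in range(1000)]
dens = [
    [normal_pdf(y, 1.0) for y in y_obs],  # model 1: too narrow
    [normal_pdf(y, 3.0) for y in y_obs],  # model 2: too wide
]

pool_w = optimal_pool_weights(dens)
scores = [sum(math.log(p) for p in d) for d in dens]
post = posterior_model_probs(scores)

print("optimal pool weights:       ", pool_w)
print("posterior model probabilities:", post)
print("log scores (model 1, model 2, pool):",
      scores[0], scores[1], pooled_log_score(pool_w, dens))
```

In this toy setting the pool keeps a non-negligible weight on both models while the posterior probability piles up on the less-bad one, which is the phenomenon described above. Note that the sketch optimises the weights once over the whole series rather than recursively at each time t-1.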