Bayesian inference with no likelihood

This week I made a quick trip to Warwick for the defence (or viva) of the PhD thesis of Jack Jewson, which contains novel perspectives on constructing Bayesian inference without a likelihood or without complete trust in said likelihood. The thesis aims at constructing minimum divergence posteriors in an M-open perspective and builds a rather coherent framework from principles to implementation. There is a clear link with the earlier work of Bissiri et al. (2016), with further consistency constraints requiring that the outcome recovers the true posterior in the M-closed scenario (although this is not always the case with the procedures proposed in the thesis).

Although I am partial to the use of empirical likelihoods in such settings, I appreciated the position of the thesis and the discussion of the various divergences towards the posterior derivation (already discussed on this blog), with interesting perspectives on the calibration of the pseudo-posterior à la Bissiri et al. (2016). Among other things, the thesis pointed out a departure from the likelihood principle and some of its most established consequences, like Bayesian additivity. In that regard, there were connections with generative adversarial networks (GANs) and their Bayesian versions that could have been explored. I also came away with the impression that the type of Bayesian robustness explored in the thesis has more to do with outliers than with misspecification. Epsilon-contamination models are quite specific as it happens, in terms of tails and other things.
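To make the pseudo-posterior idea concrete, here is a minimal sketch of the Bissiri et al. (2016) type of update, where the prior is multiplied by exp(-w × cumulative loss) and w is the calibration weight. Everything specific in it is an assumption made for illustration and not taken from the thesis: the Gaussian location model with unit variance, the N(0, 10²) prior, the made-up toy data with two gross outliers, w = 1, and the choice of a β-divergence (density power) loss with β = 0.5 as one example of a robust divergence.

```python
import math

def normal_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_loss(theta, x):
    """Negative log-likelihood: recovers the standard Bayes posterior when w = 1."""
    return -math.log(normal_pdf(x, theta))

def beta_loss(theta, x, beta=0.5):
    """Beta-divergence (density power) loss for the N(theta, 1) model.

    The second term is the integral of f^(1 + beta); it does not depend on
    theta for a location model and is kept only for completeness.
    """
    integral = (2.0 * math.pi) ** (-beta / 2.0) / math.sqrt(1.0 + beta)
    return -normal_pdf(x, theta) ** beta / beta + integral / (1.0 + beta)

def pseudo_posterior(data, loss, grid, w=1.0, prior_sd=10.0):
    """Normalised pseudo-posterior weights on a grid: prior * exp(-w * sum of losses)."""
    log_post = [
        math.log(normal_pdf(t, 0.0, prior_sd)) - w * sum(loss(t, x) for x in data)
        for t in grid
    ]
    top = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [wt / total for wt in weights]

def posterior_mean(grid, probs):
    return sum(t * p for t, p in zip(grid, probs))

# Hypothetical toy data: nine points scattered around 0 plus two gross outliers.
data = [-1.2, -0.7, -0.3, -0.1, 0.0, 0.2, 0.4, 0.8, 1.1, 8.0, 9.0]
grid = [-5.0 + 0.01 * i for i in range(1501)]  # theta from -5 to 10

standard_post = pseudo_posterior(data, log_loss, grid)
robust_post = pseudo_posterior(data, beta_loss, grid)
standard_mean = posterior_mean(grid, standard_post)
robust_mean = posterior_mean(grid, robust_post)

print("posterior mean under log-loss:", round(standard_mean, 3))
print("posterior mean under beta-loss:", round(robust_mean, 3))
```

On this toy example the log-loss posterior is dragged towards the two outliers while the β-loss pseudo-posterior stays with the bulk of the data, which is the outlier-driven notion of robustness mentioned above. The weight w is left at 1 here; how to calibrate it is precisely one of the questions the thesis discusses.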

The next chapter is somewhat “less” Bayesian in my view as it considers a generalised form of variational inference. I agree that the view of the posterior as the solution to an optimisation problem is tempting, but changing the objective function makes the notion less precise. This makes reading the chapter somewhat delicate, as it seems to dilute the meaning of both prior and posterior to the point of becoming irrelevant.
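For reference, the optimisation view alluded to here can be written as follows. The notation is mine and not necessarily that of the thesis: π is the prior, ℓ a loss, and q ranges over all probability distributions on the parameter space.

```latex
q^{*} \;=\; \arg\min_{q}\;
\left\{ \mathbb{E}_{q}\!\left[ \sum_{i=1}^{n} \ell(\theta, x_i) \right]
        + \mathrm{KL}\left( q \,\|\, \pi \right) \right\}
\qquad\Longrightarrow\qquad
q^{*}(\theta) \;\propto\; \pi(\theta)\,
\exp\!\left\{ -\sum_{i=1}^{n} \ell(\theta, x_i) \right\}
```

With ℓ(θ, x) = −log f(x | θ) this returns the standard posterior. As I understand it, generalised versions replace the loss, the Kullback-Leibler term, or the set of admissible q, and it is the last two changes that loosen the link between the minimiser and a prior-times-likelihood object.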

The last chapter on change-point models is quite alluring in that it capitalises on the previous developments to analyse a fairly realistic if traditional problem, applied to traffic in London, prior and posterior to the congestion tax. However, there is always an issue with robustness and outliers in that the notion is somewhat vague or informal. Things start clarifying at the end, but I find it surprising that conjugates are robust optimal solutions, since the usual folk theorem from the 1980s is that they are not robust.
