Archive for objective Bayes

a case for Bayesian deep learning

Posted in Books, pictures, Statistics, Travel, University life on September 30, 2020 by xi'an

Andrew Wilson wrote a piece about Bayesian deep learning last winter. Which I just read. It starts with the (posterior) predictive distribution being the core of Bayesian model evaluation or of model (epistemic) uncertainty.

“On the other hand, a flat prior may have a major effect on marginalization.”

Interesting sentence, as, from my viewpoint, using a flat prior is a no-no when running model evaluation since the marginal likelihood (or evidence) is no longer a probability density. (Check Lindley-Jeffreys’ paradox in this tribune.) The author then goes for an argument in favour of a Bayesian approach to deep neural networks for the reason that data cannot be informative on every parameter in the network, which should then be integrated out with respect to a prior. He also draws a parallel between deep ensemble learning, where random initialisations produce different fits, and posterior distributions, although the equivalent to the prior distribution in an optimisation exercise is somewhat vague.
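
To make the flat prior issue concrete, here is a minimal sketch on a toy normal-mean test, which is my own illustration and not an example from the paper. It assumes a point null θ=0 against θ~N(0,τ²), with the function names `bayes_factor_01` and `posterior_prob_null` made up for the occasion. As τ grows and the prior flattens, the Bayes factor drifts towards the null whatever the data, which is the Lindley-Jeffreys phenomenon.

```python
import math

def bayes_factor_01(xbar, n, sigma, tau):
    """Bayes factor of H0: theta = 0 against H1: theta ~ N(0, tau^2),
    given the mean xbar of n observations from N(theta, sigma^2)."""
    s2 = sigma ** 2 / n   # sampling variance of xbar
    m2 = tau ** 2 + s2    # marginal variance of xbar under H1
    # Ratio of the N(0, s2) and N(0, m2) densities at xbar, on the log scale
    log_b01 = 0.5 * math.log(m2 / s2) - 0.5 * xbar ** 2 * (1.0 / s2 - 1.0 / m2)
    return math.exp(log_b01)

def posterior_prob_null(xbar, n, sigma, tau, prior_null=0.5):
    """Posterior probability of H0 for a given prior weight on H0."""
    odds = bayes_factor_01(xbar, n, sigma, tau) * prior_null / (1.0 - prior_null)
    return odds / (1.0 + odds)

# Same data (a two standard error departure from zero), flatter and flatter priors
for tau in (1.0, 10.0, 1e3, 1e6):
    print(tau, posterior_prob_null(xbar=0.2, n=100, sigma=1.0, tau=tau))
```

The printed probabilities increase with τ, even though the observed mean sits two standard errors away from the null value.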

“…we do not need samples from a posterior, or even a faithful approximation to the posterior. We need to evaluate the posterior in places that will make the greatest contributions to the [posterior predictive].”

The paper also contains an interesting point distinguishing between priors over parameters and priors over functions, only the latter mattering for prediction. Which must be structured enough to compensate for the lack of data information about most aspects of the functions. The paper further discusses uninformative priors (over the parameters) in the O’Bayes sense as a default way to select priors. It is however unclear to me how this discussion accounts for the problems met in high dimensions by standard uninformative solutions. More aggressively penalising priors may be needed, as those found in high-dimensional variable selection. As in e.g. the 10⁷ dimensional space mentioned in the paper. Interesting read all in all!

O’Bayes 19/1 [snapshots]

Posted in Books, pictures, Statistics, University life on June 30, 2019 by xi'an

Yesterday’s tutorials of O’Bayes 2019 were poorly attended, despite being great entries into objective Bayesian model choice, recent advances in MCMC methodology, and the multiple layers of BART. For this I have to blame myself for sticking the beginning of O’Bayes too closely to the end of BNP, as only the most dedicated could manage the commute from Oxford to Coventry to reach Warwick in time. The first day of talks was nonetheless well attended, despite weekend commitments, conference fatigue, and perfect summer weather! Here are some snapshots from my bench (and apologies for not better covering the more theoretical talks I had trouble following, due to an early and intense morning swimming lesson! Like Steve Walker’s utility based derivation of priors that generalise maximum entropy priors. But being entirely independent from the model does not sound to me like such a desirable feature… And Natalia Bochkina’s Bernstein-von Mises theorem for a location scale semi-parametric model, including a clever construct of a mixture of two Dirichlet priors to achieve proper convergence.)

Jim Berger started the day with a talk on imprecise probabilities, involving the society for imprecise probability, which I discovered while reading Keynes’ book. It came with a neat resolution of the Jeffreys-Lindley paradox: when re-expressing the null as an imprecise null, the posterior probability of the null no longer converges to one, with a limit depending on the prior modelling, if involving a prior on the bias as well. Chris discussed the talk and mentioned a recent work with Edwin Fong on reinterpreting the marginal likelihood as exhaustive cross-validation, summing over all possible subsets of the data [using log marginal predictive].

Håvard Rue did a follow-up talk to his Valencià O’Bayes 2015 talk on PC-priors. With a pretty hilarious introduction on his difficulties with constructing priors and counseling students about their Bayesian modelling. With a list of principles and desiderata to define a reference prior. However, I somewhat disagree with his argument that the Kullback-Leibler distance from the simpler (base) model cannot be scaled, as it is essentially a log-likelihood. And it feels like multivariate parameters need some sort of separability to define distance(s) to the base model since the distance somewhat summarises the whole departure from the simpler model. (Håvard also joined my achievement of putting an ostrich in a slide!) In his discussion, Robin Ryder made a very pragmatic recap on the difficulties with constructing priors. And pointed out a natural link with ABC (which brings us back to Don Rubin’s motivation for introducing the algorithm as a formal thought experiment).
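
The reading of the marginal likelihood as exhaustive cross-validation mentioned in the discussion of Jim Berger’s talk can be checked numerically on a toy case. The sketch below is my own illustration and not necessarily the exact formulation of the work mentioned there: it assumes a Beta-Bernoulli model, and the function names are hypothetical. Averaging the chain rule over all orderings of the data means that, for each training size k, one averages the log predictive of a held-out point over all training subsets of size k, and the sum over k recovers the log evidence.

```python
import math
from itertools import combinations

def log_beta(a, b):
    """Logarithm of the Beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_marginal(y, a=1.0, b=1.0):
    """Log evidence of the binary sequence y under a Beta(a, b) prior."""
    s, n = sum(y), len(y)
    return log_beta(a + s, b + n - s) - log_beta(a, b)

def log_predictive(y_new, train, a=1.0, b=1.0):
    """Log posterior predictive of one held-out point given the training points."""
    p_one = (a + sum(train)) / (a + b + len(train))
    return math.log(p_one if y_new == 1 else 1.0 - p_one)

def exhaustive_cv_score(y, a=1.0, b=1.0):
    """Sum over training sizes k of the log predictive score averaged over
    every training subset of size k and every held-out point outside it."""
    n = len(y)
    total = 0.0
    for k in range(n):
        scores = []
        for train_idx in combinations(range(n), k):
            train = [y[i] for i in train_idx]
            for j in set(range(n)) - set(train_idx):
                scores.append(log_predictive(y[j], train, a, b))
        total += sum(scores) / len(scores)
    return total

y = [1, 0, 1, 1, 0, 1]  # made-up data for the illustration
print(log_marginal(y), exhaustive_cv_score(y))
```

Both printed values agree up to floating point error, since the data are exchangeable under this model.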

Sara Wade gave the final talk of the day about her work on Bayesian cluster analysis. Which discussion in Bayesian Analysis I alas missed. Cluster estimation, as mentioned frequently on this blog, is a rather frustrating challenge despite the simple formulation of the problem. (And I will not mention Larry’s tequila analogy!) The current approach is based on loss functions directly addressing the clustering aspect, integrating out the parameters. Which produces the interesting notion of neighbourhoods of partitions and hence credible balls in the space of partitions. It still remains unclear to me whether cluster estimation is at all achievable, since the partition space explodes with the sample size and hence makes the most probable cluster more and more unlikely in that space. Somewhat paradoxically, the paper concludes that estimating the cluster produces a more reliable estimator of the number of clusters than looking at the marginal distribution on this number. In her discussion, Clara Grazian also pointed out the ambivalent use of clustering, where the intended meaning somehow diverges from the meaning induced by the mixture model.
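
To give an idea of how fast the partition space explodes, the number of partitions of n items is the Bell number B(n). Here is a minimal sketch computing it via the Bell triangle (the function name `bell_numbers` is mine, and this is only an illustration of the growth, not part of the talk).

```python
def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)], the numbers of set partitions,
    computed with the Bell triangle."""
    bells = [1]   # B(0) = 1
    row = [1]     # first row of the triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row,
        # and each further entry adds the entry above to its left neighbour
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

for n, count in enumerate(bell_numbers(20)):
    print(n, count)
```

With only ten observations there are already 115975 possible partitions, which gives a feeling for how little posterior mass any single partition can carry for realistic sample sizes.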

O’Bayes 2019 has now started!

Posted in pictures, Running, Statistics, Travel, University life on June 28, 2019 by xi'an

The O’Bayes 2019 conference at Warwick University has now started, with about 100 participants meeting over four days (plus one of tutorials) in the Zeeman maths building of the University. Quite a change of location and weather when compared with the previous one in Austin. As an organiser I hope all goes well at the practical level and want to thank the other persons who helped me towards this goal, first and foremost Paula Matthews who solved web and lodging and planning issues all through these past months, as well as Mark Steel and Cristiano Villa. As a member of the scientific committee, I am looking forward to the talks and discussants along the coming four days, again hoping all speakers and discussants show up and are not hindered by travel or visa issues…

O’Bayes 2019 conference program

Posted in Kids, pictures, Statistics, Travel, University life on May 13, 2019 by xi'an

The full and definitive program of the O’Bayes 2019 conference in Warwick is now online. Including discussants for all papers. And the three [and free] tutorials on Friday afternoon, 28 June, on model selection (M. Barbieri), MCMC recent advances (G.O. Roberts) and BART (E.I. George). Registration remains open at the reduced rate and all conference participants can still send me poster submissions.

O’Bayes 2019: speakers, discussants, posters!

Posted in pictures, Statistics, University life on January 2, 2019 by xi'an

The program for the next O’Bayes conference in Warwick, 28 June-02 July, 2019, is now set. Speakers and discussants have been contacted by the scientific committee and accepted our invitation! As usual, there will be poster sessions on the nights of 29 and 30 June and the call is open for poster submissions, until January 31, to be sent to me as a one-page PDF document containing either the poster itself or its title, abstract and references. (Using my email address at either Dauphine or Warwick is fine. Or bayesianstatistics on gmail.)