Archive for robust Bayesian methods

21w5107 [½day 3]

Posted in pictures, Statistics, Travel, University life on December 2, 2021 by xi'an

Day [or half-day] three started without firecrackers and with David Rossell (formerly Warwick) presenting an empirical Bayes approach to generalised linear model choice with a high degree of confounding, using approximate Laplace approximations. With considerable improvements in the experimental RMSE. Making me feel sorry there was no apparent fully (and objective?) Bayesian alternative! (Two more papers on my reading list that I should have read way earlier!) Then Veronika Rockova discussed her work on approximate Metropolis-Hastings by classification. (With only a slight overlap with her One World ABC seminar.) Making me once more think of Geyer’s n⁰564 technical report, namely the estimation of a marginal likelihood by a logistic discrimination representation. Her ABC resolution replaces the tolerance step with an exponential of minus the estimated Kullback-Leibler divergence between the data density and the density associated with the current value of the parameter. (I wonder if there is a residual multiplicative constant there… Presumably not. Great idea!) The classification step needs to be run at every iteration, which could be sped up by subsampling.
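As a toy illustration of the classification trick (entirely my own sketch, with a plain logistic discriminator whose linear logit happens to suffice for this equal-variance Gaussian example, not Veronika’s implementation):

```python
import numpy as np

def fit_logistic(x, y, n_iter=3000, lr=0.1):
    """Logistic discrimination between the data sample x (label 1)
    and a simulated sample y (label 0), with features [z, 1]."""
    z = np.concatenate([x, y])
    t = np.concatenate([np.ones(len(x)), np.zeros(len(y))])
    F = np.stack([z, np.ones_like(z)], axis=1)
    w = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-F @ w))
        w += lr * F.T @ (t - p) / len(z)  # gradient ascent on the log-likelihood
    return w

def kl_hat(x, y):
    """With equal sample sizes, the optimal classifier logit equals log p(z)/q(z),
    so its average over the data sample estimates KL(p||q)."""
    w = fit_logistic(x, y)
    return float(np.mean(np.stack([x, np.ones_like(x)], axis=1) @ w))

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, 500)      # observed sample, p = N(0,1)
sim_good = rng.normal(0.0, 1.0, 500)  # simulations under a good parameter value
sim_bad = rng.normal(3.0, 1.0, 500)   # simulations under a poor parameter value

# exponential-of-minus-KL acceptance, replacing the ABC tolerance step
accept_good = np.exp(-max(kl_hat(data, sim_good), 0.0))
accept_bad = np.exp(-max(kl_hat(data, sim_bad), 0.0))
```

The acceptance probability stays near one when the simulated sample matches the data and collapses when it does not, mimicking a smoothed ABC acceptance without any tolerance ε.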

On the always fascinating theme of loss-based posteriors, à la Bissiri et al., Jack Jewson (formerly Warwick) presented his work on generalised Bayes and improper models (from Birmingham!). Using data to decide between model and loss, which sounds highly unorthodox! A first difficulty is that losses are unscaled. Or even not integrable after an exponential transform. Hence the notion of improper models. As in the case of Tukey’s robust loss, which is bounded by an arbitrary κ. Immediately I wondered whether the fact that the pseudo-likelihood does not integrate matters beyond the (obvious) absence of a normalising constant. And the fact that this is not a generative model. And the answer came a few slides later with the use of the Hyvärinen score. Rather than the likelihood score. Which can itself be turned into an H-posterior, very cool indeed! Although I wonder at the feasibility of finding an [objective] prior on κ.
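For concreteness, here is a minimal sketch of the improper-model point (my own toy code, with an arbitrary κ and a flat-prior location example, not Jack’s construction):

```python
import numpy as np

KAPPA = 4.0  # the arbitrary bound κ; an illustrative value, not a recommendation

def tukey_loss(u, k=KAPPA):
    """Tukey's biweight loss: quadratic-like near 0, constant k**2/6 beyond |u|=k."""
    u = np.asarray(u, dtype=float)
    rho = np.full(u.shape, k**2 / 6.0)
    inside = np.abs(u) <= k
    rho[inside] = (k**2 / 6.0) * (1.0 - (1.0 - (u[inside] / k) ** 2) ** 3)
    return rho

# the 'improper model' point: exp(-rho(u)) tends to the positive constant
# exp(-k**2/6), hence cannot be normalised into a proper density in u
tail = np.exp(-tukey_loss(np.array([100.0, 1000.0])))

# loss-based (unnormalised) posterior for a location parameter, flat prior:
rng = np.random.default_rng(1)
obs = np.concatenate([rng.normal(0.0, 1.0, 50), np.full(10, 20.0)])  # gross outliers
grid = np.linspace(-2.0, 22.0, 481)
log_post_tukey = np.array([-tukey_loss(obs - th).sum() for th in grid])
log_post_l2 = np.array([-0.5 * ((obs - th) ** 2).sum() for th in grid])
map_tukey = float(grid[np.argmax(log_post_tukey)])  # stays near the inlier centre
map_l2 = float(grid[np.argmax(log_post_l2)])        # dragged toward the outliers
```

Observations more than κ away contribute a constant κ²/6 to the loss, so the Tukey pseudo-posterior simply ignores them, while the squared-loss mode (the sample mean) is pulled toward the contamination.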

Rajesh Ranganath completed the morning session with a talk on [the difficulty of] connecting Bayesian models and complex prediction models. Using instead a game theoretic approach with Brier scores under censoring. While there was a connection with Veronika’s use of a discriminator as a likelihood approximation, I had trouble catching the overall message…
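The Brier score ingredient at least is elementary; a minimal sketch (the censoring adjustment below is the standard inverse-probability-of-censoring weighting of Graf et al., my guess at the ingredient rather than Rajesh’s actual construction):

```python
import numpy as np

def brier(prob, outcome):
    """Plain Brier score: mean squared error of probabilistic predictions."""
    prob, outcome = np.asarray(prob, float), np.asarray(outcome, float)
    return float(np.mean((prob - outcome) ** 2))

def brier_ipcw(prob, event, observed, g_hat):
    """Brier score at a fixed horizon under right-censoring, with normalised
    inverse-probability-of-censoring weights (a simplified sketch).
    prob: predicted P(event by the horizon); event: 1 if the event occurred;
    observed: 1 if the status at the horizon is known (0 if censored before);
    g_hat: estimated censoring-survival probability for each case."""
    prob, event = np.asarray(prob, float), np.asarray(event, float)
    w = np.asarray(observed, float) / np.asarray(g_hat, float)  # censored cases get weight 0
    return float(np.sum(w * (prob - event) ** 2) / np.sum(w))
```

Without censoring (all cases observed, unit weights) the weighted version reduces to the plain Brier score.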


Posted in Statistics on March 24, 2021 by xi'an

An AISTATS 2021 paper by Masahiro Fujisawa, Takeshi Teshima, Issei Sato and Masashi Sugiyama (RIKEN, Tokyo) just appeared on arXiv. (AISTATS 2021 is again virtual this year.)

“ABC can be sensitive to outliers if a data discrepancy measure is chosen inappropriately (…) In this paper, we propose a novel outlier-robust and computationally-efficient discrepancy measure based on the γ-divergence”

The focus is on measures of robustness for ABC distances, as those can be lethal if insufficient summarisation is used. (Note that a referenced paper by Erlis Ruli, Nicola Sartori and Laura Ventura from Padova appeared last year on robust ABC.) The current approach mixes the γ-divergence of Fujisawa and Eguchi with a k-nearest-neighbour density estimator. Which may not prove too costly, of order O(n log n), but may also be a poor, if robust, approximation, even though it is asymptotically unbiased and almost surely convergent. These properties are the ones established in the paper, which only demonstrates convergence in the sample size n to an ABC approximation with the true γ-divergence but with a fixed tolerance ε, whereas the most recent results are rather concerned with the rate of convergence of ε(n) to zero. (An extensive simulation section compares this approach with several ABC alternatives, incl. ours using the Wasserstein distance. If I read the comparison graphs properly, it does not look as if there is a huge discrepancy between the two approaches under no contamination.) Incidentally, the paper contains a substantial survey section and has a massive reference list, if missing the publication more than a year earlier of our Wasserstein paper in Series B.
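A crude 1-D sketch of the ingredients (a k-NN plug-in of the γ-cross-entropy; my own simplification rather than the authors’ estimator, and dropping the θ-free d_γ(p,p) term, which is irrelevant when comparing parameter values in ABC):

```python
import numpy as np

def knn_density(points, sample, k=5):
    """1-D k-NN density estimate of the sample's density, evaluated at points:
    k neighbours in an interval of half-width r_k around each point."""
    pts = np.asarray(points, float)[:, None]
    smp = np.asarray(sample, float)[None, :]
    r_k = np.sort(np.abs(pts - smp), axis=1)[:, k - 1]  # k-th nearest-neighbour distance
    return k / (len(sample) * 2.0 * r_k)

def gamma_discrepancy(obs, sim, gamma=0.5, k=5):
    """Monte Carlo γ-cross-entropy of the simulated density against the data,
    with the simulated density replaced by its k-NN estimate."""
    q_at_obs = knn_density(obs, sim, k)
    q_at_sim = knn_density(sim, sim, k + 1)  # k+1: crudely skip the zero self-distance
    cross = -np.log(np.mean(q_at_obs ** gamma)) / gamma          # -(1/γ) log ∫ p qᵞ
    norm = np.log(np.mean(q_at_sim ** gamma)) / (1.0 + gamma)    # (1/(1+γ)) log ∫ q¹⁺ᵞ
    return float(cross + norm)

rng = np.random.default_rng(2)
obs = rng.normal(0.0, 1.0, 300)        # observed sample
sim_close = rng.normal(0.0, 1.0, 300)  # simulations from a matching density
sim_far = rng.normal(5.0, 1.0, 300)    # simulations from a distant density
d_close = gamma_discrepancy(obs, sim_close)
d_far = gamma_discrepancy(obs, sim_far)
```

The O(n log n) cost quoted in the paper comes from computing the nearest-neighbour distances with a tree structure rather than the brute-force pairwise sort above.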

end-to-end Bayesian learning [CIRM]

Posted in Books, Kids, Mountains, pictures, Running, Statistics, University life on February 1, 2021 by xi'an

Next Fall, there will be a workshop at CIRM, Luminy, Marseilles, on Bayesian learning. It takes place 22-29 October 2021 on this wonderful campus at the border with the beautiful Parc National des Calanques, in a wonderfully renovated CIRM building and involves friends and colleagues of mine as organisers and plenary speakers. (I am not involved!, but plan to organise a scalable MCMC workshop there the year after!) The conference is well-supported and the housing fees will be minimal since the centre is also subsidized by CNRS. The deadline for contributed talks and posters is 22 March, while it is 15 June for registration. Hopefully by this time the horizon will have cleared up enough to consider traveling and meeting again. Hopefully. (In which case I will miss this wonderful conference due to other meeting and teaching commitments in the Fall.)
