Archive for Tokyo

Bayesian learning

Posted in Statistics on May 4, 2023 by xi'an

“…many well-known learning-algorithms, such as those used in optimization, deep learning, and machine learning in general, can now be derived directly following the above scheme using a single algorithm”

The One World ABC webinar today was delivered by Emtiyaz Khan (RIKEN), about the Bayesian learning rule, following Khan and Rue's 2021 arXival on Bayesian learning. (It had a great intro featuring a video of the speaker's daughter learning about the purpose of a ukulele in her first year!) The paper argues for a Bayesian interpretation/version of gradient descent algorithms, starting with Zellner's (1988, the year I first met him!) identity that the posterior is the solution to

\min_q \mathbb E_q[\ell(\theta,x)] + KL(q||\pi)

where ℓ is the likelihood and π the prior. The identity can be generalised by replacing the likelihood with an arbitrary loss function (possibly dependent on the data) and by restricting the posterior to an exponential family, just as in variational Bayes, ending up with a posterior adapted to this target (in the KL sense). The optimal hyperparameter or pseudo-hyperparameter of this approximation can be recovered by a gradient algorithm, which also recovers stochastic gradient and Newton's methods as special cases. While constructing a prior out of a loss function would have pleased the late Herman Rubin, this is not the case here, but rather an approach to deriving a generalised Bayes distribution within a parametric family, including mixtures of Gaussians. At some point in the talk, the uncertainty endemic to the Bayesian approach seeped back into the picture, but since most of the intuition came from machine learning, I was somewhat lost as to the nature of this uncertainty.
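As a minimal numerical check of Zellner's identity (my own toy sketch, not from the talk): for a conjugate Gaussian model xᵢ ~ N(θ,1) with a N(0,1) prior, plain gradient descent on the objective above over a Gaussian q = N(m, s²) should recover the exact posterior mean and variance.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=20)   # data from N(theta, 1)
n = len(x)

# Objective: E_q[negative log-likelihood] + KL(q || N(0,1)) for q = N(m, s^2),
# available in closed form for this conjugate model; gradients taken analytically.
m, log_s = 0.0, 0.0
lr = 0.01
for _ in range(5000):
    s = np.exp(log_s)
    grad_m = -(x - m).sum() + m          # d/dm of E_q[nll] + KL
    grad_s = n * s + s - 1.0 / s         # d/ds of E_q[nll] + KL
    m -= lr * grad_m
    log_s -= lr * grad_s * s             # chain rule for the log-scale parameter

# exact conjugate posterior: N(sum(x)/(n+1), 1/(n+1))
post_mean = x.sum() / (n + 1)
post_var = 1.0 / (n + 1)
```

With enough iterations, (m, exp(log_s)²) matches (post_mean, post_var) to several decimal places, illustrating that the KL-regularised minimisation has the exact posterior as its solution.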


the Bayesian learning rule [One World ABC ‘minar, 27 April]

Posted in Books, Statistics, University life on April 24, 2023 by xi'an

The next One World ABC seminar is taking place (on-line, requiring pre-registration) on 27 April, 9:30am UK time, with Mohammad Emtiyaz Khan (RIKEN-AIP, Tokyo) speaking about the Bayesian learning rule:

We show that many machine-learning algorithms are specific instances of a single algorithm called the Bayesian learning rule. The rule, derived from Bayesian principles, yields a wide range of algorithms from fields such as optimization, deep learning, and graphical models. This includes classical algorithms such as ridge regression, Newton's method, and the Kalman filter, as well as modern deep-learning algorithms such as stochastic-gradient descent, RMSprop, and Dropout. The key idea in deriving such algorithms is to approximate the posterior using candidate distributions estimated by using natural gradients. Different candidate distributions result in different algorithms, and further approximations to natural gradients give rise to variants of those algorithms. Our work not only unifies, generalizes, and improves existing algorithms, but also helps us design new ones.
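To illustrate one instance of the rule (my own toy sketch based on the abstract, with made-up numbers): with a full Gaussian candidate distribution and unit learning rate, the natural-gradient update on a quadratic loss reduces to Newton's method and lands on the minimiser in a single step.

```python
import numpy as np

# quadratic loss l(t) = 0.5 t'At - b't, with known minimiser A^{-1} b
A = np.array([[3.0, 1.0], [1.0, 2.0]])      # Hessian (positive definite)
b = np.array([1.0, -2.0])
grad = lambda t: A @ t - b
hess = lambda t: A

theta = np.zeros(2)       # mean of the Gaussian candidate
P = np.eye(2)             # precision of the Gaussian candidate
rho = 1.0                 # learning rate; rho = 1 recovers Newton's method
for _ in range(5):
    P = (1 - rho) * P + rho * hess(theta)                  # precision update
    theta = theta - rho * np.linalg.solve(P, grad(theta))  # mean update
```

For rho < 1 the same two lines give a damped, averaged-Hessian variant, which is one way to read the abstract's claim that further approximations yield variants of the classical algorithms.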

Concentration and robustness of discrepancy-based ABC [One World ABC ‘minar, 28 April]

Posted in Statistics, University life on April 15, 2022 by xi'an

Our next speaker at the One World ABC Seminar will be Pierre Alquier, who will talk about “Concentration and robustness of discrepancy-based ABC”, on Thursday 28 April, at 9:30am UK time, with the abstract reported below.
Approximate Bayesian Computation (ABC) typically employs summary statistics to measure the discrepancy between the observed data and the synthetic data generated from each proposed value of the parameter of interest. However, finding good summary statistics (that are close to sufficiency) is non-trivial for most models for which ABC is needed. In this paper, we investigate the properties of ABC based on integral probability semi-metrics, including MMD and Wasserstein distances. We exhibit conditions ensuring the contraction of the approximate posterior. Moreover, we prove that MMD with an adequate kernel leads to very strong robustness properties.
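A toy rejection-ABC sketch with an MMD discrepancy (my own illustration of the setting, not the paper's procedure; the Gaussian kernel bandwidth and the 5% acceptance rate are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def mmd2(x, y, bw=1.0):
    # biased (V-statistic) estimate of squared MMD with a Gaussian kernel
    k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * bw ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# observed sample from N(theta0, 1); theta0 is what ABC tries to recover
theta0 = 2.0
x_obs = rng.normal(theta0, 1.0, size=100)

# rejection ABC: simulate from the prior, keep the 5% closest in MMD
thetas = rng.uniform(-5.0, 5.0, size=1000)
dists = np.array([mmd2(x_obs, rng.normal(t, 1.0, size=100)) for t in thetas])
accepted = thetas[dists <= np.quantile(dists, 0.05)]
# accepted values concentrate around theta0
```

No summary statistics are chosen here: the discrepancy compares the full empirical distributions, which is exactly the setting the abstract analyses for contraction and robustness.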

a journal of the plague year² [reopenings]

Posted in Books, Kids, pictures, Travel, University life, Wines on September 30, 2021 by xi'an

Returned to some face-to-face teaching at Université Paris Dauphine for the new semester. With the students having to be frequently reminded to keep their face masks on (yes, the nose is part of the face and needs to be covered!). I do not understand why the COVID pass did not apply to universities as well. I also continued an on-line undergrad lecture in mathematical statistics, as I found that the amount of information conveyed to students this way was superior to blackboard teaching. (I actually gave some of these lectures in a uni amphitheatre, to leave the students free to choose, but fewer than 20% showed up.)

Read the very last volume of the Witcher. With a sense of relief that it was over, even though the plot and the writing were altogether pleasant… And Naomi Novik's Uprooted, with a permanent feeling of amazement at this novel being praised or awarded anything. Once more, I had missed that it was a YA [but not too young!] novel. Still, so many things go wrong, from the overly obtuse main character, to the transparent plot, to the highly questionable romantic affair between the 100⁺ year old wizard and the 17 year old he more or less ravished from friends and family, to the poor construction of the magic system, and to the (spoiler alert!) rosy ending. As I read the book over two sleepless nights, not much time was lost. And it had some page-turning qualities. But I'd rather have slept better!

Watched Kate, thinking it was a Japanese film, but quickly found to my sorrow that it was not. Not Japanese in the least, except for taking place in Tokyo and involving cartoonesque yakuza. To quote the NYT, “as cheap as a whiff of a green tea and musk cologne called Tokyo wafting over a department store counter”. Simply terrible, even lacking the pretense of story distanciation found in Kill Bill… And then chanced upon Time and Tide, a 2000 Hong Kong film, a much better distanced action picture, with enough ellipses and plenty of second-degree dialogue, some of it mixing Cantonese and Portuguese, plus highly original central male and female characters. I wonder whether the same could be filmed today, given the chokehold of the CCP on Hong Kong society and the growing censorship of films there.

Had a great month with our garden tomatoes, as we ate most of them. With a dry spell that stopped the spread of mildew and the aggression of slugs. And had a steady flow of strawberries, a second harvest that is not yet over. And more recently (late) figs, although I bring most of them to the department. The fig harvest seems less plentiful than last year's… The final product of the garden will be a collection of huge butternuts that spontaneously grew out of last year's seeds.

approximate Bayesian inference [survey]

Posted in Statistics on May 3, 2021 by xi'an

In connection with the special issue of Entropy I mentioned a while ago, Pierre Alquier (formerly of CREST) has written an introduction to the topic of approximate Bayesian inference that is worth advertising (and is freely available as well). Its reference list is particularly relevant. (The deadline for submissions is 21 June.)
