Archive for the University life Category

Scott Sisson’s ABC seminar in Paris [All about that Bayes]

Posted in pictures, Statistics, Travel, University life on January 20, 2020 by xi'an

At the “All about that Bayes” seminar tomorrow (Tuesday 21 at 3 p.m., room 42, AgroParisTech, 16 rue Claude Bernard, Paris 5ème), Scott Sisson, from the School of Mathematics and Statistics at UNSW and visiting Paris-Dauphine this month, will give a talk on

Approximate posteriors and data for Bayesian inference

Abstract
For various reasons, including large datasets and complex models, approximate inference is becoming increasingly common. In this talk I will provide three vignettes of recent work. These cover a) approximate Bayesian computation for Gaussian process density estimation, b) likelihood-free Gibbs sampling, and c) MCMC for approximate (rounded) data.

an elegant sampler

Posted in Books, Kids, R, University life on January 15, 2020 by xi'an

Following an X validated question on how to simulate a multinomial with a fixed average, W. Huber produced a highly elegant and efficient solution with the compact R code

tabulate(sample.int((k-1)*n, s-n) %% n + 1, n) + 1

where k is the number of classes, n the number of draws, and s is equal to n times the fixed average. The R function sample.int is an alternative to sample that seems faster. Breaking the outcome of

sample.int((k-1)*n, s-n)

as the nonzero positions of a (k-1) x n matrix and adding a row of n 1’s leads to a simulation of integers between 1 and k by counting the 1’s in each of the n columns, which is what the accompanying picture represents, with the colour code added after counting the 1’s. Since there are s 1’s in this matrix, the sum is automatically equal to s. Since the s-n positions are chosen uniformly over the (k-1) x n locations, the outcome is uniform. The rest of the R code is a brutally efficient way to translate the idea into a function. (By comparison, I brute-forced the question by suggesting a basic Metropolis algorithm.)
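To make the trick easier to play with, here is a minimal sketch wrapping the one-liner into an R function and checking the fixed-sum constraint; the function name rmultifix and the sanity checks are mine, not part of the original X validated answer.

# a minimal sketch wrapping W. Huber's one-liner; rmultifix is my own name for it
rmultifix <- function(n, k, s) {
  # n draws in {1,...,k} with sum s, i.e. with fixed average s/n
  stopifnot(s >= n, s <= k * n)
  tabulate(sample.int((k - 1) * n, s - n) %% n + 1, n) + 1
}

x <- rmultifix(n = 10, k = 5, s = 30)
sum(x) # always returns s = 30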

BAYSM 2020, Kunming, China [reposted]

Posted in Kids, Mountains, pictures, Statistics, Travel, University life on January 13, 2020 by xi'an

The 5th Bayesian Young Statisticians Meeting, BAYSM 2020, will take place in Kunming, China (June 26-27, 2020) as a satellite to the ISBA 2020 world meeting. BAYSM is the official conference of j-ISBA, the junior section of the International Society for Bayesian Analysis. It is intended for Ph.D. students, M.S. students, post-docs, and young and junior researchers working in the field of Bayesian statistics, providing an opportunity to connect with the Bayesian community at large. Senior discussants will be present at each session, providing participants with hints, suggestions, and comments on their work. Distinguished professors of the Bayesian community will also participate as keynote speakers, making for an altogether exciting program.

Registration is now open (https://baysm2020.uconn.edu/registration) and will be available with an early bird discount until May 1, 2020. The event will be hosted at the Science Hall of Yunnan University (Kunming, China), right before the ISBA 2020 world meeting. BAYSM 2020 will include social events, providing the opportunity to get to know other junior Bayesians.

Young researchers interested in giving a talk or presenting a poster are invited to submit an extended abstract by March 29, 2020. All instructions for abstract submission are available at https://baysm2020.uconn.edu/call-dates

Thanks to the generous support of ISBA, a number of travel awards are available to support young researchers.

Keynote speakers:
Maria De Iorio
David Dunson
Sylvia Frühwirth-Schnatter
Xuanlong Nguyen
Amy Shi
Jessica Utts

Confirmed discussants:
Jingheng Cai
Li Ma
Fernando Quintana
Francesco Stingo
Anmin Tang
Yemao Xia

While the meeting is organized for and by junior Bayesians, attendance is open to anyone who may be interested. For more information, please visit the conference website: https://baysm2020.uconn.edu/

postdoc at Warwick on robust SMC [call]

Posted in Kids, pictures, R, Statistics, University life on January 11, 2020 by xi'an

Here is a call for a research fellow at the University of Warwick to work with Adam Johansen and Théo Damoulas on the EPSRC- and Lloyd's Register Foundation-funded project “Robust Scalable Sequential Monte Carlo with application to Urban Air Quality”. To quote

The position will be based primarily at the Department of Statistics of the University of Warwick. The post holder will work closely in collaboration with the rest of the project team and another postdoctoral researcher to be recruited shortly to work within the Data Centric Engineering programme at the Alan Turing Institute in London. The post holder will be expected to visit the Alan Turing Institute regularly.

Candidates with strong backgrounds in the mathematical analysis of stochastic algorithms or sequential Monte Carlo methods are particularly encouraged to apply. Closing date is 19 Jan 2020.

BayesComp’20

Posted in Books, pictures, Statistics, Travel, University life on January 10, 2020 by xi'an

First, I really have to congratulate my friend Jim Hobert for a great organisation of the meeting, adopting my favourite minimalist principles (no name tag, no “goodies” apart from the conference schedule, no official talks). Without any pretense at objectivity, I also very much appreciated the range of topics and the sweet frustration of having to choose between two or three sessions each time. Here are some notes taken during some of the talks (with no implicit implication for the talks not mentioned, re. the above frustration! as well as very short nights making sudden lapses in concentration highly likely).

On Day 1, Paul Fearnhead’s inaugural plenary talk was on continuous time Monte Carlo methods, mostly bouncy particle and zig-zag samplers, with a detailed explanation on the simulation of the switching times which likely brought the audience up to speed even if they had never heard of them. And an opening on PDMPs used as equivalents to reversible jump MCMC, reminding me of the continuous time (point process) solutions of Matthew Stephens for mixture inference (and of Preston, Ripley, Møller).
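As an illustration of how such switching times can be simulated, here is a hedged one-dimensional sketch of a zig-zag sampler for a standard Normal target, where the event rate integrates in closed form; it is only meant to convey the mechanism, not to reproduce any of the implementations discussed in the talk.

# hedged sketch: 1-D zig-zag sampler for a N(0,1) target; the switching rate
# max(0, v*(x + v*u)) integrates in closed form, so event times are exact
zigzag1d <- function(t_end = 1e3) {
  x <- 0; v <- 1; t <- 0
  times <- 0; positions <- 0
  while (t < t_end) {
    a <- v * x                                   # rate at lag u is max(0, a + u)
    tau <- -a + sqrt(max(a, 0)^2 + 2 * rexp(1))  # invert the integrated rate
    x <- x + v * tau                             # deterministic move at unit speed
    t <- t + tau
    v <- -v                                      # flip the velocity at the event
    times <- c(times, t); positions <- c(positions, x)
  }
  data.frame(time = times, position = positions) # trajectory is linear in between
}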

The same morning I heard of highly efficient techniques for handling very large matrices and p>n variable selection from Akihiko Nishimura, and of a delayed acceptance ABC using a cheap proxy model from Ruth Baker. Somewhat different from indirect inference. I found the reliance on ESS somewhat puzzling, given the intractability of the likelihood (and the low reliability of the frequency estimate) and the lack of connection with the “real” posterior. At the same ABC session, Umberto Picchini spoke on a joint work with Richard Everitt (Warwick) linking ABC and pseudo-marginal MCMC by bootstrap. Actually, the notion of an ABC likelihood was already proposed as pseudo-marginal ABC by Anthony Lee, Christophe Andrieu and Arnaud Doucet in the discussion of Fearnhead and Prangle (2012), but I wonder at the focus on unbiasedness when the quantity is not the truth, i.e. the “real” likelihood. It would seem more appropriate to attempt better kernel estimates of the distribution of the summary itself. The same session also involved David Frazier, who linked our work on ABC for misspecified models with an on-going investigation of synthetic likelihood.
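For reference, the generic delayed-acceptance step with a cheap proxy can be sketched as below; p_exact and p_approx are hypothetical stand-ins for the expensive target and its proxy, and this is the textbook two-stage construction rather than anything specific to the talks.

# hedged sketch of a delayed-acceptance Metropolis step with a symmetric
# random-walk proposal: the cheap proxy p_approx screens the proposal before
# the expensive target p_exact is evaluated (both are hypothetical functions)
da_step <- function(x, p_exact, p_approx, sd = 1) {
  y <- x + rnorm(length(x), sd = sd)
  # stage 1: accept/reject using the proxy alone
  if (runif(1) > min(1, p_approx(y) / p_approx(x))) return(x)
  # stage 2: correct by the exact-over-proxy ratio, preserving detailed balance
  if (runif(1) < min(1, (p_exact(y) * p_approx(x)) / (p_exact(x) * p_approx(y)))) y else x
}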

Later, there was a surprise occurrence of the Bernoulli factory in a talk by Radu Herbei on Gaussian process priors with accept-reject algorithms, leading to exact MCMC, although the computing implementation remains uncertain. And several discussions during the poster session, incl. one on the planning of a 2021 workshop in Oaxaca centred on objective Bayes advances as we received acceptance of our proposal by BIRS today!

On Day 2, David Blei gave a plenary introduction to variational Bayes inference and latent Dirichlet allocation, somewhat too introductory for my taste, although other participants enjoyed the exposition. He also mentioned a recent JASA paper on the frequentist consistency of variational Bayes that I should check. Speaking later with PhD students, I found they had really enjoyed this opening on an area they did not know that well.

Kengo Kamatani (whom I visited last summer) gave a talk on improved ergodicity rates for heavy-tailed targets and Crank-Nicolson modifications to the random-walk proposal (which use an AR(1) representation instead of the random walk; see the textbook sketch below), with the clever idea of adding the scale of the proposal as an extra parameter with a prior of its own, gaining one order of magnitude in the convergence speed (i.e. from d to 1 and from d² to d, where d is the dimension), which is quite impressive (and just published in JAP).

Veronika Rockova linked Bayesian variable selection and machine learning via ABC, with conditions on the prior for model consistency. And a novel approach using part of the data to learn an ABC partial posterior, which reminded me of the partial Bayes factors of the 1990’s, although it is presumably unrelated. And a replacement of the original rejection ABC via multi-armed bandits, where each variable is represented by an arm, called ABC Bayesian forests. Recalling the simulation trick behind Thompson’s approach, reproduced for the inclusion or exclusion of variates and producing a fixed estimate for the (marginal) inclusion probabilities, which makes it sound like a prior-feedback form of empirical Bayes. This was followed by a talk from Gregor Kastner on MCMC handling of large time series with specific priors and a massive number of parameters.
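Going back to the Crank-Nicolson-type proposal mentioned above, it can be sketched in its basic (preconditioned Crank-Nicolson) form as follows; this is not Kamatani's modified algorithm, merely the standard pCN move under a N(0, I) prior, with loglik a hypothetical log-likelihood function.

# hedged sketch of a preconditioned Crank-Nicolson (pCN) move under a N(0, I)
# prior: the AR(1)-type proposal leaves the prior invariant, so the Hastings
# ratio reduces to the log-likelihood difference (loglik is a placeholder)
pcn_step <- function(x, loglik, beta = 0.2) {
  y <- sqrt(1 - beta^2) * x + beta * rnorm(length(x))
  if (log(runif(1)) < loglik(y) - loglik(x)) y else x
}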

The afternoon also had a wealth of exciting talks and missed opportunities (in the other sessions!). It ended up with a strong if unintended French bias, since I listened to Christophe Andrieu, Gabriel Stoltz, Umut Simsekli, and Manon Michel on different continuous-time processes, with Umut linking GANs, multidimensional optimal transport, sliced-Wasserstein, generative models, and new stochastic differential equations. Manon Michel gave a highly intuitive talk on creating non-reversibility, getting rid of refreshment rates in PDMPs to kill any form of reversibility.

Hastings 50 years later

Posted in Books, pictures, Statistics, University life on January 9, 2020 by xi'an

What is the exact impact of the Metropolis-Hastings algorithm on the field of Bayesian statistics? And what are the new tools of the trade? What I personally find the most relevant and attractive element in a review on the topic is the current role of this algorithm, rather than its past (his)story, since many such reviews have already appeared and will likely continue to appear. What matters most imho is how much the Metropolis-Hastings algorithm signifies for the community at large, especially beyond academia. Is the availability or unavailability of software like BUGS or Stan a help or a hindrance? Was Hastings’ paper the start of the era of approximate inference or the end of exact inference? Are intrinsic features of the algorithm, like Markovianity, a fundamental cause of an eventual extinction, because of the ensuing time constraints, the lack of practical guarantees of convergence, and the illusion of a fully automated version? Or are emerging solutions like unbiased MCMC and asynchronous algorithms a beacon of hope?

In their recent Biometrika paper, Dunson and Johndrow (2019) wrote a celebration of Hastings’ 1970 Biometrika paper, where they cover adaptive Metropolis (Haario et al., 1999; Roberts and Rosenthal, 2005) and the importance of gradient-based versions toward universal algorithms (Roberts and Tweedie, 1995; Neal, 2003), discussing the advantages of HMC over Langevin versions. They also recall the significant step represented by Peter Green’s (1995) reversible jump algorithm for multimodal and multidimensional targets, as well as tempering (Miasojedow et al., 2013; Woodard et al., 2009). They further cover intractable likelihood cases within MCMC (rather than ABC), with the use of auxiliary variables (Friel and Pettitt, 2008; Møller et al., 2006) and pseudo-marginal MCMC (Andrieu and Roberts, 2009; Andrieu and Vihola, 2016). They naturally insist upon the need to handle huge datasets, high-dimensional parameter spaces, and other scalability issues, with links to unadjusted Langevin schemes (Bardenet et al., 2014; Durmus and Moulines, 2017; Welling and Teh, 2011). Similarly, Dunson and Johndrow (2019) discuss recent developments towards parallel MCMC and non-reversible schemes such as PDMPs as highly promising, with a concluding section on the challenges of automatising and robustifying the said procedures much further, if only to reach a wider range of applications. The paper is well written and contains a wealth of directions and reflections, including those in my introduction above. Here are some mostly disconnected directions I would have liked to see covered or covered more:

  1. convergence assessment today, e.g. the comparison of various approximation schemes
  2. Rao-Blackwellisation and other post-processing improvements
  3. other approximate inference tools than the pseudo-marginal MCMC
  4. importance of the parameterisation of the problem for convergence
  5. dimension issues and connection with quasi-Monte Carlo
  6. constrained spaces of measure zero, as for instance matrix distributions imposing zeros outside a diagonal band
  7. given the rise of the machine(-learners), are exploratory and intrinsically slow algorithms like MCMC doomed or can both fields feed one another? The section on optimisation could be expanded in that direction
  8. the wasteful nature of the random walk feature of MCMC algorithms, as opposed to non-reversible kernels like HMC and other PDMPs, missing from the gradient based methods section (and can we once again learn from physicists?)
  9. finer convergence issues and hence inference difficulties with complex MCMC algorithms like Gibbs samplers with incompatible conditionals
  10. use of the Hastings ratio in other algorithms like ABC or EP (in link with the section on generalised Bayes)
  11. adapting Metropolis-Hastings methods for emerging computing tools like GPUs and quantum computers

Or possibly covered less, namely data augmentation, which is put forward although it is but a special case of auxiliary variables, as in slice sampling and in the earlier physics literature. For instance, neither probit nor logistic regression truly requires data augmentation, and both are more toy examples than really challenging applications. The approach of Carlin & Chib (1995) is another illustration, which has met with recent interest despite requiring heavy calibration (just like RJMCMC). There is as well a somewhat awkward opposition between Gibbs and Hastings, in that I am not convinced that Gibbs does not remain ultimately necessary to handle high-dimension problems, in the sense that the alternative solutions like Langevin, HMC, or PDMP, or…, rely on Euclidean assumptions for the entire vector, while a direct product of Euclidean structures may prove more adequate.
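To illustrate the probit point, here is a minimal sketch of a plain random-walk Metropolis sampler operating directly on the probit likelihood, with no data augmentation whatsoever; the Normal prior and the tuning scale are arbitrary choices of mine.

# hedged sketch: Bayesian probit regression by random-walk Metropolis, without
# the Albert-Chib data augmentation; prior scale and proposal scale are arbitrary
probit_rwmh <- function(y, X, n_iter = 1e4, scale = 0.1, prior_sd = 10) {
  p <- ncol(X)
  lpost <- function(b)                         # log-posterior up to a constant
    sum(dbinom(y, 1, pnorm(drop(X %*% b)), log = TRUE)) +
      sum(dnorm(b, 0, prior_sd, log = TRUE))
  beta <- rep(0, p); cur <- lpost(beta)
  draws <- matrix(NA_real_, n_iter, p)
  for (i in seq_len(n_iter)) {
    prop <- beta + scale * rnorm(p)            # symmetric random-walk proposal
    lp <- lpost(prop)
    if (log(runif(1)) < lp - cur) { beta <- prop; cur <- lp }  # Hastings ratio
    draws[i, ] <- beta
  }
  draws
}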

Panch at the helm!

Posted in pictures, Travel, University life on January 8, 2020 by xi'an

Reading, somewhat by chance, a Nature article on the new Director of the National Science Foundation (NSF) nominated by Trump (and yet to be confirmed by the Senate), I found that his name, Sethuraman Panchanathan, was that of a friend of my wife from 30⁺ years ago, when they were both graduate students in image processing at the University of Ottawa, Department of Electrical Engineering… And looking further into the matter, I realised that this was indeed the very friend we knew from that time, with whom we shared laughs, dinners, and a few day trips around Ottawa! While this is not the ultimate surprise, given that science administration is usually run by scientists taken from a population pool that is not that large, as exemplified by earlier cases at the national or European level where I had some acquaintance with a then senior officer, it is nonetheless striking (and fun) to hear of a friend moving to a high-visibility position after such a long gap. (When comparing the NSF with the ERC, the European Research Council, whose current director, French mathematician Jean-Pierre Bourguignon, also appeared in a recent Nature article, I was surprised to see that the ERC budget was more than twice the NSF budget.) Well, good luck to him in sailing these highly political waters!