Archive for ABC

postdoctoral research position

Posted in Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , on April 27, 2023 by xi'an

Through the ERC Synergy grant OCEAN (On intelligenCE And Networks: Synergistic research in Bayesian Statistics, Microeconomics and Computer Sciences), I am seeking one postdoctoral researcher with an interest in Bayesian federated learning, distributed MCMC, approximate Bayesian inference, and data privacy.

The project is based at Université Paris Dauphine, on the new PariSanté Campus.  The postdoc will join the OCEAN teams of researchers directed by Éric Moulines and Christian Robert to work on the above themes with multiple focus from statistical theory, to Bayesian methodology, to algorithms, to medical applications.

Qualifications

The candidate should hold a doctorate in statistics or machine learning, with demonstrated skills in Bayesian analysis and Monte Carlo methodology, a record of publications in these domains, and an interest in working as part of an interdisciplinary international team. Scientific maturity and research autonomy are a must for applying.

Funding

Besides a 2 year postdoctoral contract at Université Paris Dauphine (with possible extension for one year), at a salary of 31K€ per year, the project will fund travel to OCEAN partners’ institutions (University of Warwick or University of Berkeley) and participation to yearly summer schools. University benefits are attached to the position and no teaching duty is involved, as per ERC rules.

The postdoctoral work will begin 1 September 2023.

Application Procedure

To apply, preferably before 31 May, please send the following in one pdf to Christian Robert (bayesianstatistics@gmail.com).

  • a letter of application,
  • a CV,
  • letters of recommendation sent directly by recommenders

the Bayesian learning rule [One World ABC’minar, 27 April]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , on April 24, 2023 by xi'an

The next One World ABC seminar is taking place (on-line, requiring pre-registration) on 27 April, 9:30am UK time, with Mohammad Emtiyaz Khan (RIKEN-AIP, Tokyo) speaking about the Bayesian learning rule:

We show that many machine-learning algorithms are specific instances of a single algorithm called the Bayesian learning rule. The rule, derived from Bayesian principles, yields a wide-range of algorithms from fields such as optimization, deep learning, and graphical models. This includes classical algorithms such as ridge regression, Newton’s method, and Kalman filter, as well as modern deep-learning algorithms such as stochastic-gradient descent, RMSprop, and Dropout. The key idea in deriving such algorithms is to approximate the posterior using candidate distributions estimated by using natural gradients. Different candidate distributions result in different algorithms and further approximations to natural gradients give rise to variants of those algorithms. Our work not only unifies, generalizes, and improves existing algorithms, but also helps us design new ones.

ABC with privacy

Posted in Books, Statistics with tags , , , , , , , , on April 18, 2023 by xi'an


I very recently read a  2021 paper by Mijung Park, Margarita Vinaroz, and Wittawat Jitkrittum on running ABC while ensuring data privacy (published in Entropy).

“…adding noise to the distance computed on the real observations and pseudo-data suffices the privacy guarantee of the resulting  posterior samples”

For ABC tolerance, they use maximum mean discrepancy (MMD) and for privacy the standard if unconvincing notion of differential privacy, defined by ensuring an upper bound on the amount of variation in the probability ratio when replacing/removing/adding an observation. (But not clearly convincing users their data is secure.)

While I have no reservation about the validation of the double-noise approach, I find it surprising that noise must be (twice) added when vanilla ABC is already (i) noisy, since based on random pseudo-data, and (ii) producing only a sample from an approximate posterior instead of returning an exact posterior. My impression indeed was that ABC should be good enough by itself to achieve privacy protection. In the sense that the accepted parameter values were those that generated random samples sufficiently close to the actual data, hence not only compatible with the true data, but also producing artificial datasets that are close enough to the data. Presumably these artificial datasets should not be produced as the intersection of their ε neighbourhoods may prove enough to identify the actual data. (The proposed algorithm does return all generated datasets.) Instead the supported algorithm involves randomisation of both tolerance ε and distance ρ to the observed data (with the side issue that they may become negative since the noise is Laplace).

[A]ABC in Hawai’i

Posted in Statistics with tags , , , , , , , , on April 6, 2023 by xi'an

BayesComp²³ [aka MCMski⁶]

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , on March 20, 2023 by xi'an

The main BayesComp meeting started right after the ABC workshop and went on at a grueling pace, and offered a constant conundrum as to which of the four sessions to attend, the more when trying to enjoy some outdoor activity during the lunch breaks. My overall feeling is that it went on too fast, too quickly! Here are some quick and haphazard notes from some of the talks I attended, as for instance the practical parallelisation of an SMC algorithm by Adrien Corenflos, the advances made by Giacommo Zanella on using Bayesian asymptotics to assess robustness of Gibbs samplers to the dimension of the data (although with no assessment of the ensuing time requirements), a nice session on simulated annealing, from black holes to Alps (if the wrong mountain chain for Levi), and the central role of contrastive learning à la Geyer (1994) in the GAN talks of Veronika Rockova and Éric Moulines. Victor  Elvira delivered an enthusiastic talk on our massively recycled importance on-going project that we need to complete asap!

While their earlier arXived paper was on my reading list, I was quite excited by Nicolas Chopin’s (along with Mathieu Gerber) work on some quadrature stabilisation that is not QMC (but not too far either), with stratification over the unit cube (after a possible reparameterisation) requiring more evaluations, plus a sort of pulled-by-its-own-bootstrap control variate, but beating regular Monte Carlo in terms of convergence rate and practical precision (if accepting a large simulation budget from the start). A difficulty common to all (?) stratification proposals is that it does not readily applies to highly concentrated functions.

I chaired the lightning talks session, which were 3mn one-slide snapshots about some incoming posters selected by the scientific committee. While I appreciated the entry into the poster session, the more because it was quite crowded and busy, if full of interesting results, and enjoyed the slide solely made of “0.234”, I regret that not all poster presenters were not given the same opportunity (although I am unclear about which format would have permitted this) and that it did not attract more attendees as it took place in parallel with other sessions.

In a not-solely-ABC session, I appreciated Sirio Legramanti speaking on comparing different distance measures via Rademacher complexity, highlighting that some distances are not robust, incl. for instance some (all?) Wasserstein distances that are not defined for heavy tailed distributions like the Cauchy distribution. And using the mean as a summary statistic in such heavy tail settings comes as an issue, since the distance between simulated and observed means does not decrease in variance with the sample size, with the practical difficulty that the problem is hard to detect on real (misspecified) data since the true distribution behing (if any) is unknown. Would that imply that only intrinsic distances like maximum mean discrepancy or Kolmogorov-Smirnov are the only reasonable choices in misspecified settings?! While, in the ABC session, Jeremiah went back to this role of distances for generalised Bayesian inference, replacing likelihood by scoring rule, and requirement for Monte Carlo approximation (but is approximating an approximation that a terrible thing?!). I also discussed briefly with Alejandra Avalos on her use of pseudo-likelihoods in Ising models, which, while not the original model, is nonetheless a model and therefore to taken as such rather than as approximation.

I also enjoyed Gregor Kastner’s work on Bayesian prediction for a city (Milano) planning agent-based model relying on cell phone activities, which reminded me at a superficial level of a similar exploitation of cell usage in an attraction park in Singapore Steve Fienberg told me about during his last sabbatical in Paris.

In conclusion, an exciting meeting that should have stretched a whole week (or taken place in a less congenial environment!). The call for organising BayesComp 2025 is still open, by the way.

 

%d bloggers like this: