## Archive for Monte Carlo methods

## Rao-Blackwellisation, a review in the making

Posted in Statistics with tags Andrei Kolmogorov, birthday, C.R. Rao, computational statistics, David Blackwell, Monte Carlo methods, Purdue University, Rao-Blackwell theorem, Rao-Blackwellisation, review, survey on March 17, 2020 by xi'an

**R**ecently, I have been contacted by a mainstream statistics journal to write a review of Rao-Blackwellisation techniques in computational statistics, in connection with an issue celebrating C.R. Rao’s 100th birthday. As many techniques can be interpreted as weak forms of Rao-Blackwellisation, e.g., all auxiliary variable approaches, I am clearly facing an embarrassment of riches and would thus welcome suggestions from Og’s readers on the major advances in Monte Carlo methods that can be connected with the Rao-Blackwell-Kolmogorov theorem. (On the personal and anecdotal side, I only met C.R. Rao once, in 1988, when he came for a seminar at Purdue University, where I was spending the year.)

## MCqMC2020 key dates

Posted in pictures, Statistics, Travel, University life with tags Britain, Eagle and Child, MCqMC2020, Monte Carlo methods, Oxford, Oxford Mathematics, quasi-Monte Carlo methods, Saint Giles cemetery, scientific computing, University of Oxford on January 23, 2020 by xi'an

**A** reminder of the key dates for the incoming MCqMC2020 conference this summer in Oxford:

Feb 28, Special sessions/minisymposia submission

Mar 13, Contributed abstracts submission

Mar 27, Acceptance notification

Mar 27, Registration starts

May 8, End of early bird registration

June 12, Speaker registration deadline

Aug 9-14, Conference

and of the list of plenary speakers

Yves Atchadé (Boston University)

Jing Dong (Columbia University)

Pierre L’Ecuyer (Université de Montréal)

Mark Jerrum (Queen Mary University London)

Gerhard Larcher (JKU Linz)

Thomas Müller (NVIDIA)

David Pfau (Google DeepMind)

Claudia Schillings (University of Mannheim)

Mario Ullrich (JKU Linz)

## Couplings and Monte Carlo [advanced graduate course at Dauphine by Pierre Jacob]

Posted in Kids, pictures, Statistics, Travel with tags coupling, graduate course, Monte Carlo methods, optimal transport, Paris, Pierre Jacob, Université Paris Dauphine on January 20, 2020 by xi'an

**A**s a visiting professor at Paris-Dauphine next month, Pierre Jacob will give a series of lectures on coupling and Monte Carlo, on Feb. 13, 14, 25, and 27, at Université Paris-Dauphine, the first two starting at 8:30 (room E) and the last two starting at 13:45 (rooms F and D201, respectively). Attendance is open to all and material will be made available on the lecture webpage.

## AABI9 tidbits [& misbits]

Posted in Books, Mountains, pictures, Statistics, Travel, University life with tags British Columbia, decompression, generative model, image rendering, Langevin MCMC algorithm, maximum entropy, Monte Carlo methods, NeurIPS 2019, optimisation, Python, round table, stochastic gradient, Vancouver, variational Bayes methods on December 10, 2019 by xi'an

**T**oday’s Advances in Approximate Bayesian Inference symposium, organised by Thang Bui, Adji Bousso Dieng, Dawen Liang, Francisco Ruiz, and Cheng Zhang, took place in front of Vancouver Harbour (and the tantalising ski slope at the back) and saw more than 400 participants, drifting away from the earlier versions, which had a stronger dose of ABC and far fewer participants. There was a fair proportion of students’ talks as well (and a massive number of posters). As reported below, I took some notes during some of the talks, with no pretence of exhaustivity, objectivity or accuracy. (This is a blog post, remember?!) Overall I found the day exciting (to the point that I did not suffer at all from the usual naps consecutive to very short nights!) and engaging, with a lot of notions and methods I had never heard about. (Which shows how much I don’t know!)

The first talk was by Michalis Titsias, *Gradient-based Adaptive Markov Chain Monte Carlo* (joint with Petros Dellaportas), with an objective function that multiplies the variance of the move by the acceptance probability, and a proposed adaptive version merging gradients, variational Bayes, neural networks, and two levels of calibration parameters. The method advocates using this construction in a burn-in phase rather than continuously, hence does not require advanced Markov tools for convergence assessment. (I found myself less excited by adaptation than earlier, maybe because it seems like switching one convergence problem for another, with additional design choices to be made.)

The second talk was by Jakub Swiatkowski, *The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks*, involving mean-field approximation in variational inference (loads of VI at this symposium!), meaning *de facto* searching for a MAP estimator, and reminding me of older factor analysis and other *analyse de données* projection methods, except that it also involved neural networks (what else at NeurIPS?!).

The third talk was by Michael Gutmann, *Robust Optimisation Monte Carlo* (OMC), for implicitly generated data models (Diggle & Gratton, 1984), an ABC talk at last!, using a formalisation through the functional representation of the generative process and involving derivatives of the summary statistic with respect to the parameter. In that sense, the (Bayesian) random nature of the parameter sample is only induced by the (frequentist) randomness in the generative transform, since a new parameter “realisation” is obtained as the one providing minimal distance between data and pseudo-data, with no uncertainty or impact of the prior. The Jacobian of this summary transform (and once again a neural network is used to construct the summary) appears in the importance weight, leading to OMC being unstable, beyond failing to reproduce the variability expressed by the regular posterior or even the ABC posterior. It took me a while to wonder `where is Wally?!’ (the prior), as it only appears in the importance weight.
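To fix ideas on burn-in-only adaptation, here is a much-simplified sketch of the general principle (maximise the product of acceptance probability and squared jump distance during burn-in, then freeze the proposal), and emphatically *not* the gradient/variational scheme of Titsias and Dellaportas; the target, the grid of scales, and the run lengths are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # illustrative stand-in target: standard normal, up to a constant
    return -0.5 * x**2

def esjd(scale, n=2000, x0=0.0):
    """Estimate the expected squared jump distance E[alpha * (x' - x)^2]
    of a random-walk Metropolis kernel with the given proposal scale."""
    x, lp = x0, log_target(x0)
    total = 0.0
    for _ in range(n):
        prop = x + scale * rng.normal()
        lq = log_target(prop)
        alpha = min(1.0, np.exp(lq - lp))
        total += alpha * (prop - x) ** 2
        if rng.random() < alpha:
            x, lp = prop, lq
    return total / n

# "burn-in" phase: pick the scale maximising the ESJD criterion
scales = [0.1, 0.5, 1.0, 2.4, 5.0, 10.0]
best = max(scales, key=esjd)

# main phase: plain fixed-scale Metropolis with the selected scale,
# so no adaptive-MCMC convergence theory is needed afterwards
x, lp = 0.0, log_target(0.0)
samples = []
for _ in range(20_000):
    prop = x + best * rng.normal()
    lq = log_target(prop)
    if np.log(rng.random()) < lq - lp:
        x, lp = prop, lq
    samples.append(x)
samples = np.array(samples)
```

The point of the sketch is only the two-phase structure: once adaptation is confined to burn-in, the main chain is an ordinary Markov chain.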

The fourth talk was by Sergey Levine, *Reinforcement Learning, Optimal Control, and Probabilistic Inference*, back to Kullback-Leibler as the objective function, with linkage to optimal control (with distributions as actions?), plus again variational inference, producing an approximation in sequential settings. This sounded like a return of sorts of the MaxEnt prior, but the talk pace was so intense that I could not follow where the innovations stood.

The fifth talk was by Iuliia Molchanova, on *Structured Semi-Implicit Variational Inference*, from Bayesgroup.ru (I did not know of a Bayesian group in Russia, as I was under the impression that Bayesian statistics was under-represented there, but apparently the situation is quite different in machine learning). The talk brought up the interesting concept of semi-implicit variational inference, exploiting some form of latent variables as far as I can understand, using mixtures of Gaussians.

The sixth talk was by Rianne van den Berg, *Normalizing Flows for Discrete Data*, and amounted to covering three papers also discussed in NeurIPS 2019 proper, which I found a somewhat suboptimal approach to an invited talk, as it turned into a teaser for following talks or posters. But the teasers it contained were quite interesting, as they covered normalising flows as integer-valued controlled changes of variables using neural networks, about which I had just become aware during the poster session, in connection with papers of Papamakarios et al., which I need to read soon.

The seventh talk was by Matthew Hoffman, *Langevin Dynamics as Nonparametric Variational Inference*, which sounded most interesting, both from its title and from later reports, as it bridged Langevin dynamics with VI, but alas I missed it for being “stuck” in a tea-house ceremony that lasted much longer than expected. (More later on that side issue!)

After the second poster session (with a highly original proposal by Radford Neal towards creating non-reversibility at the level of the uniform generator rather than later on), I thus only attended Emily Fox’s *Stochastic Gradient MCMC for Sequential Data Sources*, which superbly reviewed (in connection with a sequence of papers, including a recent one by Aicher et al.) error rates and convergence properties of stochastic gradient estimator methods. Another paper I need to read soon!
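For readers unfamiliar with the family of methods the talk surveyed, here is a generic stochastic gradient Langevin dynamics (SGLD) sketch on a toy conjugate model, not the buffered or sequential-data variants discussed by Fox; the model, prior scale, batch size, and step size are all illustrative choices of mine:

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic data: x_i ~ N(theta_true, 1), with a vague N(0, 10^2) prior on theta
theta_true = 2.0
data = rng.normal(theta_true, 1.0, size=1000)

def grad_log_post(theta, batch, n_total):
    """Unbiased stochastic estimate of the log-posterior gradient,
    rescaling the minibatch likelihood gradient by n_total / len(batch)."""
    prior_grad = -theta / 10.0**2
    lik_grad = (n_total / len(batch)) * np.sum(batch - theta)
    return prior_grad + lik_grad

# SGLD update: theta += (eps/2) * grad + N(0, eps) injected noise,
# here with a small constant step size for simplicity
eps, theta, chain = 1e-4, 0.0, []
for _ in range(5000):
    batch = rng.choice(data, size=50, replace=False)
    theta += 0.5 * eps * grad_log_post(theta, batch, len(data)) \
             + np.sqrt(eps) * rng.normal()
    chain.append(theta)

post_mean = np.mean(chain[1000:])  # discard warm-up iterations
```

With a constant step size the chain only targets the posterior approximately (the error-rate analyses reviewed in the talk quantify exactly this kind of bias), but on this conjugate toy the posterior mean is recovered closely.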

The penultimate speaker, Roman Novak, presented a Python library for infinite neural networks, with which I had no direct connection (and I always have difficulties with talks about libraries, even without a four-hour night of sleep), and the symposium concluded with a mild round table. Mild because, despite Frank Wood’s best efforts (and healthy skepticism about round tables!) to initiate controversies, we could not find much to bite from each other’s viewpoints.

## what if what???

Posted in Books, Statistics with tags Markov chain Monte Carlo algorithm, MCMC, Monte Carlo integration, Monte Carlo methods, what if?, wikipedia on October 7, 2019 by xi'an

*[Here is a section of the Wikipedia page on Monte Carlo methods which makes little sense to me. What if it was not part of this page?!]*

## Monte Carlo simulation versus “what if” scenarios

There are ways of using probabilities that are definitely not Monte Carlo simulations – for example, deterministic modeling using single-point estimates. Each uncertain variable within a model is assigned a “best guess” estimate. Scenarios (such as best, worst, or most likely case) for each input variable are chosen and the results recorded.^{[55]}

By contrast, Monte Carlo simulations sample from a probability distribution for each variable to produce hundreds or thousands of possible outcomes. The results are analyzed to get probabilities of different outcomes occurring.^{[56]}

For example, a comparison of a spreadsheet cost construction model run using traditional “what if” scenarios, and then running the comparison again with Monte Carlo simulation and triangular probability distributions shows that the Monte Carlo analysis has a narrower range than the “what if” analysis. This is because the “what if” analysis gives equal weight to all scenarios (see quantifying uncertainty in corporate finance), while the Monte Carlo method hardly samples in the very low probability regions. The samples in such regions are called “rare events”.
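For what it is worth, the comparison the Wikipedia excerpt describes takes a few lines to reproduce; the cost items and their triangular parameters below are entirely made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# hypothetical cost items as (low, most likely, high) estimates, in k€
items = [(10, 12, 20), (5, 8, 9), (20, 25, 40)]

# "what if" analysis: add up the single-point scenarios
best = sum(lo for lo, m, hi in items)
most_likely = sum(m for lo, m, hi in items)
worst = sum(hi for lo, m, hi in items)

# Monte Carlo: sample each item from a triangular distribution and sum
n = 100_000
totals = sum(rng.triangular(lo, m, hi, size=n) for lo, m, hi in items)

# central 98% interval of the simulated totals: narrower than [best, worst]
lo98, hi98 = np.quantile(totals, [0.01, 0.99])
```

The simulated interval sits strictly inside the what-if range precisely because the extreme totals require every item to hit its extreme simultaneously, which the triangular sampling almost never does.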

## call for sessions and labs at Bay2sC0mp²⁰

Posted in pictures, R, Statistics, Travel, University life with tags BayesComp, call for contributions, Florida, Gainesville, ISBA, last call, machine learning, MCMC, MCMSki, Monte Carlo methods, poster session, R, software, STAN, statistical language, University of Florida, untractable normalizing constant, USA on February 22, 2019 by xi'an

**A** call to all potential participants in the incoming BayesComp 2020 conference at the University of Florida in Gainesville, Florida, 7-10 January 2020, to submit proposals [to me] for contributed sessions on everything computational, *or* for training labs [to David Rossell] on a specific language or software. The deadline is **April 1** and the sessions will be selected by the scientific committee, with other proposals being offered the possibility to present the associated research during a poster session [which is always a lively component of the conference]. (Conversely, we reserve the possibility of a “last call” session made from particularly exciting posters on new topics.) Plenary speakers for this conference are

- David Blei (Columbia University)
- Paul Fearnhead (University of Lancaster)
- Emily Fox (University of Washington)
- Max Welling (University of Amsterdam)

and the first invited sessions are already posted on the webpage of the conference. We dearly hope to attract a wide range of research interests into as diverse a program as possible, so please accept this invitation!!!

## independent random sampling methods [book review]

Posted in Books, Statistics, University life with tags book review, inverse cdf, MCMC, Monte Carlo methods, Monte Carlo Statistical Methods, multiple try Metropolis, Non-Uniform Random Variate Generation, PRNG, random number generation, ratio of uniform algorithm, simulation, Springer-Verlag, Universidad Carlos III de Madrid, vertical density representation on May 16, 2018 by xi'an

**L**ast week, I had the pleasant surprise of receiving a copy of this book in the mail, a book I was not aware had been written or published (meaning that I was not involved in its review!). The three authors of Independent Random Sampling Methods, Luca Martino, David Luengo, and Joaquín Míguez, are from Madrid universities, and I have read (and posted on) several of their papers on (population) Monte Carlo simulation in recent years, including Luca’s survey of multiple try MCMC, which was helpful in writing our own WIREs survey.

The book is a pedagogical coverage of most algorithms used to simulate independent samples from a given distribution, which of course overlaps with some of the techniques exposed in more detail by [another] Luc, namely Luc Devroye’s Non-uniform random variate generation bible, often mentioned here (and studied in utmost detail by a dedicated reading group in Warwick). It includes a whole chapter on accept-reject methods, with in particular a section on Payne-Dagpunar’s band rejection, which I had not seen previously, and another entire chapter on ratio-of-uniforms techniques, on which the three authors had proposed generalisations [covered by the book] years before I attempted to go the same way, having completely forgotten reading their paper at the time… Or the much earlier 1991 paper by Jon Wakefield, Alan Gelfand and Adrian Smith!
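As a reminder of how the basic (non-generalised) ratio-of-uniforms method operates, here is a minimal sketch of my own for a standard normal target, using the unnormalised density f(x) = exp(-x²/2): draw (u, v) uniformly over a bounding box, accept when 0 < u ≤ √f(v/u), and return x = v/u.

```python
import numpy as np

rng = np.random.default_rng(3)

def rou_normal(n):
    """Ratio-of-uniforms sampler for the standard normal.

    With f(x) = exp(-x^2 / 2), the acceptance region
    {(u, v): 0 < u <= sqrt(f(v/u))} is contained in the box
    (0, 1] x [-sqrt(2/e), sqrt(2/e)]."""
    out = np.empty(0)
    vmax = np.sqrt(2 / np.e)
    while out.size < n:
        u = 1 - rng.random(size=2 * n)          # uniform on (0, 1], avoids u = 0
        v = rng.uniform(-vmax, vmax, size=2 * n)
        x = v / u
        accept = x**2 <= -4 * np.log(u)         # i.e. u^2 <= exp(-x^2 / 2)
        out = np.concatenate([out, x[accept]])
    return out[:n]

samples = rou_normal(50_000)
```

The acceptance rate here is about 73% (area of the region over area of the box), which is what the generalisations mentioned above try to improve by reshaping the region.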

The book also covers the “vertical density representation”, due to Troutt (1991), which consists in considering the density p(.) of the random variable X, evaluated at that variable, as a random variable p(X) in its own right. I remember pondering about this alternative to the cdf transform and giving up on it, as the outcome has a distribution that depends on p, even when the density is monotone. And I am not certain from reading the section that it is particularly appealing…
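To make the dependence on p concrete, here is a quick toy simulation of mine (not from the book) contrasting the distribution of p(X) for two targets, against the universal uniformity of the cdf transform:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

# Gaussian case: X ~ N(0, 1), evaluate its own density at X
x = rng.normal(size=n)
p_x = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

# Exponential case: Y ~ Exp(1), with density exp(-y) on (0, inf)
y = rng.exponential(size=n)
p_y = np.exp(-y)

# Unlike F(X) ~ U(0, 1) for any continuous X, the law of p(X) depends on p:
# for Exp(1), p(Y) = exp(-Y) happens to be exactly U(0, 1), while for the
# Gaussian, p(X) lives on (0, 1/sqrt(2*pi)] with mean 1/(2*sqrt(pi))
```

Even in the exponential case the uniformity is a coincidence of the unit rate: for Exp(λ), p(Y) is uniform on (0, λ), so the distribution of p(X) still depends on p, as noted above.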

Given its title, the book contains very little about MCMC, except for a final chapter covering adaptive independent Metropolis-Hastings algorithms, in connection with some of the authors’ recent work, like multiple try Metropolis, and relating to the (unidimensional) ARMS “ancestor” of adaptive MCMC methods. (As noted in a recent blog post on Holden et al., 2009, I have trouble understanding how recycling only rejected proposed values to build a better proposal distribution is enough to guarantee convergence of an adaptive algorithm, but the book does not delve much into this convergence issue.)
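For reference, the non-adaptive starting point of that chapter is the plain independent Metropolis-Hastings algorithm, where proposals are drawn from a fixed distribution q rather than around the current state; a minimal sketch on an illustrative Gaussian target of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(5)

def log_target(x):
    # unnormalised target: standard normal
    return -0.5 * x**2

def log_prop(x):
    # fixed independent proposal: N(0, 2^2), deliberately wider than the target
    return -0.5 * (x / 2.0) ** 2

x = 0.0
w = log_target(x) - log_prop(x)   # current log importance weight log(pi/q)
chain = []
for _ in range(50_000):
    prop = 2.0 * rng.normal()     # drawn from q, independently of the chain
    w_prop = log_target(prop) - log_prop(prop)
    # acceptance ratio reduces to the ratio of importance weights w(y)/w(x)
    if np.log(rng.random()) < w_prop - w:
        x, w = prop, w_prop
    chain.append(x)
chain = np.array(chain)
```

The adaptive versions discussed in the book then modify q on the fly (e.g., from rejected proposals), which is exactly where the convergence question raised above arises: with a fixed q whose tails dominate the target, the chain is uniformly ergodic, but changing q breaks the Markov property.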

All in all, and despite the bias induced by my working in the very same area, I find the book quite a nice entry on the topic, which can be used in a Monte Carlo course at both undergraduate and graduate levels if one wants to avoid going into Markov chains. It is certainly less likely to scare students away than the comprehensive Non-uniform random variate generation and may on the contrary induce some of them to pursue a research career in this domain.