## Archive for mathematical statistics

## the mysterious disappearance of the Leiden statistics group

Posted in Books, pictures, Statistics, University life with tags Bayesian asymptotics, Bayesian nonparametrics, ERC, Leiden, mathematical statistics, the Netherlands, TU Delft, Universiteit Leiden on July 14, 2021 by xi'an

**I** was forwarded an article from Mare, the journal of the University of Leiden (Universiteit Leiden), a weekly newspaper written by an independent team of professional journalists. Entitled *“Fraude, verdwenen evaluaties en een verziekt klimaat: hoe de beste statistiekgroep van Nederland uiteenviel”* (Fraud, lost evaluations and a sickening climate: how the best statistics group in the Netherlands fell apart), it tells (through Google Translate) the appalling story of how an investigation into mishandled student course evaluations led to the disintegration of the world-renowned Leiden statistics group, with the departure of a large fraction of its members, including its head, Aad van der Vaart, a giant in mathematical statistics, author of deep reference books like Asymptotic Statistics and Fundamentals of Nonparametric Bayesian Inference, an ERC advanced grant recipient, and now professor at TU Delft… While I am not at all acquainted with the specifics, the article makes the chain of events sound like chaos propagation: the suspicious disappearance of student evaluation forms for a statistics course leads to a re-evaluation round, itself put under scrutiny by the University, then to a freeze on prospective statistician appointments by the (pure math) successor of Aad, as well as increasing harassment of the statisticians in the Mathematisch Instituut, and eventually to the exile of most of them. Wat een verspilling! [What a waste!]

## mathematical theory of Bayesian statistics [book review]

Posted in Books, Statistics, Travel, University life with tags Bayesian statistics, book reviews, CHANCE, mathematical statistics on May 6, 2021 by xi'an

**I** came by chance (and not by CHANCE) upon this 2018 CRC Press book by Sumio Watanabe and ordered it myself to gather which material it really covered, as the back-cover blurb was not particularly clear and the title sounded quite general. After reading it, I found out that this is a mathematical treatise on some aspects of Bayesian information criteria, in particular on the Widely Applicable Information Criterion (WAIC), which the author introduced in 2010. The result is a rather technical and highly focussed book, with little motivation or intuition surrounding the mathematical results, which may make the reading arduous. Some background in mathematical statistics and Bayesian inference is clearly preferable, and the book cannot be used as a textbook for most audiences, as opposed to, e.g., An Introduction to Bayesian Analysis by J.K. Ghosh et al., or even more so Principles of Uncertainty by J. Kadane. In connection with this remark, the exercises found in the book are closer to deliveries of additional material than to textbook-style exercises.

“posterior distributions are often far from any normal distribution, showing that Bayesian estimation gives the more accurate inference than other estimation methods.”

The overall setting is one where both the sampling and the prior distributions differ from their respective “true” distributions, requiring a tool to assess the discrepancy when utilising a specific pair of such distributions, especially when the posterior distribution cannot be approximated by a Normal distribution. (Lindley’s paradox makes an interesting *incognito* incursion on p.238.) The WAIC is supported for the determination of the “true” model, in opposition to AIC and DIC, including on a mixture example that reminded me of our eight versions of DIC paper. In the “Basic Bayesian Theory” chapter (§3), the “basic theorem of Bayesian statistics” (p.85) states that the various losses related with WAIC can be expressed as second-order Taylor expansions of some cumulant generating functions, with order o(n⁻¹), “even if the posterior distribution cannot be approximated by any normal distribution” (p.87), with the intuition that

“if a log density ratio function has a relatively finite variance then the generalization loss, the cross validation loss, the training loss and WAIC have the same asymptotic behaviors.”

Obviously, these “basic” aspects should come as a surprise to a fair percentage of Bayesians (in the sense of not being particularly *basic*), myself included. Chapter 4 exposes why, for regular models, the posterior distribution accumulates in an ε neighbourhood of the optimal parameter at a speed O(n^{2/5}), with the normalised partition function being of order n^{-d/2} in the neighbourhood and exponentially negligible outside. A consequence of this regular asymptotic theory is that all above losses are asymptotically equivalent to the negative log-likelihood, plus similar order-n⁻¹ terms that can be ranked. Chapters 5 and 6 deal with “standard” posterior distributions [for which the likelihood ratio is a multi-index power of the parameter ω] and with general posterior distributions that can be written as mixtures of standard distributions, with expressions of the above losses in terms of new universal constants. Again, a rather remote concern of mine. The book also includes a chapter (§7) on MCMC, with a rather involved proof that the Metropolis algorithm satisfies detailed balance (p.210). The Gibbs sampling section contains an extensive example on a two-dimensional, two-component, unit-variance Normal mixture, with an unusual perspective on the posterior, which is considered “singular” when the true means are close. (Label switching, or the absence thereof, is not mentioned.) In terms of approximating the normalising constant (or free energy), the only method discussed there is path sampling, with a cryptic remark about harmonic mean estimators (not identified as such). In a final knapsack chapter (§9), Bayes factors (confusingly denoted L(x)) are shown to be most powerful tests in a Bayesian sense when comparing hypotheses without prior weights on said hypotheses, while posterior probability ratios are the natural statistics for comparing models with prior weights on said models. (With Lindley’s paradox making another appearance, still *incognito*!) And a notion of *phase transition* for hyperparameters is introduced, meaning a radical change of behaviour at a critical value of said hyperparameter. For instance, for a simple Normal mixture outlier model, the critical value of the Beta hyperparameter is α=2, which is a wee bit of a surprise when considering Rousseau and Mengersen (2011), since their bound for consistency was α=d/2.
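Since the losses discussed above may remain abstract, here is a minimal numerical sketch (my own toy illustration, not taken from the book) of how WAIC is computed in practice from an S×n matrix of pointwise posterior log-likelihoods, for a hypothetical Normal-mean model with known unit variance; all variable names and the toy data are assumptions of mine:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data and a toy "posterior": S draws of a Normal mean (unit variance known).
y = rng.normal(0.3, 1.0, size=50)                                # observations
theta = rng.normal(y.mean(), 1 / np.sqrt(len(y)), size=2000)     # posterior draws

# Pointwise log-likelihood matrix: S posterior draws x n observations.
loglik = -0.5 * np.log(2 * np.pi) - 0.5 * (y[None, :] - theta[:, None]) ** 2

# lppd: log pointwise predictive density, averaging likelihoods over draws.
lppd = np.sum(np.log(np.mean(np.exp(loglik), axis=0)))

# p_WAIC: effective number of parameters, via posterior variances of log-lik.
p_waic = np.sum(np.var(loglik, axis=0, ddof=1))

# WAIC on the deviance scale; here p_waic should sit near 1 (one free mean).
waic = -2 * (lppd - p_waic)
print(p_waic, waic)
```

The same matrix of pointwise log-likelihoods also feeds the leave-one-out cross-validation loss, which is why the quote above can claim the two share asymptotic behaviour when the log density ratio has a relatively finite variance.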

In conclusion, this is quite an original perspective on Bayesian models, covering the somewhat unusual (and potentially controversial) issue of misspecified priors and centred on the use of information criteria. I find the book could have benefited from further editing, as I noticed many typos and somewhat unusual sentences (at least unusual to me).

*[Disclaimer about potential self-plagiarism: this post or an edited version should eventually appear in my Books Review section in CHANCE.]*

## factorisation theorem on densities

Posted in Statistics with tags cross validated, dominating measure, exponential families, factorisation, final exam, mathematical statistics, sufficient statistics on December 23, 2020 by xi'an

Another occurrence, while building my final math-stat exam for my (quarantined!) third-year students, of a question on X validated that led me to write down more precisely an argument for the decomposition of densities in exponential families. The decomposition is somewhat moot *(and was lost on the initiator of the question, since this person later posted an answer ignoring measures)*, as it all depends on the choice of the dominating measures over X, T(X), and the slices {x; T(x)=t}. The fact that the slice depends on t requires the measure to allow a potential dependence on t, in which case the conditional density wrt this measure can as well be constant.
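For context, the factorisation at stake is the Fisher–Neyman theorem; a loose textbook statement (my own rendering, not a quote from the exam or the X validated thread), together with its exponential-family special case:

```latex
% Fisher–Neyman: T is sufficient for \theta iff the density of X wrt a
% dominating measure \mu factorises as
f_\theta(x) = g_\theta\{T(x)\}\, h(x), \qquad h \text{ free of } \theta,
% which for a natural exponential family specialises to
f_\theta(x) = \exp\{\theta^{\top} T(x) - \psi(\theta)\}\, h(x).
```

The measure-theoretic subtlety in the post is precisely that the conditional density of x given T(x)=t is only defined wrt a dominating measure on the slice {x; T(x)=t}, which may itself depend on t.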

## sans sérif & sans chevron

Posted in Books, R, Statistics, University life with tags bicycle, book publishing, chevron, final exam, LaTeX, mathematical statistics, multiple answer test, R, R code, sans-sérif, simulation, vélo, Zoom on June 17, 2020 by xi'an

```r
df = function(x) 2*pi*x - 4*(x>1)*acos(1/(x+(1-x)*(x<1)))
```

**A**s I was LaTeXing a remote exam for next week, including some R code questions, I came across the apparent impossibility of using the < and > symbols in the sans-sérif “\sf” font… Which is a surprise, given the ubiquity of these symbols in R and in my LaTeXing of books over the years. I must have always used “\tt” and “\verb” then! On the side, I tried to work with the automultiplechoice LaTeX package [which should be renamed velomultiplechoice!] of Alexis Bienvenüe, which proved a bit of a challenge as the downloadable version contained a flawed automultiplechoice.sty file! I still managed to produce a 400-question exam with random permutations of the questions and potential answers. But I am not looking forward to the 4 or 5 hours of delivering the test over Zoom…
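A plausible explanation (my own guess, not stated in the post) is that under LaTeX's default OT1 font encoding the text glyphs for < and > are absent from non-typewriter fonts, printing as inverted punctuation instead; a sketch of a workaround:

```latex
% Sketch: under the default OT1 encoding, {\sf <} does not print a "<".
% Loading the T1 encoding makes the glyphs available in text fonts.
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
{\sf x < 1 and y > 0}  % prints correctly once T1 is loaded
\texttt{x < 1}         % typewriter font works even without T1
\end{document}
```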

## PhD position for research in ABC in Chalmers University

Posted in Statistics with tags ABC, Approximate Bayesian computation, Chalmers University, Gothenburg, likelihood-free methods, mathematical statistics, PhD position, simulation-based inference, Sweden, vacancy on May 27, 2020 by xi'an

*[Posting a call for PhD candidates from Umberto Picchini, as the deadline is June 1, next Monday!]*

A PhD student position in mathematical statistics on simulation-based inference methods for models with an “intractable” likelihood is available at the Dept. Mathematical Sciences, Chalmers University, Gothenburg (Sweden).

You will be part of an international collaboration to create new methodology bridging simulation-based inference (such as approximate Bayesian computation and other likelihood-free methods) and deep neural networks. The goal is to ease inference for stochastic modelling.

Details on the project and the essential requirements are at https://www.chalmers.se/en/departments/math/research/research-groups/AIMS/Pages/ai-project-5.aspx

The PhD student position is fully funded, for up to 5 years, in the dynamic and international city of Gothenburg, the second largest city in Sweden (https://www.goteborg.com/en/). As a PhD student in Mathematical Sciences, you will have opportunities for many inspiring conversations, a lot of autonomous work, and some travel.

The position will be supervised by Assoc. Prof. Umberto Picchini.

Apply by **01 June 2020** following the instructions at

https://www.chalmers.se/en/about-chalmers/Working-at-Chalmers/Vacancies/Pages/default.aspx?rmpage=job&rmjob=8556

For informal enquiries, please get in touch with Umberto Picchini.