## Introduction à la Statistique, by Cédric Villani

On Tuesday, there was a series of talks (in French) celebrating Statistics, with an introduction by Cédric Villani. (The talks are reproduced on the French Statistical Society (SFDS) webpage.) Rather unpredictably (!), Villani starts from an early 20th-century physics experiment leading to the estimation of the Avogadro constant from a series of integers. (Repeating an earlier confusion of his, he conflates the probability of observing a rare event under the null with the probability that the alternative about the Higgs boson is true!) A special mention of Francis Galton’s “supreme law of unreason”. And of surveys, pointing out the wide variability of a result for standard survey populations. But missing the averaging (and more statistical) effect of accumulating surveys, a principle at the core of Nate Silver’s predictions. A few words again about the Séralini et al. experiments on Monsanto's genetically modified maize NK603, attacked for their lack of statistical foundations. And then, hear hear!, much more than a mere mention of phylogenetic inference, with explanations about inverse inference, Markov chain Monte Carlo algorithms on trees, the convergence of Metropolis algorithms as studied by Persi Diaconis, and Bayesian computations! Of course, this could be seen more as numerical probability than as true statistics, but it is still pleasant to hear.

The last part of the talk more predictably links Villani’s own field of optimal transportation (which I would translate as a copula problem…) and statistics, mostly understood as empirical distributions. I find it somewhat funny that Sanov’s theorem is deemed therein to be a (or even the) Statistics theorem! I wonder how many statisticians could state this theorem… The same remark applies to the Donsker-Varadhan theory of large deviations. Still, the very final inequality linking the three types of information concepts is just… beautiful! You may spot in the last minute a micro-confusion, when he repeats the definition of Fisher’s information twice rather than deducing that the information associated with a location family is constant. (And a not-so-necessary mention of the Cramér-Rao bound on unbiased estimators, which could have been quoted as the Fréchet-Darmois-Cramér-Rao bound on such historical grounds.) A pleasant moment, all in all! (There are five other talks on that page, including one by Emmanuel Candès.)
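
For the record, the deduction about location families mentioned above is a one-line change of variable. A quick sketch, in notation of my own choosing rather than the one used in the talk, assuming a location family with density $f(x-\theta)$, where $f$ is positive and smooth enough to differentiate under the integral:

$$
I(\theta) = \int \left(\frac{\partial}{\partial\theta}\log f(x-\theta)\right)^2 f(x-\theta)\,\mathrm{d}x
= \int \frac{f'(x-\theta)^2}{f(x-\theta)}\,\mathrm{d}x
= \int \frac{f'(u)^2}{f(u)}\,\mathrm{d}u ,
$$

using $u = x-\theta$ in the last step, so the information does not depend on $\theta$. For instance, in the normal $\mathcal{N}(\theta,\sigma^2)$ family with known $\sigma$, this gives $I(\theta)=1/\sigma^2$ for every $\theta$.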
