Archive for CREST

Introduction to Sequential Monte Carlo [book review]

Posted in Books, Statistics on June 8, 2021 by xi'an

[Warning: Due to many CoI, from Nicolas being a former PhD student of mine, to his being a current colleague at CREST, to Omiros being co-deputy-editor for Biometrika, this review will not be part of my CHANCE book reviews.]

My friends Nicolas Chopin and Omiros Papaspiliopoulos wrote in 2020 An Introduction to Sequential Monte Carlo (Springer), a book that took several years to complete and which I find remarkably coherent in its unified presentation. Particle filters and more broadly sequential Monte Carlo have expanded considerably in the last 25 years and I find it difficult to keep track of the main advances given the expansive and heterogeneous literature. The book is also quite careful in its mathematical treatment of the concepts and, while the Feynman-Kac formalism is somewhat scary, it provides a careful introduction to the sampling techniques relating to state-space models and to their asymptotic validation. As an introduction it does not go to the same depths as Pierre Del Moral’s 2004 book or our 2005 book (Cappé et al.). But it also proposes a unified treatment of the most recent developments, including SMC² and ABC-SMC. There is even a chapter on sequential quasi-Monte Carlo, naturally connected to Mathieu Gerber’s and Nicolas Chopin’s 2015 Read Paper. Another significant feature is the articulation of the practical part around a massive Python package called particles [what else?!]. While the book is intended as a textbook, and has been used as such at ENSAE and in other places, there are only a few exercises per chapter and they are not necessarily manageable (such as Exercise 7.1, the only exercise in the very short Chapter 7). The style is highly pedagogical; take for instance Chapter 10 on the various particle filters, with a detailed and separate analysis of the input, algorithm, and output of each of these. Examples are only used strategically, when comparing methods or illustrating convergence. While the MCMC chapter (Chapter 15) is surprisingly small, it is actually an introduction to the massive chapter on particle MCMC (and a teaser for a forthcoming Papaspiliopoulos, Roberts and Tweedie, a slow-cooking dish that has now been baking for quite a while!).

approximate Bayesian inference [survey]

Posted in Statistics on May 3, 2021 by xi'an

In connection with the special issue of Entropy I mentioned a while ago, Pierre Alquier (formerly of CREST) has written an introduction to the topic of approximate Bayesian inference that is worth advertising (and freely available as well). Its reference list is particularly relevant. (The deadline for submissions is 21 June.)

[de]quarantined by slideshare

Posted in Books, pictures, Statistics, University life on January 11, 2021 by xi'an

A follow-up episode to the SlideShare m’a tuer [sic] saga: After the 20 November closure of my xianblog account and my request for an explanation, I was told by LinkedIn that a complaint had been made about one of my talks for violation of copyright. Most surprisingly, at least at first, it was about the slides for the graduate lectures I gave ten years ago at CREST on (re)reading Jaynes’ Probability Theory. While the slides contain a lot of short quotes from the Logic of Science, somewhat necessarily since I discuss the said book, there are also many quotes from Jeffreys’ Theory of Probability, and these quotes are “’tis but a scratch” on the contents of this lengthy book… Plus, the pdf file appears to be accessible on several sites, including one with an INRIA domain. Since I had to fill in a “Counter-Notice of Copyright Infringement” to unlock the rest of the repository, I just hope no legal action is going to be taken about this lecture. But I remain puzzled at the reasoning behind the complaint, unwilling to blame radical Jaynesians for it! As an aside, here are the registered 736 views of the slides for the past year:

democracy suffers when government statistics fail [review of a book review]

Posted in Books, Statistics, Travel on October 13, 2020 by xi'an

This week, rather extraordinarily, the Nature book review was about official statistics, with a review of Julia Lane’s Democratizing our Data. (The democratizing in the title is painful to watch, though!) The reviewer is Beth Simone Noveck, who was deputy chief technology officer under Barack Obama and is a major researcher in digital democracy, excusez du peu! (By comparison, Trump’s deputy chief technology officer had a B.A. in politics and no other qualification for the job, but nonetheless got promoted to chief…)

“Lane asserts that the United States is failing to adequately track its population, economy and society. Agencies are stagnating. The census dramatically undercounts people from minority racial groups. There is no complete national list of households. The data are made available two years after the count, making them out of date as the basis for effective policy making.” B.S. Noveck

The debate raised by the book on the ability of official statistics to keep track of people in a timely manner is most interesting. And not limited to the USA, even though it seems to fit in a Hell of its own:

“In the United States, there is no single national statistical agency. The process of gathering and publishing public data is fragmented across multiple departments and agencies, making it difficult to introduce new ideas across the whole enterprise. Each agency is funded by, and accountable to, a different congressional committee. Congress once sued the commerce department for attempting to introduce modern techniques of statistical sampling to shore up a flawed census process that involves counting every person by hand.” B.S. Noveck

This remark brings back to (my) mind the titanic debates of the 1990s when Republicans attacked sampling techniques and statisticians like Steve Fienberg rose to their defence. (Although others like David Freedman opposed the move, paradoxically mistrusting statistics!) The French official statistics institute, INSEE, has been running sampled census(es) for decades now, without the national representation going up in arms. I am certainly being partial, having been associated with INSEE, its statistics school ENSAE and its research branch CREST since 1982, but it seems to me that the hiring of highly skilled and thoroughly trained civil servants by this institute helps in making the statistics it produces more trustworthy and efficient, including in measuring the impact of public policies. (Even though accusations of delay and bias show up regularly.) And in making the institute more prone to adopt new methods, thanks to the rotation of its agents. (B.S. Noveck notices and deplores the absence of reference to foreign agencies in the book.)

“By contrast, the best private-sector companies produce data that are in real time, comprehensive, relevant, accessible and meaningful.” B.S. Noveck

However, the notion in the review (and the book?) that private companies are necessarily doing better is harder to buy, even if it makes for an easy jab at a public institution. Indeed, official statistics institutes are the only ones to have access to data covering the entire population, either directly or through other public institutes, like the IRS or social security claims. And trusting the few companies with a similar reach is beyond naïve (even though a company like Amazon has an almost instantaneous and highly local sensor of economic and social conditions!). And at odds with the call for democratizing, as shown by the impact of some of these companies on the US elections.

homeless hosted in my former office

Posted in pictures, Travel on September 24, 2020 by xi'an