Archive for ENSAE

analyse des données

Posted in Books, Kids on June 24, 2021 by xi'an

Introduction to Sequential Monte Carlo [book review]

Posted in Books, Statistics on June 8, 2021 by xi'an

[Warning: Due to many CoI, from Nicolas being a former PhD student of mine, to his being a current colleague at CREST, to Omiros being co-deputy-editor for Biometrika, this review will not be part of my CHANCE book reviews.]

My friends Nicolas Chopin and Omiros Papaspiliopoulos wrote in 2020 An Introduction to Sequential Monte Carlo (Springer), a book that took several years to complete and which I find remarkably coherent in its unified presentation. Particle filters and more broadly sequential Monte Carlo have expanded considerably in the last 25 years and I find it difficult to keep track of the main advances given the expansive and heterogeneous literature. The book is also quite careful in its mathematical treatment of the concepts and, while the Feynman-Kac formalism is somewhat scary, it provides a careful introduction to the sampling techniques relating to state-space models and to their asymptotic validation. As an introduction it does not go to the same depths as Pierre Del Moral’s 2004 book or our 2005 book (Cappé et al.). But it also proposes a unified treatment of the most recent developments, including SMC² and ABC-SMC. There is even a chapter on sequential quasi-Monte Carlo, naturally connected to Mathieu Gerber’s and Nicolas Chopin’s 2015 Read Paper. Another significant feature is the articulation of the practical part around a massive Python package called particles [what else?!]. While the book is intended as a textbook, and has been used as such at ENSAE and in other places, there are only a few exercises per chapter and they are not necessarily manageable (as with Exercise 7.1, the unique exercise for the very short Chapter 7). The style is highly pedagogical: take for instance Chapter 10 on the various particle filters, with a detailed and separate analysis of the input, algorithm, and output of each of these. Examples are only used strategically, when comparing methods or illustrating convergence. While the MCMC chapter (Chapter 15) is surprisingly small, it is actually an introduction to the massive chapter on particle MCMC (and a teaser for an incoming Papaspiliopoulos, Roberts and Tweedie, a slow-cooking dish that has now been baking for quite a while!).

Paris-Saclay campus debated in Nature

Posted in Books, University life on November 18, 2020 by xi'an

The newly created entity of the Paris-Saclay University is featured in two editorials of Nature of 03 November for reaching a high ranking in one of the many league tables purportedly summarising the academic achievements of universities by a single number. This entity is made of the much older Université d’Orsay and of aggregated research institutes like Ecole Normale (formerly) de Cachan, Institut d’Optique, or Centrale Supélec, incidentally and uninterestingly located near my home. As an aggregate of high quality institutions, it is thus little surprise that it achieves a sufficient critical mass to reach a high ranking. Were the nearby Institut Polytechnique de Paris integrated as well, the ranking would have been even higher. (Why the two adjacent campuses did not merge defies rationality, but can be explained by politics and the long-standing opposition between Universités and Grandes Écoles in the French academic landscape.) I thus think the Nature editorial about the dangers to “the well-being of those on the academic front line” brought by the quest for high rankings is missing the point. By a fair margin. Indeed, it mixes the financial and institutional efforts made by [former president] Nicolas Sarkozy in creating a single campus with the funding of this mostly pre-existing campus [and in dire need of renovations, as exemplified by the new math department]. And it seems to see the more competitive grant system in France as connected with this creation, when the [somewhat controversial] Agence Nationale de la Recherche in charge of the publicly funded grants has been around since 2005. And I find that the unceasingly growing mille-feuille of aggregates, conglomerates, unions, initiatives, etc., happening in the French academic landscape [like Paris Dauphine joining PSL a few years ago, whose status was confirmed today] is both blurring the picture and reducing the efficiency of the maneuvers by multiplying the administrative structures without creating a sense of belonging to a common institution. Plus ça change…

democracy suffers when government statistics fail [review of a book review]

Posted in Books, Statistics, Travel on October 13, 2020 by xi'an

This week, rather extraordinarily!, the Nature book review was about official statistics, with a review of Julia Lane’s Democratizing our Data. (The democratizing in the title is painful to watch, though!) The reviewer is Beth Simone Noveck, who was deputy chief technology officer under Barack Obama and is a major researcher in digital democracy, no less! (By comparison, Trump’s deputy chief technology officer had a B.A. in politics and no other qualification for the job, but nonetheless got promoted to chief…)

“Lane asserts that the United States is failing to adequately track its population, economy and society. Agencies are stagnating. The census dramatically undercounts people from minority racial groups. There is no complete national list of households. The data are made available two years after the count, making them out of date as the basis for effective policy making.” B.S. Noveck

The debate raised by the book on the ability of official statistics to keep track of people in a timely manner is most interesting. And not limited to the USA, even though it seems to fit in a Hell of its own:

“In the United States, there is no single national statistical agency. The process of gathering and publishing public data is fragmented across multiple departments and agencies, making it difficult to introduce new ideas across the whole enterprise. Each agency is funded by, and accountable to, a different congressional committee. Congress once sued the commerce department for attempting to introduce modern techniques of statistical sampling to shore up a flawed census process that involves counting every person by hand.” B.S. Noveck

This remark brings back to (my) mind the titanic debates of the 1990s when Republicans attacked sampling techniques and statisticians like Steve Fienberg rose to their defence. (Although others like David Freedman opposed the move, paradoxically mistrusting statistics!) The French official statistics institute, INSEE, has been running sampled census(es) for decades now, without the national representation going up in arms. I am certainly being partial, having been associated with INSEE, its statistics school ENSAE and its research branch CREST since 1982, but it seems to me that the hiring of highly skilled and thoroughly trained civil servants by this institute helps in making the statistics it produces more trustworthy and efficient, including when measuring the impact of public policies. (Even though accusations of delay and bias show up regularly.) And in making the institute more prone to adopt new methods, thanks to the rotation of its agents. (B.S. Noveck notices and deplores the absence of reference to foreign agencies in the book.)

“By contrast, the best private-sector companies produce data that are in real time, comprehensive, relevant, accessible and meaningful.” B.S. Noveck

However, the notion in the review (and the book?) that private companies are necessarily doing better is harder to buy, even if it makes for an easy jab at a public institution. Indeed, official statistics institutes are the only ones to have access to data covering the entire population, either directly or through other public institutes, like the IRS or social security claims. And trusting the few companies with a similar reach is beyond naïve (even though a company like Amazon has an almost instantaneous and highly local sensor of economic and social conditions!). And at odds with the call for democratizing, as shown by the impact of some of these companies on the US elections.

homeless hosted in my former office

Posted in pictures, Travel on September 24, 2020 by xi'an