## Archive for ENSAE

## end of a long era [1982-2017]

Posted in Books, pictures, Running, University life with tags École Polytechnique, badge, boxes, CREST, ENSAE, INSEE, Insee Paris Club, Malakoff, moving, office, Paris, Paris-Saclay campus on May 23, 2017 by xi'an

**T**his afternoon I went to CREST to empty my office there of books and a few papers (like the original manuscript version of Monte Carlo Statistical Methods). This is because the research centre, along with the ENSAE graduate school (my Alma mater), is moving to a new building on the Saclay plateau, next to École Polytechnique, as part of the ambitious migration of engineering schools from downtown Paris to a brand new campus there. Without getting sentimental about this move, it means leaving the INSEE building in Malakoff, on the outskirts of downtown Paris, which has been an enjoyable part of my student and then academic life from 1982 till now. And also leaving the INSEE Paris Club runners! (I am quite uncertain about being as active at the new location, if only because going there by bike is a bit more of a challenge. To be addressed anyway!) And I left behind my accumulation of conference badges (although I should try to recycle them for the upcoming BNP 11 in Paris!).

## Alan Gelfand in Paris

Posted in pictures, Statistics, Travel, University life with tags Alan Gelfand, BiPS, CREST, Duke University, ENSAE, Gaussian processes, Paris, seminar on May 11, 2017 by xi'an

Alan Gelfand (Duke University) will be in Paris in the week of May 15 and give several seminars, including one at AgroParisTech on May 16:

and one at CREST (BiPS) on May 18, 2pm:

Scalable Gaussian processes for analyzing space and space-time datasets

## the French MIT? not so fast…

Posted in Kids, pictures, University life with tags École Polytechnique, ENSAE, French politics, French universities, Orsay, Paris-Saclay campus, PSL, Université Paris Dauphine on February 20, 2017 by xi'an

**A** news report last weekend on the Nature webpage about the new science super-campus south of Paris connected with my impressions of the whole endeavour: the annual report from the Court of Auditors estimated that the 5 billion euros invested in this construct were not exactly a clever use of public [French taxpayer] money! This notion of bringing a large number of [State] engineering and scientific schools from downtown Paris to the plateau of Saclay, about 25km south-west of Paris, around École Polytechnique, had some appeal, since these were and are prestigious institutions, most with highly selective entry exams, and with similar training programs, now that they have almost completely lost the specialisation that justified their parallel existences! And since a genuine university, Paris 11 Orsay, stood nearby at the bottom of the plateau. Plus, a host of startups and research branches of companies. Hence the concept of a French MIT.

However, as is so often the case in Jacobin France, the move has been decided and supported by the State “top-down” rather than by the original institutions themselves. Including a big push by Nicolas Sarkozy in 2010. While the campus can be reached by public transportation like RER, living and working on the campus is obviously less appealing to both students and staff than in a listed building in the centre of Paris. Especially when lodging and living infrastructures are yet to be completed. But the main issue is that the fragmentation of those schools, labs and institutes, in terms of leadership, recruiting, and research, has not been solved by the move, each entity remaining strongly attached to its identity, degree, networks, &tc., and definitely unwilling to merge into a super-university with a more efficient organisation of teaching and research. Which means the overall structure as such is close to invisible at the international level. This is the point raised by the State auditors. And perceived by the State which threatens to cut funding at this late stage!

This is not the only example within French higher education institutions since most have been forced to merge into incomprehensible super-units under the same financial threat. Like Paris-Dauphine being now part of the PSL (*Paris Sciences et Lettres*) heterogeneous conglomerate. (I suspect one of the primary reasons for this push by central authorities was to create larger entities towards moving up in the international university rankings, which is absurd for many reasons, from the limited worth of such rankings, to the lag between the creation of a new entity and the appearance on an international university ranking, to the difficulty in ranking researchers from such institutions: in Paris-Dauphine, the address to put on papers is more than a line long, with half a dozen acronyms!)

## zig, zag, and subsampling

Posted in Books, Statistics, University life with tags BiPS, CREST, ENSAE, Malakoff, MCMC, Monte Carlo Statistical Methods, Paris, Université Paris Dauphine, University of Warwick, Zig-Zag on December 29, 2016 by xi'an

**T**oday, I alas missed a seminar at BiPS on the Zig-Zag (sub-)sampler of Joris Bierkens, Paul Fearnhead and Gareth Roberts, presented here in Paris by James Ridgway. Fortunately for me, I had some discussions with Murray Pollock in Warwick and then again with Changye Wu in Dauphine that shed some light on this complex but highly innovative approach to simulating in Big Data settings thanks to a correct subsampling mechanism.

The zig-zag process runs a continuous process made of segments that turn from one diagonal to the next at random times driven by a generator connected with the components of the gradient of the target log-density. Plus a symmetric term. Provided those random times can be generated, this process is truly available and associated with the right target distribution. When the components of the parameter are independent (an unlikely setting), those random times can be associated with an inhomogeneous Poisson process. In the general case, one needs to bound the gradients by more manageable functions that create a Poisson process that can later be thinned. Next, one needs to simulate the process for the upper bound, a task that seems hard to achieve apart from linear and piecewise constant upper bounds. The process has a bit of a slice sampling taste, except that it cannot be used as a slice sampler but requires continuous time integration, given that the length of each segment matters. (Or maybe random time subsampling?)
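To make the mechanics concrete, here is a minimal sketch of a one-dimensional Zig-Zag process, assuming a standard Normal target so that the switching times can be simulated exactly, with no bound and no thinning. The function name `zigzag_std_normal`, the choice of target, and setting the extra symmetric switching rate to zero are all illustrative assumptions of mine rather than anything taken from the paper. Since the length of each segment matters, the averages are obtained by integrating along the whole trajectory and not by collecting the turning points.

```python
import math
import random


def zigzag_std_normal(total_time, x0=0.0, theta0=1.0, seed=0):
    """Toy 1D Zig-Zag process targeting N(0, 1), i.e. U(x) = x**2 / 2.

    Returns the time averages of x and x**2 along the trajectory.
    """
    rng = random.Random(seed)
    x, theta = x0, theta0
    remaining = total_time
    int_x = 0.0   # integral of x(s) ds along the path
    int_x2 = 0.0  # integral of x(s)**2 ds along the path
    while remaining > 0.0:
        # Along the segment x + theta * s the switching rate is
        # max(0, theta * U'(x + theta * s)) = max(0, a + s), with a = theta * x.
        a = theta * x
        e = rng.expovariate(1.0)
        # Closed-form solution of integral_0^tau max(0, a + s) ds = e.
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        tau = min(tau, remaining)  # truncate the final segment
        x_new = x + theta * tau
        # Exact integrals of x and x**2 over the linear segment.
        int_x += 0.5 * (x + x_new) * tau
        int_x2 += (x_new ** 3 - x ** 3) / (3.0 * theta)
        x = x_new
        remaining -= tau
        theta = -theta  # turn to the other direction at the event time
    return int_x / total_time, int_x2 / total_time


if __name__ == "__main__":
    mean, second_moment = zigzag_std_normal(20_000.0, seed=1)
    print(mean, second_moment)
```

Over a long enough run, the two returned averages should sit close to 0 and 1, the first two moments of the target. For a target where the integrated rate cannot be inverted in closed form, the exact draw of `tau` would have to be replaced by thinning from an upper bound, as described above.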

A highly innovative part of the paper concentrates on Big Data likelihoods and on the possibility to subsample properly and exactly the original dataset. The authors propose Zig-Zag with subsampling by turning the gradients into random parts of the gradients. While remaining unbiased. There may be a cost associated with this gain of one to *n*, namely that the upper bounds may turn larger as they handle all elements in the likelihood at once, hence become (even) less efficient. (I am more uncertain about the case of the control variates, as it relies on a Lipschitz assumption.) While I still miss an easy way to implement the approach in a specific model, I remain hopeful for this new approach to make a major dent in the current methodologies!
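As a way of convincing myself, here is a hedged sketch of the subsampling idea on a toy model of my own choosing, not one from the paper: observations from a Normal distribution with unknown mean and unit variance, under a flat prior, so that the posterior is Normal with mean the sample average and variance 1/n. Each proposed switch looks at a single randomly chosen observation, and the bound is built from the extreme observations so that it dominates every individual term. The name `zigzag_subsampling_normal_mean` and this particular bound are assumptions for the sake of illustration, and the control variate version is not covered.

```python
import math
import random


def zigzag_subsampling_normal_mean(y, total_time, seed=0):
    """Toy Zig-Zag with subsampling for the mean of N(mu, 1) data, flat prior.

    Returns the time-averaged mean and variance of mu along the trajectory.
    """
    rng = random.Random(seed)
    n = len(y)
    y_min, y_max = min(y), max(y)
    mu, theta = y[0], 1.0
    remaining = total_time
    int_mu = 0.0
    int_mu2 = 0.0
    while remaining > 0.0:
        # Single-observation rate: n * max(0, theta * (mu - y_j) + s).
        # It is dominated for every j by n * max(0, a + s), where
        # a = max_j theta * (mu - y_j) only involves an extreme observation.
        a = mu - y_min if theta > 0 else y_max - mu
        e = rng.expovariate(1.0)
        # First event of the bounding Poisson process, in closed form.
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e / n)
        tau = min(tau, remaining)
        mu_new = mu + theta * tau
        int_mu += 0.5 * (mu + mu_new) * tau
        int_mu2 += (mu_new ** 3 - mu ** 3) / (3.0 * theta)
        mu = mu_new
        remaining -= tau
        if remaining <= 0.0:
            break
        # Thinning: accept the switch with probability rate_j / bound,
        # using one observation drawn uniformly (the factor n cancels).
        bound = a + tau
        j = rng.randrange(n)
        rate_j = max(0.0, theta * (mu - y[j]))
        if bound > 0.0 and rng.random() * bound < rate_j:
            theta = -theta
    mean = int_mu / total_time
    return mean, int_mu2 / total_time - mean ** 2


if __name__ == "__main__":
    data_rng = random.Random(42)
    y = [data_rng.gauss(2.0, 1.0) for _ in range(100)]
    print(zigzag_subsampling_normal_mean(y, 1000.0, seed=1))
    print(sum(y) / len(y), 1.0 / len(y))
```

The sketch also shows the cost mentioned above: the bound is driven by the most extreme observation, so most proposed switching times get rejected, while each of them only costs one data point.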

## variance of an exponential order statistics

Posted in Books, Kids, pictures, R, Statistics, University life with tags climate simulation, ecdf, empirical cdf, ENSAE, George Casella, Luc Devroye, Malakoff, Master program, Monte Carlo Statistical Methods, order statistics, spacings on November 10, 2016 by xi'an

**T**his afternoon, one of my Monte Carlo students at ENSAE came to me with an exercise from Monte Carlo Statistical Methods that I did not remember having written. And I thus “charged” George Casella with authorship for that exercise!

Exercise 3.3 starts with the usual question (a) about the (Binomial) precision of a tail probability estimator, which is easy to answer by iterating simulation batches. Expressed via the empirical cdf, it is concerned with the *vertical* variability of this empirical cdf. The second part (b) is more unusual in that the first part is again an evaluation of a tail probability, but then it switches to finding the .995 quantile by simulation and producing a precise enough [to three digits] estimate. Which amounts to assessing the *horizontal* variability of this empirical cdf.
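For question (a), iterating simulation batches can be sketched as follows, assuming a unit-rate Exponential sample and a Normal approximation to the Binomial error. The function name `tail_probability`, the threshold, the batch size and the tolerance are placeholders of mine and not the values of the exercise.

```python
import math
import random


def tail_probability(threshold, tol, batch=10_000, seed=0, max_batches=1_000):
    """Estimate P(X > threshold) for X ~ Exp(1), adding batches until the
    approximate 95% half-width of the Binomial estimate falls below tol."""
    rng = random.Random(seed)
    hits = 0
    n = 0
    p_hat = 0.0
    half_width = float("inf")
    for _ in range(max_batches):
        hits += sum(rng.expovariate(1.0) > threshold for _ in range(batch))
        n += batch
        p_hat = hits / n
        # Binomial standard error of the empirical tail frequency.
        half_width = 1.96 * math.sqrt(p_hat * (1.0 - p_hat) / n)
        if hits > 0 and half_width < tol:
            break
    return p_hat, half_width, n


if __name__ == "__main__":
    print(tail_probability(3.0, 1e-3, seed=1))
```

The returned half-width is exactly the vertical variability of the empirical cdf at the threshold, which is why this part of the exercise is the easy one.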

As we discussed this question, my first suggestion was to aim at a value of N, number of Monte Carlo simulations, such that the .995 x N-th spacing had a length of less than one thousandth of the .995 x N-th order statistic. In the case of the Exponential distribution suggested in the exercise, generating order statistics is straightforward, since, as suggested by Devroye, see Section V.3.3, the i-th spacing is an Exponential variate with rate (N-i+1). This is so fast that Devroye suggests simulating Uniform order statistics by inverting Exponential order statistics (p.220)!
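In code, the spacing representation means that the k-th order statistic can be produced without sorting anything, by cumulating k independent Exponential spacings with rates N, N-1, ..., N-k+1. Here is a small sketch under the unit-rate assumption, where the function name `exponential_order_statistic` is mine:

```python
import random


def exponential_order_statistic(n, k, rng):
    """Draw the k-th smallest of n unit-rate Exponential variates by summing
    its spacings; returns the order statistic and its last spacing."""
    total = 0.0
    spacing = 0.0
    for i in range(1, k + 1):
        # The i-th spacing is Exponential with rate n - i + 1.
        spacing = rng.expovariate(n - i + 1)
        total += spacing
    return total, spacing


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    k = int(0.995 * n)
    x_k, last_spacing = exponential_order_statistic(n, k, rng)
    # Ratio used in the first suggestion above (spacing versus order statistic).
    print(x_k, last_spacing / x_k)
```

Note that `random.expovariate` takes the rate, not the mean, as its argument, which matches the parameterisation used in the text.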

However, while still discussing the problem with my student, I came to a better expression of the question, which was to figure out the variance of the .995 x N-th order statistic in the Exponential case. Working with the density of this order statistic however led nowhere useful. A bit later, after Google-ing the problem, I came upon this Stack Exchange solution that made use of the spacing result mentioned above, namely that the expectation and variance of the k-th order statistic are

$$\mathbb{E}[X_{(k)}]=\sum_{i=1}^{k}\frac{1}{N-i+1},\qquad \operatorname{var}(X_{(k)})=\sum_{i=1}^{k}\frac{1}{(N-i+1)^2},$$

which leads to the proper condition on N when imposing the variability constraint.
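Since the spacings are independent Exponential variates with rates N-i+1, the mean and variance of the k-th order statistic are the sums of the inverse rates and of the inverse squared rates, respectively. A rough numerical version of the condition on N can then be sketched as below, assuming a unit rate and taking the standard deviation of the order statistic as the measure of variability. The function names, the doubling search and the tolerance in the usage lines are illustrative choices of mine, not the exercise's.

```python
import math


def exp_order_stat_moments(n, k):
    """Mean and variance of the k-th order statistic of n unit-rate Exponentials."""
    mean = sum(1.0 / (n - i + 1) for i in range(1, k + 1))
    var = sum(1.0 / (n - i + 1) ** 2 for i in range(1, k + 1))
    return mean, var


def first_n_by_doubling(p, tol, n_start=1024):
    """Double n until the standard deviation of the ceil(p * n)-th
    order statistic drops below tol."""
    n = n_start
    while True:
        k = math.ceil(p * n)
        _, var = exp_order_stat_moments(n, k)
        if math.sqrt(var) < tol:
            return n
        n *= 2


if __name__ == "__main__":
    print(exp_order_stat_moments(1000, 995))
    print(first_n_by_doubling(0.995, 0.05))
```

The variance of the .995 x N-th order statistic decreases roughly like 199/N, so a three-digit tolerance calls for a very large N and the plain sums above become slow in pure Python, hence the loose tolerance in the usage lines.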

## Rémi Bardenet’s seminar

Posted in Kids, pictures, Statistics, Travel, University life with tags ABC in Roma, big data, BiPS, CREST, defense, ENSAE, Institut Henri Poincaré, MCMC algorithms, Monte Carlo Statistical Methods, Nicolas Chopin, PhD thesis, quasi-Monte Carlo methods, seminar, tall data on April 7, 2016 by xi'an

**N**ext week, Rémi Bardenet is giving a seminar in Paris, Thursday April 14, 2pm, at ENSAE [room 15] on MCMC methods for tall data. Unfortunately, I will miss this opportunity to discuss with Rémi as I will be heading to La Sapienza, Roma, for Clara Grazian's PhD defence the next day. And on Monday afternoon, April 11, Nicolas Chopin will give a talk on quasi-Monte Carlo for sequential problems at Institut Henri Poincaré.

## position opening at ENSAE ParisTech

Posted in Kids, Statistics, Travel, University life with tags associate professor position, École Polytechnique, CREST, ENSAE, machine learning, Paris, Paris-Saclay campus, Statistics on March 28, 2016 by xi'an

**T**here is an opening for an associate or full professor position in Statistics and Machine Learning at ENSAE, Paris (soon to move to the Paris-Saclay campus, next to École Polytechnique). The details are provided here. The deadline is April 18, 2016, for a hiring in September or October 2016.