Archive for CREST

end of a long era [1982-2017]

Posted in Books, pictures, Running, University life with tags , , , , , , , , , , , on May 23, 2017 by xi'an

This afternoon I went to CREST to empty my office there from books and a few papers (like the original manuscript version of Monte Carlo Statistical Methods). This is because the research centre, along with the ENSAE graduate school (my Alma mater), is moving to a new building on the Saclay plateau, next to École Polytechnique. As part of this ambitious migration of engineering schools from downtown Paris to a brand new campus there. Without getting sentimental about this move, it means leaving the INSEE building in Malakoff, on the outskirts of downtown Paris, which has been an enjoyable part of my student and then academic life from 1982 till now. And also leaving the INSEE Paris Club runners! (I am quite uncertain about being as active at the new location, if only because going there by bike is a bit more of a challenge. To be addressed anyway!) And I left behind my accumulation of conference badges (although I should try to recycle them for the incoming BNP 11 in Paris!).

Alan Gelfand in Paris

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , on May 11, 2017 by xi'an

Alan Gelfand (Duke University) will be in Paris on the week of May 15 and give several seminars, including one at AgroParisTech on May 16:

Modèles hiérarchiques

and on at CREST (BiPS)  on May 18, 2pm:

Scalable Gaussian processes for analyzing space and space-time datasets

truncated normal algorithms

Posted in Books, pictures, R, Statistics, University life with tags , , , , on January 4, 2017 by xi'an

Nicolas Chopin (CREST) just posted an entry on Statisfaction about the comparison of truncated Normal algorithms run by Alan Rogers, from the University of Utah. Nicolas wrote a paper in Statistics and Computing about a simulation method, which proposes a Ziggurat type of algorithm for this purpose, and which I do not remember reading, thanks to my diminishing memory buffer!  As shown in the picture below, when truncating to the half-line (a,∞), this method improves upon my accept-reject approach except in the far tails.

truncanormOn the top graph, made by Alan Rogers, my uniform proposal (r) seems to be doing better for a Normal truncated to (a,b) when b<0, or when a gets large and close to b. Nicolas’ ziggurat (c) works better than the Gaussian accept-reject method (c) on the positive part. (I wonder what the exponential proposal (e) stands for, in terms of scale parameter.)

zig, zag, and subsampling

Posted in Books, Statistics, University life with tags , , , , , , , , , on December 29, 2016 by xi'an

ENSAE, Nov. 17, 2010Today, I alas missed a seminar at BiPS on the Zig-Zag (sub-)sampler of Joris Bierkens, Paul Fearnhead and Gareth Roberts, presented here in Paris by James Ridgway. Fortunately for me, I had some discussions with Murray Pollock in Warwick and then again with Changye Wu in Dauphine that shed some light on this complex but highly innovative approach to simulating in Big Data settings thanks to a correct subsampling mechanism.

The zig-zag process runs a continuous process made of segments that turn from one diagonal to the next at random times driven by a generator connected with the components of the gradient of the target log-density. Plus a symmetric term. Provided those random times can be generated, this process is truly available and associated with the right target distribution. When the components of the parameter are independent (an unlikely setting), those random times can be associated with an inhomogeneous Poisson process. In the general case, one needs to bound the gradients by more manageable functions that create a Poisson process that can later be thinned. Next, one needs to simulate the process for the upper bound, a task that seems hard to achieve apart from linear and piecewise constant upper bounds. The process has a bit of a slice sampling taste, except that it cannot be used as a slice sampler but requires continuous time integration, given that the length of each segment matters. (Or maybe random time subsampling?)

A highly innovative part of the paper concentrates on Big Data likelihoods and on the possibility to subsample properly and exactly the original dataset. The authors propose Zig-Zag with subsampling by turning the gradients into random parts of the gradients. While remaining unbiased. There may be a cost associated with this gain of one to n, namely that the upper bounds may turn larger as they handle all elements in the likelihood at once, hence become (even) less efficient. (I am more uncertain about the case of the control variates, as it relies on a Lipschitz assumption.) While I still miss an easy way to implement the approach in a specific model, I remain hopeful for this new approach to make a major dent in the current methodologies!

control functionals for Monte Carlo integration

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on June 28, 2016 by xi'an

img_2451A paper on control variates by Chris Oates, Mark Girolami (Warwick) and Nicolas Chopin (CREST) appeared in a recent issue of Series B. I had read and discussed the paper with them previously and the following is a set of comments I wrote at some stage, to be taken with enough gains of salt since Chris, Mark and Nicolas answered them either orally or in the paper. Note also that I already discussed an earlier version, with comments that are not necessarily coherent with the following ones! [Thanks to the busy softshop this week, I resorted to publish some older drafts, so mileage can vary in the coming days.]

First, it took me quite a while to get over the paper, mostly because I have never worked with reproducible kernel Hilbert spaces (RKHS) before. I looked at some proofs in the appendix and at the whole paper but could not spot anything amiss. It is obviously a major step to uncover a manageable method with a rate that is lower than √n. When I set my PhD student Anne Philippe on the approach via Riemann sums, we were quickly hindered by the dimension issue and could not find a way out. In the first versions of the nested sampling approach, John Skilling had also thought he could get higher convergence rates before realising the Monte Carlo error had not disappeared and hence was keeping the rate at the same √n speed.

The core proof in the paper leading to the 7/12 convergence rate relies on a mathematical result of Sun and Wu (2009) that a certain rate of regularisation of the function of interest leads to an average variance of order 1/6. I have no reason to mistrust the result (and anyway did not check the original paper), but I am still puzzled by the fact that it almost immediately leads to the control variate estimator having a smaller order variance (or at least variability). On average or in probability. (I am also uncertain on the possibility to interpret the boxplot figures as establishing super-√n speed.)

Another thing I cannot truly grasp is how the control functional estimator of (7) can be both a mere linear recombination of individual unbiased estimators of the target expectation and an improvement in the variance rate. I acknowledge that the coefficients of the matrices are functions of the sample simulated from the target density but still…

Another source of inner puzzlement is the choice of the kernel in the paper, which seems too simple to be able to cover all problems despite being used in every illustration there. I see the kernel as centred at zero, which means a central location must be know, decreasing to zero away from this centre, so possibly missing aspects of the integrand that are too far away, and isotonic in the reference norm, which also seems to preclude some settings where the integrand is not that compatible with the geometry.

I am equally nonplussed by the existence of a deterministic bound on the error, although it is not completely deterministic, depending on the values of the reproducible kernel at the points of the sample. Does it imply anything restrictive on the function to be integrated?

A side remark about the use of intractable in the paper is that, given the development of a whole new branch of computational statistics handling likelihoods that cannot be computed at all, intractable should possibly be reserved for such higher complexity models.

Rémi Bardenet’s seminar

Posted in Kids, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , on April 7, 2016 by xi'an

Grand Palais from Esplanade des Invalides, Paris, Dec. 07, 2012Next week, Rémi Bardenet is giving a seminar in Paris, Thursday April 14, 2pm, in ENSAE [room 15] on MCMC methods for tall data. Unfortunately, I will miss this opportunity to discuss with Rémi as I will be heading to La Sapienza, Roma, for Clara Grazian‘s PhD defence the next day.  And on Monday afternoon, April 11, Nicolas Chopin will give a talk on quasi-Monte Carlo for sequential problems at Institut Henri Poincaré.

position opening at ENSAE ParisTech

Posted in Kids, Statistics, Travel, University life with tags , , , , , , , on March 28, 2016 by xi'an

ensaeprofParis and la Seine, from Pont du Garigliano, Oct. 20, 2011There is an opening for an associate or full professor position in Statistics and Machine Learning at ENSAE, Paris (soon to move to the Paris-Saclay campus, next to École Polytechnique). The details are provided here. The deadline is April 18, 2016, for a hiring in September or October 2016.