Archive for ENSAE

faculty positions in statistics at ENSAE, Paris

Posted in Statistics, University life on April 28, 2014 by xi'an

Here is a call from ENSAE about two positions in statistics/machine learning, starting next semester:

ENSAE ParisTech and CREST are currently inviting applications for positions at the level of associate or full professor from outstanding candidates who have demonstrated abilities in both research and teaching. We are interested in candidates with a Ph.D. in Statistics or Machine Learning (or a related field) whose research interests are in high-dimensional statistical inference, learning theory, or the statistics of networks.

The appointment could begin as soon as September 1, 2014. The position is for an initial three-year term, with a possible renewal option subject to a positive evaluation of research and teaching activities. Salary for suitably qualified applicants is competitive and commensurate with experience. The deadline for application is May 19, 2014. Full details are given here for the first position and there for the second position.

data scientist position

Posted in R, Statistics, University life on April 8, 2014 by xi'an

Our newly created Chaire “Economie et gestion des nouvelles données” at Paris-Dauphine, ENS Ulm, École Polytechnique and ENSAE is recruiting a data scientist, starting as early as May 1, with the call remaining open until the position is filled. The location is one of the above labs in Paris, the duration is at least one year, the salary varies with the applicant’s profile, and the contacts are Stephane Gaiffas (stephane.gaiffas AT cmap DOT polytechnique.fr), Robin Ryder (ryder AT ceremade DOT dauphine.fr), and Gabriel Peyré (peyre AT ceremade DOT dauphine.fr). Here are more details:

Job description

The chaire “Economie et gestion des nouvelles données” is recruiting a talented young engineer specialized in large-scale computing and data processing. The targeted applications include machine learning, imaging sciences and finance. This is a unique opportunity to join a newly created research group between the best Parisian labs in applied mathematics and computer science (Paris-Dauphine, ENS Ulm, École Polytechnique and ENSAE) working hand in hand with major industrial companies (Havas, BNP Paribas, Warner Bros.). The proposed position consists of helping the researchers of the group develop and implement large-scale data-processing methods, and of applying these methods to real-life problems in collaboration with the industrial partners.

A non-exhaustive list of methods currently investigated by researchers of the group, which will play a key role in the computational framework developed by the recruited engineer, includes:
● Large-scale non-smooth optimization methods (proximal schemes, interior points, optimization on manifolds).
● Machine learning problems (kernel methods, Lasso, collaborative filtering, deep learning, learning on graphs, learning for time-dependent systems), with a particular focus on large-scale problems and stochastic methods.
● Imaging problems (compressed sensing, super-resolution).
● Approximate Bayesian computation (ABC) methods.
● Particle and sequential Monte Carlo methods.
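As an illustrative aside, the ABC methods mentioned in the list above fit in a few lines of R; the Gaussian toy model, flat prior range, and tolerance in this sketch are my own choices for the sake of the example, not part of the call:

```r
# ABC rejection sketch for the mean of a Gaussian sample
# (toy model, prior range and tolerance are illustrative choices)
set.seed(1)
yobs  = rnorm(50, mean = 2)     # "observed" data, true mean 2
sobs  = mean(yobs)              # summary statistic
N     = 10^5
theta = runif(N, -5, 5)         # draws from a flat prior
ssim  = rnorm(N, mean = theta, sd = 1/sqrt(50))  # simulated summaries
post  = theta[abs(ssim - sobs) < 0.05]  # accepted draws
mean(post)  # approximates the posterior mean, close to sobs
```

Since the sample mean is sufficient here, simulating the summary directly (rather than a full dataset) keeps the sketch short without changing the target.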

Candidate profile

The candidate should have a very good background in computer science, with experience of various programming environments (e.g. Matlab, Python, C++) and knowledge of high-performance computing methods (e.g. GPU, parallelization, cloud computing). He/she should adhere to the open-source philosophy and ideally be able to interact with the relevant communities (e.g. the scikit-learn initiative). A typical curriculum includes engineering school or Master’s studies in computer science, applied maths or physics, and possibly a PhD (not required).

Working environment

The recruited engineer will work within one of the labs of the chaire. He/she will benefit from a very stimulating working environment and all required computing resources, will work in close interaction with the four research labs of the chaire, and will also have regular meetings with the industrial partners. More information about the chaire can be found online at http://www.di.ens.fr/~aspremon/chaire/

fine-sliced Poisson [a.k.a. sashimi]

Posted in Books, Kids, pictures, R, Running, Statistics, University life on March 20, 2014 by xi'an

As my student Kévin Guimard had not mailed me his slice sampler for the Poisson distribution, I could not tell why the code was not working! My earlier post prompted him to do so, and a somewhat optimised version is given below:

nsim = 10^4
lambda = 6

max.factorial = function(x,u){
  # upper end of the slice, derived from the factorial part of the density
  k = x
  parf = 1
  while (parf*u < 1){
    k = k + 1
    parf = parf*k
  }
  k = k - (parf*u > 1)
  return(k)
}

x = rep(floor(lambda), nsim)
for (t in 2:nsim){
  v1 = ceiling((log(runif(1))/log(lambda)) + x[t-1])
  ranj = max(0, v1):max.factorial(x[t-1], runif(1))
  x[t] = sample(ranj, size = 1)
}
barplot(as.vector(rbind(
  table(x)/length(x),
  dpois(min(x):max(x), lambda))),
  col = c("sienna","gold"))

As you can easily check by running the code, it does not work. My student actually came top of my MCMC class, and he spent quite a while pondering why the code was failing. I pondered as well for part of a morning in Warwick, removing potential causes of exponential or factorial overflows (hence the shape of the code), but without eliciting the issue… (This now sounds like lethal fugu sashimi!) Before reading any further, can you spot the problem?!

The corrected R code is as follows:

x = rep(lambda, nsim)
for (t in 2:nsim){
  v1 = ceiling((log(runif(1))/log(lambda)) + x[t-1])
  ranj = max(0, v1):max.factorial(x[t-1], runif(1))
  if (length(ranj) > 1){
    x[t] = sample(ranj, size = 1)
  } else {
    # a single-point slice is kept as is, bypassing sample()
    x[t] = ranj
  }
}

The culprit is thus the R function sample which simply does not understand Dirac masses and the basics of probability! When running

> sample(150:150,1)
[1] 23

you can clearly see where the problem stands: given a single positive number n as first argument, sample interprets it as the set 1:n rather than as a point mass at n. This is a well-documented quirk of sample that has already caused me woes… Another interesting feature of this slice sampler is that it is awfully slow at exploring the tails, and at converging back to the centre from the tails. This is not very pronounced in the above graph, with a mean of 6; moving to a mean of 50 makes it much more apparent:
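One standard workaround, adapted from the examples section of R’s own help page for sample, is to index by sample.int so that a length-one vector is left untouched:

```r
# safe variant of sample(): never expands a single value n into 1:n
resample = function(x, ...) x[sample.int(length(x), ...)]

resample(150:150, 1)  # always returns 150
resample(3:5, 1)      # a uniform draw from {3, 4, 5}
```

Replacing the call to sample by resample in the original loop would have avoided the if/else patch in the corrected code.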

This is due to the poor mixing of the chain, as shown by the raw sequence below, which strives to achieve a single cycle out of 10⁵ iterations! In any case, thanks to Kévin for an interesting morning!
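The slow mixing can also be quantified through the lag-one autocorrelation of the chain. The sketch below repackages the corrected sampler into a self-contained function (the name pois.slice and the 10⁴ run length are my additions, not part of the original code):

```r
# self-contained version of the corrected sampler, run on a Poisson(50) target
pois.slice = function(nsim, lambda){
  max.factorial = function(x, u){
    # upper end of the slice, from the factorial part of the density
    k = x; parf = 1
    while (parf*u < 1){ k = k + 1; parf = parf*k }
    k - (parf*u > 1)
  }
  x = rep(floor(lambda), nsim)
  for (t in 2:nsim){
    v1 = ceiling(log(runif(1))/log(lambda) + x[t-1])
    ranj = max(0, v1):max.factorial(x[t-1], runif(1))
    x[t] = if (length(ranj) > 1) sample(ranj, size = 1) else ranj
  }
  x
}

set.seed(1)
x = pois.slice(10^4, 50)
acf(x, plot = FALSE)$acf[2]  # lag-one autocorrelation, very close to 1
```

For lambda = 50 the slice at each step typically reduces to the current point or its immediate neighbours, so the chain behaves like a slow random walk across a distribution whose standard deviation is about 7.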

[raw sequence of the chain]

sliced Poisson

Posted in Books, Kids, pictures, R, Running, Statistics, University life on March 18, 2014 by xi'an

One of my students complained that his slice sampler for the Poisson distribution was not working when following the instructions in Monte Carlo Statistical Methods (Exercise 8.5). This puzzled me during my early-morning run and I checked on my way back, even before attacking the fresh baguette I had brought from the bakery… The following R code is the check, and it does work, as the comparison above shows…

slice=function(el,u){
   #generate a uniform draw over the finite integer slice
   mode=floor(el)
   sli=mode
   x=mode+1
   while (dpois(x,el)>u){
       sli=c(sli,x);x=x+1}
   x=mode-1
   while (dpois(x,el)>u){
       sli=c(sli,x);x=x-1}
   #index via sample.int so a single-point slice is returned as is
   return(sli[sample.int(length(sli),1)])}

#example
T=10^4
lambda=2.414

x=rep(floor(lambda),T)
for (t in 2:T)
   x[t]=slice(lambda,runif(1)*dpois(x[t-1],lambda))

barplot(as.vector(rbind(
   table(x)/length(x),dpois(0:max(x),
   lambda))),col=c("sienna","gold"))
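Beyond the visual barplot check, the fit can be assessed numerically, for instance through the total-variation distance between the chain’s empirical distribution and the Poisson mass function. The self-contained sketch below reruns the sampler with a safe single-point draw; the seed and the “well below 0.05” bound are only indicative choices of mine:

```r
# slice sampler for a Poisson(lambda) target, with a safe single-point draw,
# followed by a total-variation comparison with the true mass function
slice = function(el, u){
  mode = floor(el)
  sli = mode
  x = mode + 1
  while (dpois(x, el) > u){ sli = c(sli, x); x = x + 1 }
  x = mode - 1
  while (dpois(x, el) > u){ sli = c(sli, x); x = x - 1 }
  sli[sample.int(length(sli), 1)]  # safe even when the slice is a single point
}

set.seed(42)
T = 10^4; lambda = 2.414
x = rep(floor(lambda), T)
for (t in 2:T) x[t] = slice(lambda, runif(1)*dpois(x[t-1], lambda))

emp = as.vector(table(factor(x, levels = 0:max(x))))/T
tv  = 0.5*sum(abs(emp - dpois(0:max(x), lambda)))
tv  # total-variation distance, typically well below 0.05
```

Using factor(x, levels = 0:max(x)) forces table to report zero counts for unvisited values, so the empirical and theoretical vectors line up term by term.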

Gelman’s course in Paris, next term!

Posted in Books, Kids, Statistics, University life on August 2, 2013 by xi'an

Andrew Gelman will be visiting Paris-Dauphine and CREST next academic year, with support from those institutions as well as CNRS and Ville de Paris. Which is why he is learning how to pronounce Le loup est revenu. (Maybe not why, as this is not the most useful sentence in downtown Paris…) Very exciting news for all of us local Bayesians (or bayésiens). In addition, Andrew will teach from the latest edition of his book Bayesian Data Analysis, co-authored by John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin. He will actually start teaching in mid-October, which means the book will not be out yet: so the students at Paris-Dauphine and ENSAE will get a true avant-première of Bayesian Data Analysis. Of course, this item of information will be sadistically tantalising to ‘Og’s readers who cannot spend the semester in Paris. For those who can, I presume there is a way to register for the course as an auditeur libre at either Paris-Dauphine or ENSAE.

Note that the cover links with an earlier post of Aki on Andrew’s blog about the holiday effect. (Also mentioned earlier on the ‘Og…)

top model choice week (#2)

Posted in Statistics, University life on June 18, 2013 by xi'an

[La Défense and Maisons-Laffitte from my office, Université Paris-Dauphine, Nov. 05, 2011]

Following the talks today in Dauphine by Ed George (Wharton) and Feng Liang (University of Illinois at Urbana-Champaign), Natalia Bochkina (University of Edinburgh) will give a talk on Thursday, June 20, at 2pm in Room 18 at ENSAE (Malakoff) [not Dauphine!]. Here is her abstract:

2 pm: Simultaneous local and global adaptivity of Bayesian wavelet estimators in nonparametric regression, by Natalia Bochkina

We consider wavelet estimators in the context of nonparametric regression, with the aim of finding estimators that simultaneously achieve the local and global adaptive minimax rate of convergence. It is known that one estimator – the James-Stein block thresholding estimator of T. Cai (2008) – achieves both optimal rates of convergence simultaneously, but only over a limited set of Besov spaces; in particular, over the sets of spatially inhomogeneous functions (with 1 ≤ p < 2) the upper bound on the global rate of this estimator is slower than the optimal minimax rate.

Another possible candidate for achieving both rates of convergence simultaneously is the empirical Bayes estimator of Johnstone and Silverman (2005), an adaptive estimator that achieves the global minimax rate over a wide range of Besov spaces and Besov balls. The maximum marginal likelihood approach is used to estimate the hyperparameters, and it can be interpreted as a Bayesian estimator with a uniform prior. We show that it also achieves the adaptive local minimax rate over all Besov spaces, and hence that it does indeed achieve both local and global rates of convergence simultaneously over Besov spaces. We also give an example of how it works in practice.

master traitement statistique de l’information (TSI)

Posted in Statistics, University life on April 11, 2013 by xi'an

Slides (in French) of a presentation of my Master TSI given at ENSAE today:
