Slides (in French) of a presentation of my Master TSI at ENSAE today:
Archive for ENSAE
Next month, Michael Jordan will give an advanced course at CREST-ENSAE, Paris, on Recent Advances at the Interface of Computation and Statistics. The course will take place on April 4, 11, and 18 at 14:00 and on April 15 at 11:00, in all cases in Room #11 at ENSAE. It is open to everyone and attendance is free. The only constraint is compulsory registration with Nadine Guedj (email: guedj[AT]ensae.fr) for security reasons. I strongly advise all graduate students who can attend to grasp this fantastic opportunity! Here is the abstract of the course:
“I will discuss several recent developments in areas where statistical science meets computational science, with particular concern for bringing statistical inference into contact with distributed computing architectures and with recursive data structures:
How does one obtain confidence intervals in massive data sets? The bootstrap principle suggests resampling data to obtain fluctuations in the values of estimators, and thereby confidence intervals, but this is infeasible computationally with massive data. Subsampling the data yields fluctuations on the wrong scale, which have to be corrected to provide calibrated statistical inferences. I present a new procedure, the “bag of little bootstraps,” which circumvents this problem, inheriting the favorable theoretical properties of the bootstrap but also having a much more favorable computational profile.
The problem of matrix completion has been the focus of much recent work, both theoretical and practical. To take advantage of distributed computing architectures in this setting, it is natural to consider divide-and-conquer algorithms for matrix completion. I show that these work well in practice, but also note that new theoretical problems arise when attempting to characterize the statistical performance of these algorithms. Here the theoretical support is provided by concentration theorems for random matrices, and I present a new approach to matrix concentration based on Stein’s method.
Bayesian nonparametrics involves replacing the “prior distributions” of classical Bayesian analysis with “prior stochastic processes.” Of particular value is the class of “combinatorial stochastic processes,” which make it possible to express uncertainty (and perform inference) over combinatorial objects that are familiar as data structures in computer science.”
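To make the first topic above more concrete, here is a minimal sketch of the bag-of-little-bootstraps idea, on toy Gaussian data of my own choosing (the function name, settings s, r, γ, and data are illustrative assumptions, not taken from the course): each small subsample of size b = n^γ is inflated back to nominal size n via multinomial weights, bootstrap fluctuations are computed at that scale, and the resulting confidence-interval endpoints are averaged across subsamples.

```python
import numpy as np

rng = np.random.default_rng(0)

def blb_confidence_interval(data, estimator, s=10, r=50, gamma=0.6, alpha=0.05):
    """Bag of little bootstraps: average percentile-CI endpoints over s subsamples."""
    n = len(data)
    b = int(n ** gamma)                     # subsample size b = n^gamma << n
    lowers, uppers = [], []
    for _ in range(s):
        sub = rng.choice(data, size=b, replace=False)
        stats = []
        for _ in range(r):
            # resample of nominal size n, expressed as multinomial weights
            # on the b subsample points (cheap: only b distinct values)
            w = rng.multinomial(n, np.full(b, 1.0 / b))
            stats.append(estimator(np.repeat(sub, w)))
        lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
        lowers.append(lo)
        uppers.append(hi)
    return np.mean(lowers), np.mean(uppers)

# toy example: CI for the mean of N(2, 1) data with n = 100,000
data = rng.normal(loc=2.0, scale=1.0, size=100_000)
lo, hi = blb_confidence_interval(data, np.mean)
```

Note that the fluctuations are on the scale of the full sample size n, which is exactly what plain subsampling fails to deliver without correction.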
References are available on Michael’s homepage.
On Monday, Ildar Ibragimov (St. Petersburg Department of the Steklov Mathematical Institute, Russia) will give a seminar at CREST on “The Darmois–Skitovich and Ghurye–Olkin theorems revisited”. This sounds more like probability than statistics, as those theorems state that, if two linear combinations of independent rv’s are independent, then those rv’s are normal. See these remarks by Prof. Abram Kagan for historical details. Nonetheless, I find it quite an event to have a local seminar given by one of the fathers of asymptotic Bayesian theory. Here is the abstract of the talk. (The talk will be at ENSAE, Salle S8, at 3pm on Monday, March 18.)
As last year, I am giving a series of lectures on simulation, jointly as a Master course at Paris-Dauphine and as a 3rd year course at ENSAE. The course borrows from both Monte Carlo Statistical Methods and Introduction to Monte Carlo Methods with R, two books written with George Casella. Here are the three series of slides I will use throughout the course this year, mostly for the benefit of the students:
(the last series is much improved when compared with an earlier version, thanks to Olivier Cappé!)
Next month, Kerrie Mengersen (QUT, Brisbane, Australia, and visiting us at CREST and Paris-Dauphine this coming May) will give a PhD course at CREST on the theme of applied Bayesian statistical modelling.
Here is her abstract:
Bayesian hierarchical models are now widely used in addressing a rich variety of real-world problems. In this course, we will examine some common models and the associated computational methods used to solve these problems, with a focus on environmental and health applications.
Two types of hierarchical models will be considered, namely mixture models and spatial models. Computational methods will cover Markov chain Monte Carlo, Variational Bayes and Approximate Bayesian Computation.
Participants will have the opportunity to implement these approaches using a number of datasets taken from real case studies, including the analysis of digital images from animals and satellites, and disease mapping for medicine and biosecurity.
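As a taste of the first model class in the abstract above, here is a minimal Gibbs sampler for a two-component normal mixture with unit variances (the simulated data, priors, and starting values are my own illustrative choices, not material from the course): alternate between allocating observations to components, updating the weights, and updating the component means via conjugate normal draws.

```python
import numpy as np

rng = np.random.default_rng(2)

# simulated data: 30% from N(-2, 1), 70% from N(2, 1)
n = 500
z_true = rng.random(n) < 0.3
y = np.where(z_true, rng.normal(-2, 1, n), rng.normal(2, 1, n))

def gibbs_mixture(y, iters=2000, burn=500):
    """Gibbs sampler for a 2-component N(mu_k, 1) mixture,
    with priors mu_k ~ N(0, 10^2) and (w, 1-w) ~ Dirichlet(1, 1)."""
    n = len(y)
    mu = np.array([-1.0, 1.0])              # starting values
    w = np.array([0.5, 0.5])
    mus = []
    for it in range(iters):
        # 1. sample component allocations given (mu, w)
        dens = w * np.exp(-0.5 * (y[:, None] - mu) ** 2)
        p0 = dens[:, 0] / dens.sum(axis=1)
        z = rng.random(n) < p0              # True -> component 0
        n0 = z.sum()
        # 2. update the weight (Beta is the 2-component Dirichlet)
        w0 = rng.beta(1 + n0, 1 + n - n0)
        w = np.array([w0, 1 - w0])
        # 3. conjugate updates of the component means
        for k, mask in enumerate([z, ~z]):
            var = 1.0 / (mask.sum() + 1 / 100)
            mean = var * y[mask].sum()
            mu[k] = rng.normal(mean, np.sqrt(var))
        if it >= burn:
            mus.append(mu.copy())
    return np.array(mus)

draws = gibbs_mixture(y)
```

With components this well separated the chain settles quickly near the true means; in harder settings label switching and multimodality are exactly the computational issues such a course addresses.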
The classes will take place at ENSAE, Paris, on May 3 and 10 (14:00, Amphi 2) and on May 14 and 21 (11:00, Room S8). (The course is open to everyone and free of charge, but registration is requested; please contact Nadine Guedj.)
On Thursday, March 8, Éric Marchand (from Université de Sherbrooke, Québec, where I first heard of MCMC!, and currently visiting Université de Montpellier 2) will give a seminar at CREST. It is scheduled at 2pm at ENSAE (ask the front desk for the room!) and is related to a recent EJS paper with Dominique Fourdrinier, Ali Righi, and Bill Strawderman. Here is the abstract from the paper (sorry, the pictures from Roma are completely unrelated, but I could not resist!):
We consider the problem of predictive density estimation for normal models under Kullback–Leibler loss (KL loss) when the parameter space is constrained to a convex set. More particularly, we assume that X ∼ N_p(μ, v_x I_p) is observed and that we wish to estimate the density of Y ∼ N_p(μ, v_y I_p) under KL loss when μ is restricted to the convex set C ⊂ ℝ^p. We show that the best unrestricted invariant predictive density estimator p̂_U is dominated by the Bayes estimator p̂_{π_C} associated with the uniform prior π_C on C. We also study so-called plug-in estimators, giving conditions under which domination of one estimator of the mean vector μ over another under the usual quadratic loss translates into a domination result for certain corresponding plug-in density estimators under KL loss. Risk comparisons and domination results are also made for comparisons of plug-in estimators and Bayes predictive density estimators. Additionally, minimaxity and domination results are given for the cases where: (i) C is a cone, and (ii) C is a ball.
Dealing with intractability: recent advances in Bayesian Monte-Carlo methods for intractable likelihoods
(joint works with P. Jacob, O. Papaspiliopoulos and S. Barthelmé)
This talk will start with a review of recent advances in Monte Carlo methodology for intractable problems, that is, problems involving intractable quantities, typically intractable likelihoods. I will discuss in turn ABC-type methods (a.k.a. likelihood-free), auxiliary variable methods for dealing with intractable normalising constants (e.g., the exchange algorithm), and MC²-type algorithms, a recent extension of which is the PMCMC algorithm (Andrieu et al., 2010). Then, I will present two recent pieces of work in these directions. First, and more briefly, I’ll present the ABC-EP algorithm (Chopin and Barthelmé, 2011); I’ll also discuss some possible future research in ABC theory. Second, I’ll discuss the SMC² algorithm (Chopin, Jacob and Papaspiliopoulos, 2011), a new type of MC² algorithm that makes it possible to perform sequential analysis for virtually any state-space model, including models with an intractable Markov transition.
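To illustrate the likelihood-free idea reviewed above, here is a toy ABC rejection sampler for a normal mean (the model, prior, summary statistic, and tolerance are all illustrative choices of mine, not from the talk): draw a parameter from the prior, simulate a dataset, and keep the draw only if its simulated summary falls within a tolerance ε of the observed one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setting: y_i ~ N(theta, 1) with prior theta ~ N(0, 10). The likelihood
# is of course tractable here; ABC is used purely for illustration.
n = 100
y_obs = rng.normal(1.5, 1.0, size=n)
s_obs = y_obs.mean()                         # summary statistic

def abc_rejection(n_keep=500, eps=0.1):
    """Keep prior draws whose simulated summary lies within eps of s_obs."""
    kept = []
    while len(kept) < n_keep:
        theta = rng.normal(0.0, np.sqrt(10.0))     # draw from the prior
        s_sim = rng.normal(theta, 1.0, size=n).mean()
        if abs(s_sim - s_obs) < eps:               # accept/reject step
            kept.append(theta)
    return np.array(kept)

post = abc_rejection()
```

The accepted draws target the posterior given s_obs up to the tolerance ε; shrinking ε improves the approximation at the cost of a lower acceptance rate, which is the basic trade-off the more sophisticated methods in the talk are designed to tame.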