Andrew Gelman will be visiting Paris-Dauphine and CREST next academic year (with support from those institutions as well as CNRS and Ville de Paris), which is why he is learning how to pronounce Le loup est revenu. (Maybe not why, as this is not the most useful sentence in downtown Paris…) Very exciting news for all of us local Bayesians (or bayésiens). In addition, Andrew will teach from the latest edition of his book Bayesian Data Analysis, co-authored with John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin. He will actually start teaching in mid-October, which means the book will not be out yet, so the students at Paris-Dauphine and ENSAE will get a true avant-première of Bayesian Data Analysis. Of course, this item of information will be sadistically tantalising to ‘Og’s readers who cannot spend the semester in Paris. For those who can, I presume there is a way to register for the course as auditeur libre at either Paris-Dauphine or ENSAE.
Next month, Michael Jordan will give an advanced course at CREST-ENSAE, Paris, on Recent Advances at the Interface of Computation and Statistics. The course will take place on April 4 (14:00, ENSAE, Room #11), 11 (14:00, ENSAE, Room #11), 15 (11:00, ENSAE, Room #11) and 18 (14:00, ENSAE, Room #11). It is open to everyone and attendance is free. The only constraint is a compulsory registration with Nadine Guedj (email: guedj[AT]ensae.fr) for security reasons. I strongly advise all graduate students who can do so to seize this fantastic opportunity! Here is the abstract of the course:
“I will discuss several recent developments in areas where statistical science meets computational science, with particular concern for bringing statistical inference into contact with distributed computing architectures and with recursive data structures:
How does one obtain confidence intervals in massive data sets? The bootstrap principle suggests resampling data to obtain fluctuations in the values of estimators, and thereby confidence intervals, but this is infeasible computationally with massive data. Subsampling the data yields fluctuations on the wrong scale, which have to be corrected to provide calibrated statistical inferences. I present a new procedure, the “bag of little bootstraps,” which circumvents this problem, inheriting the favorable theoretical properties of the bootstrap but also having a much more favorable computational profile.
The problem of matrix completion has been the focus of much recent work, both theoretical and practical. To take advantage of distributed computing architectures in this setting, it is natural to consider divide-and-conquer algorithms for matrix completion. I show that these work well in practice, but also note that new theoretical problems arise when attempting to characterize the statistical performance of these algorithms. Here the theoretical support is provided by concentration theorems for random matrices, and I present a new approach to matrix concentration based on Stein’s method.
Bayesian nonparametrics involves replacing the “prior distributions” of classical Bayesian analysis with “prior stochastic processes.” Of particular value are the class of “combinatorial stochastic processes,” which make it possible to express uncertainty (and perform inference) over combinatorial objects that are familiar as data structures in computer science.”
References are available on Michael’s homepage.
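For readers curious about the first item of the abstract, here is a minimal sketch of the bag of little bootstraps idea, not the implementation used in the course. It assumes the estimator is the sample mean and the quantity of interest is its standard error; the function name `blb_standard_error`, the subset exponent `gamma=0.6`, and the numbers of subsets and resamples are illustrative choices rather than values taken from the abstract. Each small subset of size b = n^gamma is resampled up to the full size n through multinomial counts, so that the fluctuations are on the right scale while only b distinct points are ever handled at once.

```python
import random
import statistics
from collections import Counter


def blb_standard_error(data, gamma=0.6, n_subsets=5, n_resamples=40, seed=0):
    """Bag-of-little-bootstraps estimate of the standard error of the mean.

    All default values are illustrative assumptions, not recommendations.
    """
    rng = random.Random(seed)
    n = len(data)
    b = max(2, int(n ** gamma))  # subset size b = n^gamma, much smaller than n
    per_subset = []
    for _ in range(n_subsets):
        # Small subsample drawn without replacement from the full data set.
        subset = rng.sample(data, b)
        estimates = []
        for _ in range(n_resamples):
            # A resample of size n from b distinct points is fully described
            # by its multinomial counts, so no size-n data set is ever stored.
            counts = Counter(rng.choices(range(b), k=n))
            estimates.append(sum(subset[i] * c for i, c in counts.items()) / n)
        # Quality assessment (here the standard error) within this subset.
        per_subset.append(statistics.stdev(estimates))
    # Average the per-subset assessments to get the final answer.
    return statistics.fmean(per_subset)


if __name__ == "__main__":
    demo_rng = random.Random(1)
    sample = [demo_rng.gauss(0.0, 1.0) for _ in range(5000)]
    print("BLB standard error:", blb_standard_error(sample))
    print("Textbook value    :", statistics.stdev(sample) / len(sample) ** 0.5)
```

For the mean, the result can be checked against the textbook value s/√n; the appeal of the procedure is for estimators where no such formula exists and where the subsets can be processed on separate machines.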
As mentioned in the latest post on ABC, I am giving a short doctoral course on ABC methods and convergence at CREST next week. I have now made a preliminary collection of my slides (plus a few from Jean-Michel Marin’s), available on slideshare (as ABC in Roma, because I am also giving the course in Roma next month, with an R lab on top of it!).
I also managed to go over the book by Gouriéroux and Monfort on indirect inference over the weekend. I still need to beef up the slides before the course starts next Thursday! (The core version of the slides is actually from the course I gave at Wharton more than a year ago.)
(This post got published on The Statistics Forum yesterday.)
The short book review section of the International Statistical Review sent me Raquel Prado and Mike West’s book, Time Series (Modeling, Computation, and Inference), to review. The current post is not about this specific book, but rather about why I am unsatisfied with the textbooks in this area (and correlatively why I am always reluctant to teach a graduate course on the topic). Again, I stress that the following is not specifically about the book by Raquel Prado and Mike West!
With the notable exception of Brockwell and Davis’ Time Series: Theory and Methods, most time-series books seem to suffer (in my opinion) from the same difficulty, which boils down to being unable to provide the reader with a coherent and logical introduction to the field. (This echoes a complaint made by Håvard Rue a few weeks ago in Zurich.) Instead, time-series books appear to haphazardly pile up notions and techniques, theory and methods, without paying much attention to the coherence of the presentation. That’s how I was introduced to the field (even though it was by a fantastic teacher!) and the feeling has not left me since then. It may be due to the fact that the field stemmed partly from signal processing in engineering and partly from econometrics, but such presentations never achieve a Unitarian front on how to handle time series. In particular, the opposition between the time domain and the frequency domain always escapes me. This is presumably due to my inability to see the relevance of the spectral approach, as harmonic regression simply appears (to me) as a special type of non-linear regression with sinusoidal regressors and with a well-defined likelihood that requires neither Fourier frequencies, nor the periodogram, nor spectral density estimation. Even within the time domain, I find the handling of stationarity by time-series books to be mostly cavalier. Why stationarity is important is never addressed, which leaves the reader with the hard choice between imposing stationarity and not imposing stationarity. (My original feeling was to let the issue be decided by the data, but this is not possible!) Similarly, causality is often invoked as a reason to set constraints on MA coefficients, even though this resorts to a non-mathematical justification, namely preventing dependence on the future. I thus wonder if being a Unitarian (i.e., following a single logical process for analysing time-series data) is at all possible in the time-series world! E.g., in Bayesian Core, we processed AR, MA, and ARMA models from a single perspective, conditioning on the initial values of the series and imposing all the usual constraints on the roots of the lag polynomials, but this choice was far from perfectly justified…
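To make the remark on harmonic regression concrete, here is a minimal sketch under stated assumptions: a single sinusoid with no intercept, y_t = a cos(ωt) + b sin(ωt) + ε_t with Gaussian noise, and an arbitrary grid of candidate frequencies (not the Fourier frequencies). The function names `fit_harmonic` and `profile_frequency` are made up for the illustration. For a fixed ω the model is linear in (a, b), so ordinary least squares applies, and minimising the residual sum of squares over ω amounts to maximising the profile likelihood, with no periodogram involved.

```python
import math
import random


def fit_harmonic(y, omega):
    """Least-squares fit of y_t = a*cos(omega*t) + b*sin(omega*t) + noise.

    omega is held fixed and must lie strictly between 0 and pi, otherwise
    the two regressors are degenerate and the 2x2 system is singular.
    """
    c = [math.cos(omega * t) for t in range(len(y))]
    s = [math.sin(omega * t) for t in range(len(y))]
    cc = sum(ci * ci for ci in c)
    ss = sum(si * si for si in s)
    cs = sum(ci * si for ci, si in zip(c, s))
    cy = sum(ci * yi for ci, yi in zip(c, y))
    sy = sum(si * yi for si, yi in zip(s, y))
    # Solve the 2x2 normal equations by Cramer's rule.
    det = cc * ss - cs * cs
    a = (cy * ss - cs * sy) / det
    b = (cc * sy - cs * cy) / det
    rss = sum((yi - a * ci - b * si) ** 2 for yi, ci, si in zip(y, c, s))
    return a, b, rss


def profile_frequency(y, grid):
    """Return the grid frequency with the smallest residual sum of squares.

    Under Gaussian noise this is the maximiser of the profile likelihood.
    """
    return min(grid, key=lambda omega: fit_harmonic(y, omega)[2])


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated series; the true values (2, 1, 0.7) are arbitrary.
    y = [2.0 * math.cos(0.7 * t) + 1.0 * math.sin(0.7 * t) + rng.gauss(0.0, 0.5)
         for t in range(200)]
    grid = [0.05 * k for k in range(1, 61)]
    omega_hat = profile_frequency(y, grid)
    print(omega_hat, fit_harmonic(y, omega_hat))
```

A grid search is the crudest possible optimiser and is used here only to keep the sketch short; any non-linear least-squares routine could replace it.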
In what seems like an endless cuRse, I found this week that I had to re-grade a dozen R exams that a TA did not grade properly! Plotting the grades I gave (X) against those of my TA (Y) shows little connection between the two gradings… As if this was not enough trouble, I also found exactly duplicated R code in another R project around Introducing Monte Carlo Methods with R that was returned a few weeks ago. Meaning I will have to draft a second-round exam… (As Tom commented on an earlier post, team resolution of a given problem may be a positive attitude, but in the current case one student provided an A⁺⁺ answer, while two others clearly drafted a hasty resolution from the original.) Nonetheless, do not worry, I still love [teaching] R!