Since Stan Ulam is buried in Cimetière du Montparnasse, next to CREST, Andrew and I paid his grave a visit on a sunny July afternoon. Among elaborate funeral constructions, the Aron family tomb is sober and hidden behind funeral houses. It came as a surprise to me to discover that Ulam had links with France to the point of him and his wife being buried in Ulam’s wife family vault. Since we were there, we took a short stroll to see Henri Poincaré’s tomb in the Poincaré-Boutroux vault (missing Henri’s brother, the French president Raymond Poincaré). It came as a surprise that someone had left a folder with the cover of 17 equations that changed the World on top of the tomb). Even though the book covers Poincaré’s work on the three body problem as part of Newton’s formula. There were other mathematicians in this cemetery, but this was enough necrophiliac tourism for one day.
Archive for Paris
My Paris-Dauphine colleague Guillaume Carlier recently arXived a statistics paper entitled Vector quantile regression, co-written with Chernozhukov and Galichon. I was most curious to read the paper as Guillaume is primarily a mathematical analyst working on optimisation problems like optimal transport. And also because I find quantile regression difficult to fathom as a statistical problem. (As it happens, both his co-authors are from econometrics.) The results in the paper are (i) to show that a d-dimensional (Lebesgue) absolutely continuous random variable Y can always be represented as the deterministic transform Y=Q(U), where U is a d-dimensional [0,1] uniform (the paper expresses this transform as conditional on a set of regressors Z, but those essentially play no role) and Q is monotonous in the sense of being the gradient of a convex function,
(ii) to deduce from this representation a unique notion of multivariate quantile function; and (iii) to consider the special case when the quantile function Q can be written as the linear
where β(U) is a matrix. Hence leading to an estimation problem.
While unsurprising from a measure theoretic viewpoint, the representation theorem (i) is most interesting both for statistical and simulation reasons. Provided the function Q can be easily estimated and derived, respectively. The paper however does not provide a constructive tool for this derivation, besides indicating several characterisations as solutions of optimisation problems. From a statistical perspective, a non-parametric estimation of β(.) would have useful implications in multivariate regression, although the paper only considers the specific linear case above. Which solution is obtained by a discretisation of all variables and linear programming.
Today I gave a talk on Bayesian model choice in a fabulous 13th Century former monastery in the Latin Quarter of Paris… It is the Collège des Bernardins, close to Jussieu and Collège de France, unbelievably hidden to the point I was not aware of its existence despite having studied and worked in Jussieu since 1982… I mixed my earlier San Antonio survey on importance sampling approximations to Bayes factors with an entry to our most recent work on ABC with random forests. This was the first talk of the 8th R/Rmetrics workshop taking place in Paris this year. (Rmetrics is aiming at aggregating R packages with econometrics and finance applications.) And I had a full hour and a half to deliver my lecture to the workshop audience. Nice place, nice people, new faces and topics (and even andouille de Vire for lunch!): why should I complain with an alas in the title?!What happened is that the R/Rmetrics meetings have been till this year organised in Meielisalp, Switzerland. Which stands on top of Thuner See and… just next to the most famous peaks of the Bernese Alps! And that I had been invited last year but could not make it… Meaning I lost a genuine opportunity to climb one of my five dream routes, the Mittelegi ridge of the Eiger. As the future R/Rmetrics meetings will not take place there.
A lunch discussion at the workshop led me to experiment the compiler library in R, library that I was unaware of. The impact on the running time is obvious: recycling the fowler function from the last Le Monde puzzle,
> bowler=cmpfun(fowler) > N=20;n=10;system.time(fowler(pred=N)) user system elapsed 52.647 0.076 56.332 > N=20;n=10;system.time(bowler(pred=N)) user system elapsed 51.631 0.004 51.768 > N=20;n=15;system.time(bowler(pred=N)) user system elapsed 51.924 0.024 52.429 > N=20;n=15;system.time(fowler(pred=N)) user system elapsed 52.919 0.200 61.960
shows a ten- to twenty-fold gain in system time, if not in elapsed time (re-alas!).
A similar flier, a few days later. With very precise (if incoherent) guarantees! And a fantastic use of capitals. Too bad Monsieur Cardoso could not predict the (occurrence of a) missing noun in the last sentence…