This morning, we had a jam session at the maths department of Paris-Dauphine where a few researchers & colleagues of mine presented their field of research to the whole department. Very interesting despite or thanks to the variety of topics, with forays into the three-body problem(s) [and Poincaré‘s mistake], mean fields for Nash equilibrium (or how to exit a movie theatre), approximate losses in machine learning and so on. Somehow, there was some unity as well through randomness, convexity and optimal transport. One talk close to my own interests was obviously the study of simulation within convex sets by Joseph Lehec from Paris-Dauphine [and Sébastien Bubeck & Ronen Eldan] as they had established a total variation convergence result at a speed only increasing polynomially with the dimension. The underlying simulation algorithm is rather theoretical in that it involves random walk (or Langevin corrected) moves where any excursion outside the convex support is replaced with its projection on the set. Projection that may prove pretty expensive to compute if the convex set is defined for instance as the intersection of many hyperplanes. So I do not readily see how the scheme can be recycled into a competitor to a Metropolis-Hastings solution in that the resulting chain hits the boundary from time to time. With the same frequency over iterations. A solution is to instead use Metropolis-Hastings of course, while another one is to bounce on the boundary and then correct by Metropolis-Hastings… The optimal scales in the three different cases are quite different, from √d in the Metropolis-Hastings cases to d√d in the projection case. (I did not follow the bouncing option to the end, as it lacks a normalising constant.) Here is a quick and not particularly helpful comparison of the exploration patterns of both approaches in dimension 50 for the unit sphere and respective scales of 10/d√d [blue] and 1/√d [gold].
Archive for Université Paris Dauphine
[I first thought this title was highly original but a google search showed me wrong…] This short post to point out to the new blog started by Ingmar Schuster on computational statistics and linguistics. Which, so far, keeps strictly to the discussion of recent research papers (rather than ratiocinating about all kinds of tangential topics like a certain ‘Og…) Some of which we may discuss in parallel. And some not. So keep posted! Ingmar came to Paris-Dauphine for a doctoral visit last Winter and is back as a postdoc (supported by the Fondation des Sciences Mathématiques de Paris) since last Fall. Working with me and Nicolas, among others.
Phew! I just finished my enormous pile of homeworks for the computational statistics course… This massive pile is due to an unexpected number of students registering for the Data Science Master at ENSAE and Paris-Dauphine. As I was not aware of this surge, I kept to my practice of asking students to hand back solved exercises from Monte Carlo Statistical Methods at the beginning of each class. And could not change the rules of the game once the course had started! Next year, I’ll make sure to get some backup for grading those exercises. Or go for group projects instead…
This morning, Clara Grazian and I arXived a paper about Jeffreys priors for mixtures. This is a part of Clara’s PhD dissertation between Roma and Paris, on which she has worked for the past year. Jeffreys priors cannot be computed analytically for mixtures, which is such a drag that it led us to devise the delayed acceptance algorithm. However, the main message from this detailed study of Jeffreys priors is that they mostly do not work for Gaussian mixture models, in that the posterior is almost invariably improper! This is a definite death knell for Jeffreys priors in this setting, meaning that alternative reference priors, like the one we advocated with Kerrie Mengersen and Mike Titterington, or the similar solution in Roeder and Wasserman, have to be used. [Disclaimer: the title has little to do with the paper, except that posterior means are off for mixtures…]