Archive for Université Paris-Sud

talk in Orsay (message in a beetle)

Posted in Kids, Statistics, Travel, University life on April 5, 2014 by xi'an

Yesterday (March 27), I gave a seminar at Paris-Sud University, Orsay, in the stats department, on ABC model choice. It was an opportunity to talk about recent advances we have made with Jean-Michel Marin and Pierre Pudlo on using machine-learning devices to improve ABC. (More to come soon!) And to chat with Gilles Celeux about machine learning and classification. Actually, given that one of my examples was about the Asian lady beetle invasion and that the buildings of the Paris-Sud University have suffered from this invasion, I should have advertised the talk with the catchier title of “message in a beetle”…

This seminar was also an opportunity to experiment with mixed transportation. Indeed, since I had some errands to run in Paris in the morning, I decided to bike there (in Paris), work at CREST, and then take my bike on the RER train down to Orsay, as I did not have the time and leisure to bike the whole 20k there. Since it was the middle of the day, the carriage was mostly empty and I managed to type a blog entry without having to worry about the bike being a nuisance… The only drag was getting onto the platform in Paris (Cité Universitaire), as there was no clear access for bikes. Fortunately, a student kindly helped me to get over the gate with my bike, as I could not manage on my own… Nonetheless, I will certainly repeat the experience on my next trip to Orsay (but would not dare take the bike inside/under Paris per se because of the (over-)crowded carriages there).

R finals

Posted in R, Statistics, University life on January 31, 2013 by xi'an

On the morning I returned from Varanasi and the ISBA meeting there, I had to give my R final exam (along with three of my colleagues in Paris-Dauphine). This year, the R course was completely in English, exam included, which means I can post it here as it may attract more interest than the French exams of past years…

I just completed grading my 32 copies, all from exam A, which takes a while as I have to check (and sometimes recover) the R code, and often to correct the obvious mistakes to see if the deeper understanding of the concepts is there. This year’s student cohort is surprisingly homogeneous: I did not spot any of the horrors I may have mentioned in previous posts.

I must alas acknowledge a grievous typo in the version of Exam B that was used on the day of the final: cutting-and-pasting from A to B, I forgot to change the parameters in Exercise 2, asking them to simulate a Gamma(0,1). It was only after half an hour that a bright student pointed out the impossibility… We had tested the exams prior to printing them but this somehow escaped the four of us!

Now, as I was entering my grades into the global spreadsheet, I noticed a perfect… lack of correlation between those and the grades at the midterm exam. I wonder what that means: I could be grading at random, the levels in November and in January could be uncorrelated, some students could have cheated in November and others in January, students’ names or file names got mixed up, …? A rather surprising outcome!

grades of some of my students at the midterm and finals R exams

AMOR at 5000ft in a water tank…

Posted in Mountains, pictures, Statistics, University life on November 22, 2012 by xi'an

On Monday, I attended the thesis defence of Rémi Bardenet in Orsay as a member (referee) of his thesis committee. While this was a thesis in computer science, which took place in the Linear Accelerator Lab in Orsay, it was clearly rooted in computational statistics, hence justifying my presence on the committee. The justification (!) for the splashy headline of this post is that Rémi’s work was motivated by the Pierre Auger experiment on ultra-high-energy cosmic rays, where particles are detected through a network of 1600 water tanks spread over the Argentinian Pampa Amarilla on an area the size of Rhode Island (where I am incidentally going next week).

The part of Rémi’s thesis presented during the defence concentrated on his AMOR algorithm, arXived in a paper written with Olivier Cappé and Gersende Fort. AMOR stands for adaptive Metropolis online relabelling and combines adaptive MCMC techniques with relabelling strategies to fight label switching (e.g., in mixtures). I have been interested in mixtures for eons (starting in 1987 in Ottawa with applying Titterington, Smith, and Makov to chest radiographs) and in label switching for ages (starting at the COMPSTAT conference in Bristol in 1998). Rémi’s approach to the label switching problem follows the relabelling path, namely a projection of the original parameter space into a smaller subspace (that is also a quotient space) to avoid permutation invariance and lack of identifiability. (In the survey I wrote with Kate Lee, Jean-Michel Marin and Kerrie Mengersen, we suggest using the mode as a pivot to determine which permutation to use on the components of the mixture.) The paper suggests using a Euclidean distance to a mean determined adaptively, μ_t, with a quadratic form Σ_t also determined on-the-go, minimising (Pθ-μ_t)^T Σ_t (Pθ-μ_t) over all permutations P at each step of the algorithm. The intuition behind the method is that the posterior over the restricted space should look like a roughly elliptically symmetric distribution, or at least like a unimodal distribution, rather than borrowing bits and pieces from different modes.

While I appreciate the technical tour de force represented by the proof of convergence of the AMOR algorithm, I remain somewhat sceptical about the approach and voiced the following objections during the defence. First, the assumption that the posterior becomes unimodal under an appropriate restriction is not necessarily realistic. Secondary modes often pop up with real data (as in the counter-example we used in our paper with Alessandra Iacobucci and Jean-Michel Marin).
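The relabelling step described above, picking the permutation P that minimises the quadratic form, can be sketched in a few lines of Python. This is only a toy illustration under simplifying assumptions, not the AMOR algorithm itself: each component carries a single scalar parameter, μ and Σ are held fixed instead of being adapted along the chain, and the names `quadratic_form` and `relabel` are made up for the example.

```python
from itertools import permutations


def quadratic_form(diff, sigma):
    """Return diff^T sigma diff for a vector and a square matrix given as lists."""
    n = len(diff)
    return sum(diff[i] * sigma[i][j] * diff[j] for i in range(n) for j in range(n))


def relabel(theta, mu, sigma):
    """Return the permutation of theta that minimises the quadratic form around mu.

    theta, mu: lists of k scalar component parameters (e.g. mixture means).
    sigma: k x k matrix defining the quadratic form (assumed positive definite).
    Brute force over all k! permutations, hence only sensible for small k.
    """
    best = min(
        permutations(theta),
        key=lambda p: quadratic_form([x - m for x, m in zip(p, mu)], sigma),
    )
    return list(best)


# Toy usage: two components whose labels have switched relative to mu.
mu = [0.0, 5.0]
sigma = [[1.0, 0.0], [0.0, 1.0]]
relabelled = relabel([4.8, 0.3], mu, sigma)
print(relabelled)
```

In the toy call, the draw (4.8, 0.3) is swapped to (0.3, 4.8) because that ordering is the one closest to μ. In an adaptive scheme, μ and Σ would be updated from the relabelled draws after each step, a part deliberately left out of this sketch.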
Next, the whole apparatus for fighting multiple modes and non-identifiability, i.e. fighting label switching, serves only to fall back on posterior means as Bayes estimators. As stressed in our JASA paper with Gilles Celeux and Merrilee Hurn, there is no reason for doing so and there are several reasons for not doing so:

  • it breaks down under model misspecification, i.e., when the number of components is not correct
  • it does not improve the speed of convergence but, on the contrary, restricts the space visited by the Markov chain
  • it may fall victim to the fatal attraction of secondary modes by fitting too small an ellipse around one of those modes
  • it ultimately depends on the parameterisation of the model
  • there is no reason for using posterior means in mixture problems: posterior modes or cluster centres can be used instead

I am therefore much more in favour of producing a posterior sample that switches labels as much as possible (since the true posterior is completely symmetric in this respect). Post-processing the resulting sample can be done by using off-the-shelf clustering in the component space, derived from the point process representation used by Matthew Stephens in his thesis and subsequent papers. It also allows for a direct estimation of the number of components.
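As a rough illustration of this kind of post-processing, the sketch below pools the component values of all draws while ignoring their labels, then clusters the pooled points. Everything in it is an assumption made for the example: the draws are invented toy numbers, the components are reduced to a single scalar, the number of clusters is fixed at two instead of being estimated, and a hand-written one-dimensional k-means (`kmeans_1d`, a hypothetical helper) stands in for whatever off-the-shelf clustering one would actually use.

```python
def kmeans_1d(points, centres, n_iter=20):
    """Plain Lloyd iterations in one dimension, starting from the given centres."""
    centres = list(centres)
    for _ in range(n_iter):
        clusters = [[] for _ in centres]
        for x in points:
            nearest = min(range(len(centres)), key=lambda j: abs(x - centres[j]))
            clusters[nearest].append(x)
        # Keep the previous centre if a cluster ends up empty.
        centres = [sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)]
    return centres


# Invented MCMC output for a two-component mixture: each draw holds the two
# component means, with labels switching freely between iterations.
draws = [(0.1, 5.2), (4.9, -0.2), (0.3, 5.0), (5.1, 0.0), (-0.1, 4.8)]

# Pool all component values, ignoring labels (the point-process view).
pooled = [value for draw in draws for value in draw]

# Cluster the pooled values; the centres act as label-free component estimates.
centres = sorted(kmeans_1d(pooled, centres=[min(pooled), max(pooled)]))
print(centres)
```

The cluster centres play the role of component estimates that do not depend on any labelling of the draws.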

In any case, this was a defence worth attending that led me to think afresh about the label switching problem, with directions worth exploring next month while Kate Lee is visiting from Auckland. Rémi Bardenet is now headed for a postdoc in Oxford, a perfect location to discuss label switching further and to engage in new computational statistics research!

from my office

Posted in pictures, University life on December 24, 2011 by xi'an

A seminar I will sadly miss

Posted in Books, Statistics, Travel, University life on January 4, 2011 by xi'an

Next Friday, January 7, Jean-Claude Saut will give a seminar in Orsay (bâtiment 425, salle 113, 13:30) on “Autour de ‘A Treatise on Probability’ de John Maynard Keynes (II)”. I would have liked very much to be there and hear about a mathematician’s views on this book, which are most likely orthogonal to mine… For those in Paris this week there is also a Big’MC seminar on Thursday (IHP, 3pm) with Jean-Louis Foulley on evidence computation (in connection with Ando’s book) and Gilles Celeux (on latent blocks).
