Archive for Argentina

Nature tidbits

Posted in Books, Statistics, University life on September 18, 2018 by xi'an

In the Nature issue of July 19 that I read on the plane to Singapore, there were a whole lot of interesting entries, from various calls expressing deep concern about the anti-scientific stance of the Trump administration, like cutting funds for environmental regulation and restricting freedom of communication (ETA) or naming a non-scientist at the head of NASA and other agencies, or again restricting the protection of species, to the testimony of an Argentinian biologist in front of a congressional committee about the legalisation of abortion (which failed at the level of the Argentinian senate later this month), to a DNA-like version of a neural network, to Louis Chen from NUS being mentioned in a career article about the importance of planning one’s retirement well in advance, in order to preserve academic links and manage a new position or even a new career. Which is what happened to Louis, as he stayed on as head of the Institute for Mathematical Sciences at NUS after the mandatory retirement age and is now emeritus and still engaged in research. (The article made me wonder, however, how the cases therein had been selected.) It is actually most revealing to see how different countries approach the question of the retirement of academics: in France, for instance, one is essentially forced to retire and, while there exist emeritus positions, it is extremely difficult to find funding.

“Louis Chen was technically meant to retire in 2005. The mathematician at the National University of Singapore was turning 65, the university’s official retirement age. But he was only five years into his tenure as director of the university’s new Institute for Mathematical Sciences, and the university wanted him to stay on. So he remained for seven more years, stepping down in 2012. Over the next 18 months, he travelled and had knee surgery, before returning in summer 2014 to teach graduate courses for a year.”

And [yet] another piece on the biases of AIs. It echoes earlier papers discussed here, one obvious reason for these biases being that the learning corpus is not representative of the whole population. Maybe survey sampling should become compulsory in machine learning training degrees. And yet another piece on why protectionism is (also) bad for the environment.

a lizard at the Iguazú Falls [guest jatp]

Posted in Kids, pictures, Travel on January 10, 2017 by xi'an

[picture: lizard]

Statistics may be harmful to your freedom

Posted in Statistics on January 29, 2013 by xi'an

On Wednesday, I was reading the freshly delivered Significance and especially the several papers therein about statisticians being indicted, fired, or otherwise sued for doing statistics. I mentioned a while ago the possible interpretations of the L’Aquila verdict (where I do not know whether any of the six scientists is a statistician), but did not know about Graciela Bevacqua’s hardship in the Argentinian National Statistics Institute, nor about David Nutt being sacked from the Advisory Council on the Misuse of Drugs, nor about Peter Wilmshurst being sued by NMT (a US medical device corporation) for expressing concern about a clinical trial they conducted. What is most frightening in those stories is that those persons ended up facing those hardships without any support from their respective institutions (quite the opposite in two cases!). And then, on the way home, I further read that the former head of the Greek National Statistics Institute (Elstat) was fired and indicted for over-estimating the Greek deficit, after resisting official pressure to lower it… Tough job!

AMOR at 5000ft in a water tank…

Posted in Mountains, pictures, Statistics, University life on November 22, 2012 by xi'an

On Monday, I attended the thesis defence of Rémi Bardenet in Orsay as a member (referee) of his thesis committee. While this was a thesis in computer science, which took place in the Linear Accelerator Lab in Orsay, it was clearly rooted in computational statistics, hence justifying my presence on the committee. The justification (!) for the splashy headline of this post is that Rémi’s work was motivated by the Pierre-Auger experiment on ultra-high-energy cosmic rays, where particles are detected through a network of 1600 water tanks spread over the Argentinian Pampa Amarilla on an area the size of Rhode Island (where I am incidentally going next week).

The part of Rémi’s thesis presented during the defence concentrated on his AMOR algorithm, arXived in a paper written with Olivier Cappé and Gersende Fort. AMOR stands for adaptive Metropolis online relabelling and combines adaptive MCMC techniques with relabelling strategies to fight label-switching (e.g., in mixtures). I have been interested in mixtures for eons (starting in 1987 in Ottawa with applying Titterington, Smith, and Makov to chest radiographs) and in label switching for ages (starting at the COMPSTAT conference in Bristol in 1998). Rémi’s approach to the label switching problem follows the relabelling path, namely a projection of the original parameter space into a smaller subspace (that is also a quotient space) to avoid permutation invariance and lack of identifiability. (In the survey I wrote with Kate Lee, Jean-Michel Marin and Kerrie Mengersen, we suggest using the mode as a pivot to determine which permutation to use on the components of the mixture.) The paper suggests using a Euclidean distance to a mean determined adaptively, $\mu_t$, with a quadratic form $\Sigma_t$ also determined on-the-go, minimising $(P\theta-\mu_t)^T\Sigma_t(P\theta-\mu_t)$ over all permutations $P$ at each step of the algorithm. The intuition behind the method is that the posterior over the restricted space should look like a roughly elliptically symmetric distribution, or at least like a unimodal distribution, rather than borrowing bits and pieces from different modes. While I appreciate the technical tour de force represented by the proof of convergence of the AMOR algorithm, I remain somewhat sceptical about the approach and voiced the following objections during the defence: first, the assumption that the posterior becomes unimodal under an appropriate restriction is not necessarily realistic. Secondary modes often pop up with real data (as in the counter-example we used in our paper with Alessandra Iacobucci and Jean-Michel Marin).
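To make the relabelling step concrete, here is a minimal sketch of the minimisation over permutations described above. It is not the AMOR implementation: it assumes each mixture component is summarised by a single scalar (say, its mean), it takes the current mean and quadratic form as given rather than updating them adaptively, and the function names `quadratic_form` and `relabel` are made up for illustration.

```python
from itertools import permutations


def quadratic_form(x, mu, sigma):
    """Return (x - mu)^T sigma (x - mu) for plain Python lists."""
    d = [xi - mi for xi, mi in zip(x, mu)]
    n = len(d)
    return sum(d[i] * sigma[i][j] * d[j] for i in range(n) for j in range(n))


def relabel(theta, mu, sigma):
    """Pick the permutation of theta's components minimising the quadratic form.

    theta: list of k scalar component parameters (a simplifying assumption).
    mu, sigma: current centre and quadratic form, taken as fixed here.
    Returns the permuted parameter and the permutation (as a tuple of indices).
    """
    best = min(
        permutations(range(len(theta))),
        key=lambda p: quadratic_form([theta[i] for i in p], mu, sigma),
    )
    return [theta[i] for i in best], best


# Toy values, chosen only for illustration.
theta = [2.9, -0.1, 1.2]
mu = [0.0, 1.0, 3.0]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
relabelled, perm = relabel(theta, mu, identity)
```

The exhaustive search visits all k! permutations, which is only reasonable for a small number of components.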
Next, the whole point of the apparatus of fighting multiple modes and non-identifiability, i.e., fighting label switching, is to fall back on posterior means as Bayes estimators. As stressed in our JASA paper with Gilles Celeux and Merrilee Hurn, there is no reason for doing so and there are several reasons for not doing so:

  • it breaks down under model misspecification, i.e., when the number of components is not correct
  • it does not improve the speed of convergence but, on the contrary, restricts the space visited by the Markov chain
  • it may fall victim to the fatal attraction of secondary modes by fitting too small an ellipse around one of those modes
  • it ultimately depends on the parameterisation of the model
  • there is no reason for using posterior means in mixture problems, since posterior modes or cluster centres can be used instead

I am therefore much more in favour of producing a posterior sample that is as label switching as possible (since the true posterior is completely symmetric in this respect). Post-processing the resulting sample can be done by using off-the-shelf clustering in the component space, derived from the point process representation used by Matthew Stephens in his thesis and subsequent papers. It also allows for a direct estimation of the number of components.
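As a rough illustration of this post-processing idea, the sketch below pools the component parameters of all MCMC draws, forgetting their labels, and clusters them in the component space. Everything in it is an assumption made for the example: the draws are toy numbers rather than MCMC output, each component is reduced to a scalar mean, plain one-dimensional k-means stands in for whatever off-the-shelf clustering one prefers, and the number of clusters is fixed rather than estimated.

```python
def kmeans_1d(points, k, n_iter=50):
    """Plain Lloyd iterations in one dimension; returns sorted cluster centres."""
    pts = sorted(points)
    # Deterministic start: spread the initial centres over the sorted points.
    centres = [pts[(2 * j + 1) * len(pts) // (2 * k)] for j in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for x in pts:
            nearest = min(range(k), key=lambda j: abs(x - centres[j]))
            groups[nearest].append(x)
        # Keep the old centre if a cluster ends up empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return sorted(centres)


# Toy "label switching" sample: each draw holds two component means,
# stored in an arbitrary order from one draw to the next.
draws = [[0.1, 3.0], [2.9, -0.1], [0.0, 3.1], [3.2, 0.2], [-0.2, 2.8]]

# Point-process view: forget the labels and pool all component parameters.
pooled = [m for draw in draws for m in draw]

# Cluster centres then serve as estimates of the component means.
centres = kmeans_1d(pooled, k=2)
```

With well-separated components, the centres are insensitive to the order in which each draw stores its components, so no relabelling of the chain is needed.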

In any case, this was a defence worth attending that led me to think afresh about the label switching problem, with directions worth exploring next month while Kate Lee is visiting from Auckland. Rémi Bardenet is now headed for a postdoc in Oxford, a perfect location to discuss label switching further and to engage in new computational statistics research!