Archive for Statistics

Statistics versus Data Science [or not]

Posted in Books, Kids, Statistics, University life on October 13, 2017 by xi'an

Last week a colleague from Warwick forwarded us a short argument by Donald Macnaughton (a “Toronto-based statistician”) for switching the name of our field from Statistics to Data Science. This is not the first time I have heard of this proposal and this is not the first time I express my strong disagreement with it! Here are the naughtonian arguments:

  1. Statistics is (at least in the English language) endowed with several meanings, from the compilation of numbers out of a series of observations, to the field, to the procedures proposed by the field. This is argued to be confusing for laypeople. And missing the connection with data at the core of our field. As well as the indication that statistics gathers information from the data. Data science seems to convey both ideas… But it is equally vague in that most scientific fields, if not all, rely on data and observations and the structured exploitation of such data. Actually a lot of so-called “data-scientists” have specialised in the analysis of data from their original field, without voluntarily embarking upon a career of data-scientist. And not necessarily acquiring the proper tools for incorporating uncertainty quantification (aka statistics!).
  2. Statistics sounds old-fashioned and “old-guard” and “inward-looking” and unattractive to young talents, while they flock to Data Science programs. Which is true [that they flock] but does not mean we [as a field] must flock there as well. In five or ten years, who can tell whether this attraction of data science(s) will still be that strong? We already had to switch the names of our Master programmes to Data Science or the like; this is surely more than enough.
  3. Data science is encompassing other areas of science, like computer science and operations research, but this is not an issue both in terms of potential collaborations and gaining the upper ground as a “key part” in the field. Which is more wishful thinking than a certainty, given the existing difficulties in being recognised as a major actor in data analysis. (As for instance in a recent grant evaluation in “Big Data” where the evaluation committee involved no statistician. And where we got rejected.)

peer community in evolutionary biology

Posted in Statistics on May 18, 2017 by xi'an

My friends (and co-authors) from Montpellier pointed out the existence of PCI Evolutionary Biology, which is a preprint and postprint validation forum [so far only] in the field of Evolutionary Biology. Authors of a preprint or of a published paper request a recommendation from the forum. If someone from the board finds the paper of interest, this person initiates a quick refereeing process with one or two referees and returns a review to the authors, with possible requests for modification. If the lead reviewer is happy with the new version, the link to the paper and the reviews are published on PCI Evol Biol, which thus gives a stamp of validation to the contents of the paper. The paper can then be submitted for publication in any journal, as can be seen from the papers in the list.

This sounds like a great initiative and since PCI is calling for little brothers and sisters to PCI Evol Biol, I think we should try to build its equivalent in Statistics or maybe just Computational Statistics.

Shadows of Self [book review]

Posted in Books, Kids, Statistics, Travel on April 8, 2017 by xi'an

“He’d always found it odd that so many died when they were old, as logic said that was the point in their lives when they’d the most practice not dying.”

Now this is steampunk fantasy, definitely! With little novelty in the setting of the universe. If mixed with a Wild West feeling, though, just like the Half-Made World.

“Mirabell had been a statistician and psychologist in the third century who had studied why some people worked harder than others.”

Actually, this is the same universe as The Mistborn trilogy, but 300 years later, which allows for some self-referential jokes and satire. Including the notion that the current ruling class could be exactly what the heroes of The Mistborn had fought against!

“Not guns,” Wayne said with a grin. “A different kind of weapon. Math.”

More precisely, this is the (a?) sequel to the Alloy of Law, which I had almost completely forgotten, unlike The Mistborn trilogy, which does not help with the reading as the book refers rather insistently to this Alloy of Law!

“Sir, you said you hired me in part because of my ability to read statistics.”

Nonetheless, it is an interesting plot, with a very nice ambiguity of the main characters, who (again) often feel they may be closer to the dictatorship that set off The Mistborn revolution than to the revolutionaries themselves! And one of the heroes is a statistician (as is obvious from the many quotes around!).

“Wayne felt a disturbance stir within him, like his stomach discovering he’d just fed it a bunch of rotten apples. Religion worried him. It could ask men to do things they’d otherwise never do.”

In short, good story, nice style, entertaining dialogues: perfect [mind-candy] travel novel!

weapons of math destruction [book review]

Posted in Books, Kids, pictures, Statistics, University life on December 15, 2016 by xi'an

As I had read many comments and reviews about this book, including one by Arthur Charpentier, on Freakonometrics, I eventually decided to buy it from my Amazon Associate savings (!). With a strong a priori bias, I am afraid, gathered from reading some excerpts, comments, and the overall advertising about it. And also because the book reminded me of another quantic swan. Not to mention the title. After reading it, I am afraid I cannot say my assessment has changed much.

“Models are opinions embedded in mathematics.” (p.21)

The core message of this book is that the use of algorithms and AI methods to evaluate and rank people is unsatisfactory and unfair. From predicting recidivism to firing high school teachers, from rejecting loan applications to enticing the most challenged categories to enlist for for-profit colleges. Which is indeed unsatisfactory and unfair. Just like using the h index and citation ranking for promotion or hiring. (The book mentions the controversial hiring of many adjunct faculty by KAU to boost its ranking.) But this conclusion is not enough of an argument to write a whole book. Or even to blame mathematics for the unfairness: as far as I can tell, mathematics has nothing to do with unfairness. Some analysts crunch numbers, produce a score, and then managers make poor decisions. The use of mathematics throughout the book is thus completely inappropriate, when the author means statistics, machine learning, data mining, predictive algorithms, neural networks, etc. (OK, there is a small section on Operations Research on p.127, but I figure deep learning can bypass the maths.)

post-doctoral position in Paris

Posted in Statistics, Travel, University life on October 14, 2016 by xi'an

The Fondation Sciences Mathématiques de Paris (FSMP) is launching a call for postdoctoral positions in mathematics (incl. statistics!) and in fundamental computer science in the main laboratories of Paris universities for the academic year 2017-2018. The call for applications is open until December 1st 2016, 11:59 (pm), Paris time.

position opening at ENSAE ParisTech

Posted in Kids, Statistics, Travel, University life on March 28, 2016 by xi'an

There is an opening for an associate or full professor position in Statistics and Machine Learning at ENSAE, Paris (soon to move to the Paris-Saclay campus, next to École Polytechnique). The details are provided here. The deadline is April 18, 2016, for a hiring in September or October 2016.

plenty of new arXivals!

Posted in Statistics, University life on October 2, 2014 by xi'an

Here are some entries of potential interest that I spotted in the past days, on which I will not have enough time to comment:

  • arXiv:1410.0163: Instrumental Variables: An Econometrician’s Perspective by Guido Imbens
  • arXiv:1410.0123: Deep Tempering by Guillaume Desjardins, Heng Luo, Aaron Courville, Yoshua Bengio
  • arXiv:1410.0255: Variance reduction for irreversible Langevin samplers and diffusion on graphs by Luc Rey-Bellet, Konstantinos Spiliopoulos
  • arXiv:1409.8502: Combining Particle MCMC with Rao-Blackwellized Monte Carlo Data Association for Parameter Estimation in Multiple Target Tracking by Juho Kokkala, Simo Särkkä
  • arXiv:1409.8185: Adaptive Low-Complexity Sequential Inference for Dirichlet Process Mixture Models by Theodoros Tsiligkaridis, Keith W. Forsythe
  • arXiv:1409.7986: Hypothesis testing for Markov chain Monte Carlo by Benjamin M. Gyori, Daniel Paulin
  • arXiv:1409.7672: Order-invariant prior specification in Bayesian factor analysis by Dennis Leung, Mathias Drton
  • arXiv:1409.7458: Beyond Maximum Likelihood: from Theory to Practice by Jiantao Jiao, Kartik Venkat, Yanjun Han, Tsachy Weissman
  • arXiv:1409.7419: Identifying the number of clusters in discrete mixture models by Cláudia Silvestre, Margarida G. M. S. Cardoso, Mário A. T. Figueiredo
  • arXiv:1409.7287: Identification of jump Markov linear models using particle filters by Andreas Svensson, Thomas B. Schön, Fredrik Lindsten
  • arXiv:1409.7074: Variational Pseudolikelihood for Regularized Ising Inference by Charles K. Fisher