Archive for Nature

statistics snapshots from Nature

Posted in Books, Kids, pictures, Statistics, University life on November 6, 2016 by xi'an

Two snapshots from the October 27 issue of Nature, one reporting on David Cox receiving the first “Nobel in Statistics” Prize and one about the @ScientistTrump parody site, where being Bayesian sounds like a slur..!

Nature highlights

Posted in Books, Kids, pictures, Statistics on November 1, 2016 by xi'an

A mostly genetics issue of Nature this week (of October 13), as the journal contains an article on the genomes of 300 individuals from 142 diverse populations across the globe, another one on the genetic history of Aboriginal Australians, plus a third one on 483 individuals from 125 populations drawing genetic space barriers, leading to diverging opinions on the single versus multiple out-of-Africa scenario. As some of these papers are based on likelihood-based techniques, I wish I had more time to explore the statistics behind them. Another paper builds a phylogeny of violence in mammals, rising as one nears the primates. I find the paper most interesting but I am not convinced by the genetic explanation of violence, in particular because it seems hard to believe that data about Palaeolithic, Mesolithic, and Neolithic periods can be that informative about the death rate due to intra-species violence. And to conclude on a “pessimistic” note, the paper that argues there is a maximum lifespan for humans, meaning that the 122 years enjoyed (?) by Jeanne Calment from France may remain a limit. However, the argument seems to be that the observed largest, second largest, etc., ages at death reached a peak in 1997, the year Jeanne Calment died, and have been declining since then. That does not sound super-convincing when considering extreme value theory, since 1997 is the extreme event and thus another extreme event of a similar magnitude is not going to happen immediately after.
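To make the extreme value remark a bit more concrete, here is a minimal sketch under a deliberately crude assumption: yearly maximal ages at death are treated as i.i.d. draws from a continuous distribution, with no trend and no ceiling. In that case the chance that the record over the first m years is beaten within the next k years is exactly k/(m+k), whatever the distribution. The values m = 30 and k = 19 (1997 to 2016) are illustrative choices of mine, not figures from the Nature paper, and this is not the analysis conducted in that paper.

```python
import random


def prob_record_broken(m, k, n_sims=20000, seed=0):
    """Monte Carlo estimate of P(max of next k values > max of first m values)
    for an i.i.d. continuous sequence (uniform draws suffice, since the
    probability does not depend on the distribution)."""
    rng = random.Random(seed)
    broken = 0
    for _ in range(n_sims):
        past_max = max(rng.random() for _ in range(m))
        future_max = max(rng.random() for _ in range(k))
        if future_max > past_max:
            broken += 1
    return broken / n_sims


if __name__ == "__main__":
    m, k = 30, 19  # hypothetical: 30 years of records up to 1997, 19 years since
    print("simulated:", prob_record_broken(m, k))
    print("exact k/(m+k):", k / (m + k))
```

Under these assumptions the record survives the following 19 years with probability 30/49, about 0.61, so a post-1997 stretch without a new record is not by itself evidence of a hard limit.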

machines learning but not teaching…

Posted in Books, pictures on October 28, 2016 by xi'an

A few weeks after the editorial “Algorithm and blues”, Nature offers another (general public) entry on AIs and their impact on society, entitled “The Black Box of AI”. The call is less on open source AIs and more on accountability, namely the fact that decisions produced by AIs and impacting people one way or another should be accountable, rather than excused by the way out “the computer said so”. What the article exposes is how (close to) impossible this is when the algorithms are based on black-box structures like neural networks and other deep-learning algorithms. While optimised to predict as accurately as possible one outcome given a vector of inputs, hence learning in that way how the inputs impact this output [in the same range of values], these methods do not learn in a more profound way in that they very rarely explain why the output occurs given the inputs. Hence, given a neural network that predicts go moves or operates a self-driving car, there is a priori no knowledge to be gathered from this network about the general rules of how humans play go or drive cars. This rather obvious feature means that algorithms that determine the severity of a sentence cannot be argued to be rational and hence should not be used per se (or that the judicial system exploiting them should be sued). The article is not particularly deep (learning), but it mentions a few machine-learning players like Pierre Baldi, Zoubin Ghahramani and Stéphane Mallat, who comments on the distance existing between those networks and true (and transparent) explanations. And on the fact that the human brain itself goes mostly unexplained. [I did not know I could include such dynamic images on WordPress!]

To predict and serve?

Posted in Books, pictures, Statistics on October 25, 2016 by xi'an

Kristian Lum and William Isaac published a paper in Significance last week [with the above title] about predictive policing systems used in the USA and presumably in other countries to predict future crimes [and therefore prevent them]. This sounds like a good idea for a science fiction plot, à la Philip K Dick [in his short story, The Minority Report], but that it is used in real life definitely sounds frightening, especially when the civil rights of the targeted individuals are impacted. (Although some politicians in different democratic countries show increasing contempt for keeping everyone’s rights equal…) I also feel terrified by the social determinism behind the very concept of predicting crime from socio-economic data (and possibly genetic characteristics in the near future, bringing us back to the dark days of physiognomy!)

“…crimes that occur in locations frequented by police are more likely to appear in the database simply because that is where the police are patrolling.”

Kristian and William examine in this paper one statistical aspect of the police forces relying on crime prediction software, namely the bias in the data exploited by the software and in the resulting policing. (While the accountability of the police actions when induced by such software is not explored, this is obviously related to the Nature editorial of last week, “Algorithm and blues”, which [in short] calls for watchdogs on AIs and decision algorithms.) When the data is gathered from police and justice records, any bias in checks, arrests, and convictions will be reproduced in the data and hence will repeat the bias in targeting potential criminals. As aptly put by the authors, the resulting machine learning algorithm will be “predicting future policing, not future crime.” Worse, by having no reservation about over-fitting [the more predicted crimes the better], it will increase the bias in the same direction. In the Oakland drug-user example analysed in the article, the police concentrate almost exclusively on a few grid squares of the city, resulting in the above self-predicting fallacy. However, I do not see much hope in using other surveys and datasets towards eliminating this bias, as they also carry their own shortcomings. Even without biases, predicting crimes at the individual level just seems a bad idea, for statistical and ethical reasons.
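The self-reinforcing mechanism can be sketched with a toy model, which is emphatically not the authors' analysis nor any actual policing software. It assumes two grid cells with the same true crime rate, a patrol sent each round to the cell with the most recorded crimes, and crimes recorded only where the patrol goes. The function name `simulate_feedback` and all numbers are hypothetical.

```python
def simulate_feedback(true_rate, initial_records, n_rounds):
    """Toy feedback loop: each round, patrol the cell with the most recorded
    crimes and add that cell's expected number of discovered crimes to its
    record. Cells that are not patrolled record nothing."""
    records = list(initial_records)
    for _ in range(n_rounds):
        # "Prediction" step: pick the cell with the largest recorded count.
        target = max(range(len(records)), key=lambda i: records[i])
        # Only the patrolled cell feeds new crimes into the database.
        records[target] += true_rate[target]
    return records


if __name__ == "__main__":
    # Identical true rates (assumed), slightly unbalanced historical records.
    final = simulate_feedback([1.0, 1.0], [6.0, 4.0], n_rounds=100)
    share = final[0] / sum(final)
    print(final, round(share, 3))
```

Although both cells are equally crime-prone by construction, the initial 60/40 imbalance in the records ends up as an almost exclusive focus on the first cell, which is the “predicting future policing” effect in its crudest form.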

Nature highlights

Posted in Books, Kids, pictures, Statistics on October 16, 2016 by xi'an

Among several interesting (general public) entries and the fascinating article reconstructing the death of Lucy by a fall from a tree, I spotted in the current Sept. 22 issue of Nature two short summaries involving statistical significance, one in linguistics about repeated (and significant) links between some sounds and some concepts (like ‘n’ and ‘nose’) shared between independent languages, another about the (significant) discovery of a π meson and a K meson. The first anonymous editorial, entitled “Algorithm and blues”, was rather gloomy about the impact of proprietary algorithms on our daily life and on our democracies (or what is left of them), like the reliance on such algorithms to grant loans or determine the length of a sentence (based on the estimated probability of re-offending). The article called for more accountability of such tools, from going completely open-source to allowing for some form of strong auditing. This reminded me of the current (regional) debate about the algorithm allocating Greater Paris high school students to local universities and colleges based on their grades, wishes, and available positions. The apparent randomness and arbitrariness of those allocations prompted many (parents) to complain about the algorithm and ask for its move to the open. (Besides the pun in the title, the paper also contained a line about “affirmative algorithmic action”!) There was also a perfectly irrelevant tribune from a representative of the Church of England about its desire to give a higher profile to science in the/their church. Whatever. And I also was bemused by a news article on the difficulty of building a genetic map of Aboriginal Australians due to the cultural reticence of Aboriginals to the use of body parts from their communities in genetic research.
While I understand and agree with the concept of data privacy, so as to refrain from exposing personal information, it is much less clear [to me] why data collected a century ago should come under such protections if it does not create a risk of exposing living individuals. It reminded me of this earlier Nature news article about North-American Aboriginals claiming rights to an 8,000 year old skeleton. On a more positive side, this news part also mentioned the first catalogue produced by the Gaia European Space Agency project, from the publication of more than a billion star positions to the open access nature of the database, in that the Gaia team had hardly any prior access to such wealth of data. A special issue part of the journal was dedicated to the impact of social inequalities in the production of (future) scientists, but this sounds rather shallow, at least at the level of the few pages produced on the topic, and it did not mention a comparison with other areas of society, where they are also most obviously at work!

snapshots from Nature

Posted in Books, Kids, pictures, University life on September 19, 2016 by xi'an

Among many interesting things I read from the pile of Nature issues that had accumulated over a month of travelling, with a warning that these are mostly “old” news by now:

  • the very special and untouched case of Cuba in terms of the Zika epidemics, thanks to a long term policy fighting mosquitoes at all levels of the society;
  • an impressive map of the human cortex, whose statistical analysis would be fascinating;
  • an excerpt from Nature 13 August 1966 where the Poisson distribution was said to describe the distribution of scores during the 1966 World Cup;
  • an analysis of a genetic experiment on evolution involving 50,000 generations (!) of Escherichia coli;
  • a look back at the great novel Flowers for Algernon, a novel I read eons ago;
  • a Nature paper on the first soft robot, or octobot, along with some easier introduction, which did not tell which kind of operations could be accomplished by such a robot;
  • a vignette on a Science paper about the interaction between honey hunters and hunting birds, which I also heard depicted on the French National Radio, with an experiment comparing the actual hunting (human) song, a basic sentence in the local language, and the imitation of the song of another bird. I could not understand why the experiment did not include hunting songs from other hunting groups, as they are highly different but just as effective. It would have helped in understanding how innate the reaction of the bird is;
  • another literary entry on the science behind Mary Shelley’s Frankenstein;
  • a study of the Mathematical Genealogy Project in terms of the few mathematicians who started most genealogies of mathematicians, including d’Alembert, advisor to Laplace of whom I am one of the many descendants, although the finding is not that astounding when considering usual genealogies where most branches die off and the highly hierarchical structure of power in universities of old.
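For the 1966 excerpt on World Cup scores in the list above, here is a minimal sketch of what a Poisson description of scores amounts to: the expected number of team-scores equal to 0, 1, 2, … goals under a single rate. The rate of 1.4 goals per team per match and the 64 team-scores (two per match over 32 matches) are illustrative assumptions of mine, not values taken from the Nature excerpt.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k events for a Poisson distribution with mean `rate`."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def expected_counts(rate, n_scores, max_goals):
    """Expected number of scores equal to 0, 1, ..., max_goals among n_scores."""
    return [n_scores * poisson_pmf(k, rate) for k in range(max_goals + 1)]


if __name__ == "__main__":
    rate = 1.4     # hypothetical mean goals per team per match
    n_scores = 64  # hypothetical number of team-scores
    for goals, count in enumerate(expected_counts(rate, n_scores, 5)):
        print(goals, round(count, 1))
```

Comparing such expected counts with the observed tally of scores (for instance through a chi-square statistic) is the usual informal check of the Poisson claim.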

Darwin’s radio [book review]

Posted in Books, Kids, pictures, University life on September 10, 2016 by xi'an

When in Sacramento two weeks ago I came across the Beers Books Center bookstore, with a large collection of used and (nearly) new cheap books, and among other books I bought Greg Bear’s Darwin’s Radio. I had (rather) enjoyed another book of his, Hull Zero Three, not to mention one of his first books, Blood Music, which I read in the mid 1980’s, and the premises of this novel sounded promising, not to mention the Nebula award. The theme is of a major biological threat, apparently due to a new virus, and of the scientific unraveling of what the threat really means. (Spoilers alert!) In that respect it sounds rather similar to the (great) Crichton’s The Andromeda Strain, which is actually mentioned by some characters in this book. As is Ebola, as a sort of counterpoint (since Ebola is a deadly virus, although the epidemic in Western Africa now seems to have vanished). The biological concept exploited here is dormant DNA in non-coding parts of the genome that periodically gets awakened and induces massive steps in evolution. So massive that carriers of those mutations are killed by locals. Until the day it happens in an all-connected World and the mutation can no longer be stopped. The concept is compelling if not completely convincing of course, while the outcome of a new human race, which is to Homo Sapiens what Homo Sapiens was to Neanderthal, is rather disappointing. (How could it be otherwise?!) But I did appreciate the postulate of a massive and immediate change in the genome, even though the details were disputable and the dismissal of Dawkins’ perspective poorly defended. From a stylistic perspective, the writing is at times heavy, while there are too many chance occurrences, like the main character happening to be in Georgia for a business deal (spoilers, spoilers!) at the time of the opening of collective graves, or the second main character coming upon a couple of Neanderthal mummies with a Sapiens baby, or yet this pair of main characters falling in love and delivering a live mutant baby-girl. But I enjoyed reading it between San Francisco and Melbourne, with a few hours of lost sleep and work. It is a page turner, no doubt! I also like the political undercurrents, from riots to emergency measures, to an effective dictatorship controlling pregnancies and detaining newborns and their mothers.

One important thread in the book deals with anthropology digs running up against Native claims to corpses and general opposition to such digs. This reminded me of a very recent article in Nature where a local Indian tribe had claimed rights to skeletons several thousand years old, whose DNA was then shown to be more closely related to faraway groups than to the claimants. But where the tribe was still granted the last word, in a rather worrying jurisprudence.