Archive for NYT

on anonymisation

Posted in Books, pictures, Statistics, University life on August 2, 2019 by xi'an

An article in the New York Times covers a recent publication in Nature Communications on the ability to identify 99.98% of Americans from almost any dataset with fifteen covariates. It also mentions the French approach of INSEE, more precisely of CASD (a branch of GENES, as are ENSAE and CREST, with which I am affiliated), where my friend Antoine worked for a few years, and whose approach is to vet researchers who want access to non-anonymised data and to create local working environments on the CASD machines so that the data does not leave the site. The researcher is provided with a dedicated interface, which “enables access remotely to a secure infrastructure where confidential data is safe from harm”. CASD further delivers reproducibility certificates for publications, a point apparently missed by the New York Times, which advances the lack of reproducibility as a drawback of the method. The article also mentions the possibility of doing cryptographic data analysis, again missing the finer details with a lame objection.

“Our paper shows how the likelihood of a specific individual to have been correctly re-identified can be estimated with high accuracy even when the anonymized dataset is heavily incomplete.”

The Nature paper is actually about the probability for an individual to be uniquely identified from the given dataset, which is somewhat different from the NYT headlines. It uses a copula for the distribution of the covariates and assesses the model with a mean square error evaluation, when what matters are false positives and false negatives. Note that the model needs to be trained for each new dataset, which reduces the appeal of the claim, especially when considering that about 6% of the individuals tagged as uniquely identified are not. The statistic of 99.98% posted in the NYT is actually a count on a specific dataset, the 5% Public Use Microdata Sample files, and Massachusetts residents, and not a general statistic [which would not make much sense, as I can easily imagine 15 useless covariates!] or a prediction from the authors’ model. And a wee bit anticlimactic.
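To make the distinction between a count on a dataset and a model prediction concrete, here is a minimal sketch of such a count: the share of records that are unique on a chosen set of covariates. The function name, the toy records, and the covariate names are all invented for illustration; this is neither the PUMS data nor the copula model of the paper.

```python
from collections import Counter

def fraction_unique(records, covariates):
    """Share of records whose combination of values on `covariates` occurs exactly once."""
    if not records:
        return 0.0
    keys = [tuple(r[c] for c in covariates) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)

# Toy records, entirely made up (hypothetical covariates).
toy = [
    {"zip": "02139", "birth_year": 1980, "sex": "F", "country": "US"},
    {"zip": "02139", "birth_year": 1980, "sex": "M", "country": "US"},
    {"zip": "02139", "birth_year": 1980, "sex": "M", "country": "US"},
    {"zip": "02140", "birth_year": 1975, "sex": "F", "country": "US"},
]

# Informative covariates single out some records...
print(fraction_unique(toy, ["zip", "birth_year", "sex"]))
# ...while a constant ("useless") covariate singles out nobody.
print(fraction_unique(toy, ["country"]))
```

The count depends entirely on which covariates are available, which is why a figure obtained on one dataset does not transfer to "almost any dataset".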

The Long, Cruel History of the Anti-Abortion Crusade [reposted]

Posted in Books, Kids, Travel on July 14, 2019 by xi'an

[Excerpts from an editorial in the NYT by John Irving, the American author of the novel The Cider House Rules, which we enjoyed reading 30 years ago]

“(…) I respect your personal reasons not to have an abortion — no one is forcing you to have one. I respect your choice. I’m pro-choice — often called pro-abortion by the anti-abortion crusaders, although no one is pro-abortion. What’s unequal about the argument is the choice; the difference between pro-life and pro-choice is the choice. Pro-life proponents have no qualms about forcing women to go through childbirth — they give women no choice (…)

I must remind the Roman Catholic Church of the First Amendment to the United States Constitution: “Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof.” In other words, we are free to practice the religion of our choice, and we are protected from having someone else’s religion practiced on us. Freedom of religion in the United States also means freedom from religion (…)

The prevailing impetus to oppose abortion is to punish the woman who doesn’t want the child. The sacralizing of the fetus is a ploy. How can “life” be sacred (and begin at six weeks, or at conception), if a child’s life isn’t sacred after it’s born? Clearly, a woman’s life is never sacred; as clearly, a woman has no reproductive rights (…)

Of an unmarried woman or girl who got pregnant, people of my grandparents’ generation used to say: “She is paying the piper.” Meaning, she deserves what she gets — namely, to give birth to a child. That cruelty is the abiding impetus behind the dishonestly named right-to-life movement. Pro-life always was (and remains) a marketing term. Whatever the anti-abortion crusaders call themselves, they don’t care what happens to an unwanted child — not after the child is born — and they’ve never cared about the mother.”

end of the game

Posted in Books, Kids, pictures on May 28, 2019 by xi'an

While I have not watched a large part of the Game of Thrones episodes (apart from the first season, which I had time to follow while in the hospital), I decided to subscribe for one [free] month to OCS to get the last and final season [unlike a NYT critic who watched the entire eight seasons in five weeks!]. And to witness how far it has diverged from the books, at least those already published. The first two episodes were unbearably slow and anti-climactic, the [mentionable] worst part being the endless discussion by a fireplace of half a dozen of the main characters, who would all be better off sleeping. And the antagonism between Sansa and Daenerys sounded almost childish… The last battle in Winterfell was both fantastic and disappointing: fantastic in its scale and furia and impetus, a cinematographic feat!, possibly the best in the whole series; disappointing for the terrible military choices made by the best fighters in the seven kingdoms and beyond, for the disproportionate imbalance between the living and the dead, for the whole thing depending on the two seconds it took for the Ice King to shatter [no longer a spoiler!], and for the absurd and lengthy scene of the zombies in the castle library. I just don’t like zombie movies, as I find them an easy and lazy plot element, especially when they can be resuscitated over and over… They have not yet appeared (on that scale) in the books and I hope they remain dead still! Some scenes are furthermore too reminiscent of video games, which cuts even deeper into the realism (!) of the battle. The scenario of the fourth episode is definitely botched and hurried, with a sudden and radical reversal of fortune that once again goes against basic military concepts (and basic physics as well!).
Contrary to most reviews I read, maybe because I had little expectation about the characters in the show, I found the fifth episode quite impressive, in its vivid description of the sack of a city, the instantaneous switch from victor to rapist and murderer, and the helplessness of those very few who wanted to stop the slaughter of the inhabitants. (By contrast, I found most of the individual scenes appalling, except for Arya’s, which remains consistent with her arc in the plot. So far. But we could have been spared the white horse in the end!) And then the last and final episode…! Which I definitely enjoyed, primarily for the bittersweet feeling that this was the last hour spent with the (surviving) characters, even with the unrealistic developments and predictable conclusions, and the feeling that some scenes were made up in someone’s grandfather’s backyard, by the same someone’s teenage nephews… Although I was hoping for a glorious ending in line with the one of Monty Python and the Holy Grail… Alas, no police van, no delegation of bankers or lawyers showed up at the eleventh hour!

[Uninteresting coincidence: in this NYT pre-finale analysis, I read the very same sentence “Power resides where people believe it resides” pronounced by Mikhail Gorbachev in the daunting Chernobyl series, which I had watched a few hours earlier.]

abolitionist

Posted in Books on May 4, 2019 by xi'an

A very interesting piece about prison abolition in the NYT. Centering on Ruth Wilson Gilmore, a US advocate for the abolition of prison sentences and a geographer at Berkeley. Interesting because the very notion of abolition sounds anathema to many and I rarely meet people sharing the conviction that prison sentences are counter-productive, often in a major way. And not only at a philosophical (à la Foucault) or utopian (à la Thomas More) level, quite the opposite, in that Gilmore also fights all the myths attached to incarcerated populations in the US, from the inmates being mostly non-violent drug traffickers to them being relatively innocent, to them being mostly black, to them providing cheap labour… The article also draws a convincing parallel between the sharp rise in incarceration and deindustrialisation in the 1970s. (And also the rise of incarceration rhetoric as a cheap political campaign argument.) And the way Gilmore (along with Angela Davis) involves the local communities against the building of new jails, based on local needs rather than philosophical or ethical arguments… She clearly has an impact at this local level, but it is harder to see whether society as a whole is moving towards different, more efficient, and more productive ways of handling crime and violence.

how one journalist’s death provoked a backlash that thousands dead in Yemen did not

Posted in pictures on October 19, 2018 by xi'an

terrible graph, again

Posted in Statistics on September 3, 2018 by xi'an

running shoes

Posted in Books, Running, Statistics on August 12, 2018 by xi'an

A few days ago, when back from my morning run, I spotted a NYT article on Nike shoes that are supposed to bring on average a 4% gain in speed. Meaning for instance a 3 to 4 minute gain in a half-marathon.
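The conversion from a speed gain to minutes is plain arithmetic: over a fixed distance, multiplying the speed by 1.04 divides the time by 1.04. A minimal sketch, where the function name and the finishing times are illustrative choices of mine and not figures from the NYT article:

```python
def minutes_saved(finish_minutes, speed_gain=0.04):
    """Minutes saved over a fixed distance when speed increases by `speed_gain`."""
    return finish_minutes * (1.0 - 1.0 / (1.0 + speed_gain))

# Hypothetical half-marathon finishing times, in minutes.
for finish in (80, 90, 105):
    print(f"{finish} min half-marathon: {minutes_saved(finish):.1f} min saved")
```

Under this reading, a gain of 3 to 4 minutes corresponds to runners finishing a half-marathon in roughly 80 to 105 minutes.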

“Using public race reports and shoe records from Strava, a fitness app that calls itself the social network for athletes, The Times found that runners in Vaporflys ran 3 to 4 percent faster than similar runners wearing other shoes, and more than 1 percent faster than the next-fastest racing shoe.”

What is interesting in this NYT article is that the two journalists who wrote it have analysed their own data, taken from Strava. Using a statistical model or models (linear regression? non-linear regression? neural net?) to predict the impact of the shoe make, against “all” other factors contributing to the overall time or position or percentage gain or yet something else. In most analyses produced in the NYT article, the 4% gain is reproduced (with a 2% gain for female shoe switchers and a 7% gain for slow runners).

“Of course, these observations do not constitute a randomized control trial. Runners choose to wear Vaporflys; they are not randomly assigned them. One statistical approach that seeks to address this uses something called propensity scores, which attempt to control for the likelihood that someone wears the shoes in the first place. We tried this, too. Our estimates didn’t change.”
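As an illustration of what a propensity-score correction does, here is a minimal sketch based on inverse propensity weighting with a single discrete confounder, so that the propensity score is simply the share of shoe wearers within each stratum. This is not the analysis run by the Times, whose model is not described in detail; the function names, the strata, and the finishing times are all invented.

```python
from collections import defaultdict

def naive_effect(rows):
    """Difference in mean outcome between treated and untreated rows."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

def ipw_effect(rows):
    """Inverse-propensity-weighted estimate of the average treatment effect.

    Each row is (stratum, treated, outcome). The propensity score is
    estimated as the proportion of treated rows within each stratum.
    """
    n_total = defaultdict(int)
    n_treated = defaultdict(int)
    for stratum, treated, _ in rows:
        n_total[stratum] += 1
        n_treated[stratum] += int(treated)
    effect = 0.0
    for stratum, treated, outcome in rows:
        e = n_treated[stratum] / n_total[stratum]
        if not 0.0 < e < 1.0:
            raise ValueError(f"no overlap in stratum {stratum!r}")
        effect += outcome / e if treated else -outcome / (1.0 - e)
    return effect / len(rows)

# Made-up data: (runner level, wears the shoe, finishing time in minutes).
# Fast runners wear the shoe more often; the shoe saves 4 minutes in both strata.
toy_runners = (
    [("fast", True, 86.0)] * 3 + [("fast", False, 90.0)]
    + [("slow", True, 116.0)] + [("slow", False, 120.0)] * 3
)

print(naive_effect(toy_runners))  # confounded by runner level
print(ipw_effect(toy_runners))    # reweighted by the estimated propensity
```

In this toy example the naive comparison wildly overstates the gain because fast runners are over-represented among shoe wearers, while the weighted estimate recovers the 4 minutes built into the data. That the Times estimates "didn't change" under such a correction suggests that, for the covariates they recorded, this kind of imbalance was limited.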

The statistical analysis (or analyses) seems rather thorough, from what is reported in the NYT article, with several attempts at controlling for confounders. Still, the data itself is observational, even if it provides a lot of variables to run the analyses, as it only covers runners who use Strava (from 5% in Tokyo to 25% in London!) and who indicate the type of shoes they wore during the race. There is also the issue that the shoes are quite expensive, at $250 a pair, especially if the effect wears out after 100 miles (this was not tested in the study), as I would hesitate to use them unless the race conditions look optimal (and they never do!). There is certainly a new-shoe effect on top of that, somewhere between the real impact of a better response and a placebo effect. As shown by a similar effect for many other shoe makes. Hence, a moderating impact on the NYT conclusion that these Nike Vaporflys (flies?!) are an “outlier”. But nonetheless a fairly elaborate and careful statistical study that could potentially make it to a top journal like Annals of Applied Statistics!