Archive for Nature

Das Kapital [not a book review]

Posted in Statistics on August 18, 2017 by xi'an

A rather bland article by Gareth Stedman Jones in Nature reminded me that the first volume of Karl Marx's Das Kapital is 150 years old this year. Which makes it appear quite close in historical terms [just before the Franco-Prussian war of 1870] and rather remote in scientific terms. I remember going painstakingly through the books in 1982 and 1983, mostly during weekly train trips between Paris and Caen, and not getting much out of them! Even with the help of a cartoon introduction I had received as a 1982 Xmas gift! I had no difficulty in reading the text per se, as opposed to my attempt at Kant's Critique of Pure Reason the previous summer [along with the other attempt at windsurfing!], as the discourse was definitely grounded in economics rather than philosophy. But the heavy prose did not deliver a convincing theory of the evolution of capitalism [and of its ineluctable demise]. While the fundamental argument of workers' labour being an essential counterpart to investors' capital for profitable production was clearly if extensively stated, the extrapolations on diminishing profits associated with decreasing labour input [and the resulting collapse] were murkier and sounded more ideological than scientific. Not that I claim any competence in the matter: my attempts at getting the concepts behind Marxist economics stopped at this point and I have not been seriously thinking about it since! But it still seems to me that the theory did not age very well, missing the increasing power of financial agents in running companies. And of course [unsurprisingly] missing the numerical revolution and its impact on the (dis)organisation of work and the disintegration of the proletariat as Marx envisioned it, for instance turning former workers into forced and poor entrepreneurs (Uber, anyone?!). Not that working conditions are particularly rosy for many, from a scarcity of low-skill jobs, to a nurtured competition between workers for existing jobs (leading to extremes like the scandalous zero-hour contracts!), to minimum wages made useless by the fragmentation of the working space and the explosion of housing costs in major cities, to the powerlessness of social democracies to regain some leverage over international companies…

crowd-based peer review

Posted in Statistics on June 20, 2017 by xi'an

In clear connection with my earlier post on Peer Community In… and my visit this week to Montpellier towards starting a Peer Community In Computational Statistics, I read a tribune in Nature (1 June, p.9) by the editor of Synlett, Benjamin List, describing an experiment conducted by this journal in chemical synthesis. The approach was to post (volunteered) submitted papers on a platform accessible to a list of 100 reviewers, nominated by the editorial board, who could anonymously comment on the papers and read others' equally anonymous comments. With a 72-hour deadline! According to Benjamin List (and based on a large dataset of… 10 papers!), the experiment produced reviews of better quality than traditional reviewing policies. While Peer Community In… does not work exactly this way, and does not aim at operating as a journal, it is exciting and encouraging to see such experiments unfold!

gerrymandering detection by MCMC

Posted in Books, Statistics on June 16, 2017 by xi'an

In the latest issue of Nature I read (June 8), there is a rather long feature article on mathematical (and statistical) ways of measuring gerrymandering, that is, the manipulation of the boundaries of voting districts towards improving the chances of a certain party. (The name comes from Elbridge Gerry (1812) and the salamander shape of the district he created.) The difficulty covered by the article is about detecting gerrymandering, which leads to the challenging and almost philosophical question of defining a "fair" partition of a region into voting districts when those are not geographically induced, since each partition taken on its own does not break the principles of "one person, one vote" and of majority rule. Having a candidate or party win at the global level while losing in a majority of districts seems to go against this majority rule, but with electoral systems like the one in the US, this frequently happens (with dire consequences in the latest elections). Just another illustration of Simpson's paradox, essentially. And a damning drawback of multi-tiered electoral systems.
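To make the point concrete, here is a toy illustration (all vote counts are invented, not taken from the article) of a party winning the overall vote while losing a majority of districts, because its supporters are packed into a single district:

```python
# toy example: party A wins the overall vote but only one of three
# equally sized districts (numbers invented for the sake of illustration)
votes_A = [90, 45, 45]   # party A votes in districts 1, 2, 3
votes_B = [10, 55, 55]   # party B votes in the same districts
share_A = sum(votes_A) / (sum(votes_A) + sum(votes_B))
districts_A = sum(a > b for a, b in zip(votes_A, votes_B))
print(f"overall share for A: {share_A:.0%}, districts won by A: {districts_A} of 3")
# overall share for A: 60%, districts won by A: 1 of 3
```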

“In order to change the district boundaries, we use a Markov Chain Monte Carlo algorithm to produce about 24,000 random but reasonable redistrictings.”

In the arXiv paper that led to this Nature article (along with other studies), Bangia et al. essentially construct a tail probability to assess how extreme the current district partition is against a theoretical distribution of such partitions, finding that the actual redistrictings of 2012 and 2016 in North Carolina are "extremely atypical". (The generation of random partitions obeyed four rules, namely equal population, geographic compactness and connectedness, proximity to county boundaries, and a majority of African-American voters in at least two districts, the last being a requirement in North Carolina. A score function was built as a linear combination of four corresponding scores, mostly χ²-like, and turned into a density, simulated-annealing style. The determination of the final temperature β=1 (p.18) [or equivalently of the weights (p.20)] remains unclear to me. As does the use of more than 10⁵ simulated-annealing iterations to produce a single partition (p.18)…)
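As a rough idea of the mechanics, here is a minimal, self-contained sketch: a partition of precincts into districts is scored, the score is turned into a density exp(-βJ), the density is sampled by single-precinct Metropolis moves, and the enacted map is then ranked within the simulated ensemble through a tail probability. Everything below (the precinct data, the two score components, the weights, the iteration counts) is an invented toy, not the actual construction of Bangia et al.

```python
import numpy as np

rng = np.random.default_rng(0)
n_precincts, n_districts = 60, 3
pop = rng.integers(800, 1200, n_precincts)      # precinct populations
share = rng.uniform(0.3, 0.7, n_precincts)      # party-A vote share per precinct

def score(labels, w_pop=1.0, w_shape=0.05):
    # chi-square-like population-equality score plus a crude 1-D "compactness"
    # proxy counting label changes along the precinct ordering
    sizes = np.array([pop[labels == d].sum() for d in range(n_districts)])
    J_pop = np.sum((sizes - pop.sum() / n_districts) ** 2) / pop.sum()
    J_shape = np.sum(labels[1:] != labels[:-1])
    return w_pop * J_pop + w_shape * J_shape

def seats(labels):
    # districts won by party A, for the fixed vote shares above
    won = 0
    for d in range(n_districts):
        mask = labels == d
        if mask.any() and np.average(share[mask], weights=pop[mask]) > 0.5:
            won += 1
    return won

def sample_partition(n_iter=2000, beta=1.0):
    labels = rng.integers(0, n_districts, n_precincts)
    J = score(labels)
    for _ in range(n_iter):
        prop = labels.copy()
        prop[rng.integers(n_precincts)] = rng.integers(n_districts)
        J_prop = score(prop)
        if np.log(rng.uniform()) < -beta * (J_prop - J):  # Metropolis step for exp(-beta*J)
            labels, J = prop, J_prop
    return labels

ensemble = [sample_partition() for _ in range(100)]
enacted = sample_partition()   # stand-in for the official map
tail = np.mean([seats(p) <= seats(enacted) for p in ensemble])
print("seats for party A in 'enacted' map:", seats(enacted), "- tail probability:", tail)
```

The last line is the tail probability in question: the proportion of sampled maps giving the targeted party at most as many seats as the map under scrutiny.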

From a broader perspective, agreeing on a method to produce random district allocations could be the way to go towards solving the judicial dilemma in setting new voting maps, as is currently under discussion in the US.

the explanation why Science gets underfunded

Posted in Statistics on May 8, 2017 by xi'an

Paris-Dauphine in Nature

Posted in Statistics on April 25, 2017 by xi'an

Since this is an event unlikely to occur that frequently, let me point out that Université Paris-Dauphine got a nominal mention in Nature two weeks ago, through an article covering the recent Abel Prize awarded to Yves Meyer for his work on wavelets, developed within a collection of French institutions, including Paris-Dauphine, where he was a professor in the maths department (CEREMADE) from 1985 till 1996. (Except for including a somewhat distantly related picture of an oscilloscope and a mention of the Higgs boson, the Nature article is quite nice!)

the incomprehensible challenge of poker

Posted in Statistics on April 6, 2017 by xi'an

When reading in Nature about two deep-learning algorithms winning at a version of poker within a few weeks of each other, I came back to my "usual" wonder about poker, as I cannot understand it as a game. (Although I can see the point, albeit dubious, in playing to win money.) And [definitely] correlatively I do not understand the difficulty in building an AI that plays the game. [I know, I know nothing!]

no publication without confirmation

Posted in Books, Statistics, University life on March 15, 2017 by xi'an

“Our proposal is a new type of paper for animal studies (…) that incorporates an independent, statistically rigorous confirmation of a researcher’s central hypothesis.” (p.409)

A tribune in Nature of Feb 23, 2017, suggests running preclinical (animal) studies in three stages towards meeting higher standards in statistical validation. The idea is to impose a confirmatory trial run by an independent team, following an initial study showing some potential for a new treatment. The three stages are thus (i) to generate hypotheses; (ii) to test hypotheses; (iii) to test broader application of hypotheses (p.410). While I am skeptical of the chances of this proposal reaching adoption (for various reasons, like, what would the incentive of the second team [the B team?!] be, especially if the hypothesis is disproved, how would both teams share the authorship and presumably the patenting rights of the final study, and how could independence be guaranteed were the B team contracted by the A team?), the statistical arguments put forward in the tribune are rather weak (in my opinion). Repeating experiments with a larger sample size and a hypothesis set a priori rather than cherry-picked is obviously positive, but moving from a p-value boundary of 0.05 to one of 0.01 and to a power of 80% is more a cosmetic than a foundational change, as Andrew and I pointed out in our PNAS discussion of Johnson two years ago.
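For a sense of scale, here is a back-of-the-envelope normal-approximation check (the standardised effect size of 0.5 is an arbitrary illustration, not a number from the tribune) of what the stricter threshold costs in sample size:

```python
from scipy.stats import norm

# per-group sample size for a two-sample z-test of standardised effect delta,
# at significance level alpha and 80% power (normal approximation)
def n_per_group(delta, alpha, power=0.80):
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return 2 * (z_a + z_b) ** 2 / delta ** 2

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: about {n_per_group(0.5, alpha):.0f} animals per group")
# alpha = 0.05: about 63 per group; alpha = 0.01: about 93 per group
```

Under these (arbitrary) numbers, the stricter threshold increases the required sample size by roughly a half without altering the logic of the testing procedure.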

“the earlier experiments would not need to be held to the same rigid standards.” (p.410)

The article contains a vignette on "the maths of predictive value" that makes intuitive sense, but only superficially. First, "the positive predictive value is the probability that a positive result is truly positive" (p.411), a statement that implies a probability distribution on the space of hypotheses, although I see no Bayesian hint throughout the paper. Second, this (ersatz of a) probability is computed as a ratio of the number of positive results under the hypothesis over the total number of positive results. Which does not make much sense outside a Bayesian framework, and even then cannot be assessed experimentally or by simulation without defining a distribution of the output under both hypotheses. Simplistic pictures like the above are not necessarily meaningful. And Nature should certainly invest in a statistical editor!
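To make explicit what such a computation hides, here is the standard Bayesian version of the predictive value, which requires a prior probability on the hypotheses (the three numbers are arbitrary illustrations, not values from the article):

```python
# PPV = P(effect real | positive result), via Bayes' formula
prior, power, alpha = 0.10, 0.80, 0.05   # P(real), P(positive | real), P(positive | null)
ppv = prior * power / (prior * power + (1 - prior) * alpha)
print(f"positive predictive value: {ppv:.2f}")   # 0.64 for these values
```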