Archive for Nature

aftermaths of retiring significance

Posted in Books, pictures, Statistics, University life on April 10, 2019 by xi'an

Beyond mentions of the retire significance paper in the general press, as in Retraction Watch, Bloomberg, The Guardian, Vox, and NPR, not to mention the large number of comments on Andrew’s blog and Deborah Mayo’s tribune on a ban on free speech (!), Nature of “the week after” contained three letters: one from Ioannidis, calling for more stringent thresholds, one from Johnson, essentially if unclearly stating the same, and one from my friends from Amsterdam, Alexander Ly and E.J. Wagenmakers, along with Julia Haaf, getting back to the Great Old Ones to defend the usefulness of testing versus estimation.

the joy of stats [book review]

Posted in Books, pictures, University life on April 8, 2019 by xi'an

David Spiegelhalter’s latest book, The Art of Statistics: How to Learn from Data, has made it to the main entry of the Nature book reviews this week, under the title “the joy of stats”, in a review written by Evelyn Lamb, a freelance math and science writer from Salt Lake City, Utah. (I noticed that the book made it to Amazon #1 bestseller, albeit in the Craps category!, which I am unsure is completely adequate!, especially since the book is not yet for sale on the US branch of Amazon!, and further to Amazon #1 in the Probability and Statistics category in the UK.) I have not read the book yet, so here are a few excerpts from the review, quoted verbatim:

“The book is part of a trend in statistics education towards emphasizing conceptual understanding rather than computational fluency. Statistics software can now perform a battery of tests and crunch any measure from large data sets in the blink of an eye. Thus, being able to compute the standard deviation of a sample the long way is seen as less essential than understanding how to design and interpret scientific studies with a rigorous eye.”

“…a main takeaway from the book is a sense of circumspection about our confidence in what is known. As Spiegelhalter writes, the point of statistical science is to ease us through the stages of extrapolation from a controlled study to an understanding of the real world, ‘and finally, with due humility, be able to say what we can and cannot learn from data’. That humility can be lacking when statistics are used in debates about contentious issues such as the costs and benefits of cancer screening.”

abandon ship [value]!!!

Posted in Books, Statistics, University life on March 22, 2019 by xi'an

The Abandon Statistical Significance paper we wrote with Blakeley B. McShane, David Gal, Andrew Gelman, and Jennifer L. Tackett has now appeared in a special issue of The American Statistician, “Statistical Inference in the 21st Century: A World Beyond p < 0.05”. A 400-page special issue with 43 papers available on-line and open-source! Food for thought likely to be discussed further here (and elsewhere). The paper and the ideas within have been discussed quite a lot on Andrew’s blog and I will not repeat them here, simply quoting from the conclusion of the paper:

In this article, we have proposed to abandon statistical significance and offered recommendations for how this can be implemented in the scientific publication process as well as in statistical decision making more broadly. We reiterate that we have no desire to “ban” p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors.

Ideas that were also put forward in a comment by Valentin Amrhein, Sander Greenland, and Blake McShane published in Nature today (and supported by 800+ signatures). Again discussed on Andrew’s blog.

Nature tea[dbits]

Posted in Books, pictures, University life, Wines on February 28, 2019 by xi'an

A very special issue of Nature (7 February 2019, vol. 566, no. 7742). With an outlook section on tea, plus a few research papers (and ads) on my principal beverage. News about the REF, Elsevier’s and Huawei’s woes with the University of California, the dangerous weakening of Title IX by the Trump administration, and a long report on the statistical analysis of Hurricane Maria deaths, involving mostly epidemiologists, but also Patrick Ball who took part in our Bayes for Good workshop at CIRM. Plus China’s food crisis and ways to reduce cropland losses and food waste. Concerning the tea part(y), a phylogenetic study of different samples led to the theory that tea was domesticated thrice, twice in Yunnan (China) and once in Assam (India), with a divergence estimated at more than twenty thousand years ago. Another article covers Pu-erh, with the potential impacts of climate change on this unique tea. With a further remark that higher altitudes increase the antioxidant level of tea… And a fascinating description of agro-forestry where tea and vegetables are grown in a forest that regulates sun exposure, moisture evaporation, and soil nutrients.

how a hiring quota failed [or not]

Posted in Books, Statistics, University life on February 26, 2019 by xi'an

This week, Nature has a “career news” section dedicated to how hiring quotas [may have] failed for French university hiring. And based solely on a technical report by a Sciences Po Paris researcher. The hiring quota means that every hiring committee for a French public university must be made of at least 40% members of each gender. (Plus at least 50% of external members.) A quota that has been reduced to 30% in some severely imbalanced fields like mathematics. The main conclusion of the report is that the reform has had a negative impact on the hiring imbalance between men and women in French universities, with “the higher the share of women in a committee, the lower women are ranked” (p.2). As head of the hiring board in maths at Dauphine, which officiates as a secretarial committee for assembling all hiring committees, I was interested in the reasons for this perceived impact, as I had not observed it at my [first order remote] level. As a warning, the discussion that follows makes little sense without a prior glance at the paper.

“Deschamps estimated that without the reform, 21 men and 12 women would have been hired in the field of mathematics. But with the reform, committees whose membership met the quota hired 30 men and 3 women” Nature

Skipping the non-quantitative and somewhat ideological part of the report, as well as descriptive statistics, I looked mostly at the modelling behind the conclusions, as reported for instance in the above definite statement in Nature. Starting with a collection of assumptions and simplifications. A first dubious assumption is that fields, and even less universities, where the 40% quota was already met before the 2015 reform could be used as “control groups”, given the huge potential for confounders, especially the huge imbalance in female-to-male ratios across fields. Second, the data only covers hiring histories for three French universities (out of 63 total) over the years 2009-2018 and furthermore merges assistant professors (Maîtres de Conférences) with full professors, for whom hiring is de facto much more involved, with often one candidate being contacted [prior to the official advertising of the position] by the department as an expression of interest (or the reverse). Third, the remark that

“there are no significant differences between the percentage of women who apply and those who are hired” (p.9)

seems to make the whole discussion moot… and to contradict both the conclusion and the above assertion! Fourth, the candidate’s qualification (or quality) is equated with the h-index, which is highly reductive and, once again, open to considerable biases in terms of seniority and of field. Depending on the publication lag and also the percentage of publications in English versus the vernacular in the given field. And the type of publications (from an average of 2.94 in business to 9.96 in physics). Fifth, the report equates academic connections [that may bias the ranking] with having the supervisor present in the hiring committee [which sounds like a clear conflict of interest] or the candidate applying to the [same] university that delivered his or her PhD. Missing a myriad of other connections that often make committee members prone to impact the ranking by reporting facts from outside the application form.

“…controlling for field fixed effects and connections make the coefficient [of the percentage of women in the committee] statistically insignificant, though the point estimate remains high.” (p.17)

The models used by Pierre Deschamps are multivariate logit and probit regressions, where each jury attaches a utility to each of its candidates, made of a qualification term [for the position] and of a gender bias most surprisingly multiplying candidate gender and jury gender dummies. The qualification term is expressed as a [jury free] linear regression on covariates plus a jury fixed effect. Plus an error distributed as a Gumbel extreme variate that leads to a closed-form likelihood [and this seems to be the only reason for picking this highly skewed distribution]. The probit model is used to model the probability that one candidate has a better utility than another. The main issue with this modelling is the agglomeration of independence assumptions, as (i) candidates and hired ones are not independent, from being evaluated over several positions all at once, with earlier selections and rankings all public, to having themselves to rank all the positions where they are eligible, to possibly being co-authors of other candidates; (ii) juries are not independent either, as the limited pool of external members, esp. in gender-imbalanced fields, means that the same faculty often ends up in several juries at once and hence evaluates the same candidates as a result, plus decides on local ranking in connection with earlier rankings; (iii) several juries of the same university are not independent either when this university may try to impose a certain if unofficial gender quota, a variate obviously impossible to fill in. Plus again a unique modelling across disciplines. A side but not solely technical remark is that among the covariates used to predict ranking or first position for a female candidate, the percentage of female candidates appears, while being exogenous. Again, using a univariate probit to predict the probability that a candidate is ranked first ignores the comparison between a dozen candidates, both male and female, operated by the jury.
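As an aside on the Gumbel error, here is a minimal sketch (mine, not taken from the report) of why this distribution leads to a closed form: when i.i.d. standard Gumbel errors are added to deterministic utilities, the probability that a given candidate ends up with the highest utility is the softmax of the utilities. The function names, the three utility values, and the number of draws are all illustrative assumptions, and the sketch keeps the very independence assumption criticised above.

```python
import math
import random


def softmax_choice_probs(utilities):
    """Closed-form probability that each candidate has the highest utility,
    assuming i.i.d. standard Gumbel errors (the logit choice model)."""
    m = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(v - m) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def standard_gumbel(rng):
    """Draw one standard Gumbel variate by inversion: -log(-log(U))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() takes values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_choice_probs(utilities, n_draws=100_000, seed=0):
    """Monte Carlo frequency with which each candidate comes first
    once independent Gumbel noise is added to the utilities."""
    rng = random.Random(seed)
    wins = [0] * len(utilities)
    for _ in range(n_draws):
        noisy = [v + standard_gumbel(rng) for v in utilities]
        wins[noisy.index(max(noisy))] += 1
    return [w / n_draws for w in wins]


if __name__ == "__main__":
    # Hypothetical utilities for three candidates (illustrative values only).
    utilities = [0.5, 0.0, -1.0]
    print("closed form:", softmax_choice_probs(utilities))
    print("simulation :", simulate_choice_probs(utilities))
```

Running the script prints the closed-form and simulated probabilities side by side, and they should agree up to Monte Carlo error. Replacing the Gumbel draw with a Gaussian one would give the probit counterpart, which has no such closed form with more than two candidates.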
Overall, I find little reason to give (significant) weight to the indicator that the president is a woman in the logistic regression and even less to believe that a better gender balance in the juries has led to a worse gender balance in the hirings. From one model to the next the coefficients change from being significant to non-significant and, again, I find the definition of the control group fairly crude and unsatisfactory, if only because juries move from one session to the next (and there is little reason to believe one field more gender biased than another, with everything else accounted for). Furthermore, my own experience within hiring committees in Dauphine or elsewhere has never been one where the president strongly impacts the decision. If anything, the president is often more neutral (and never ever, in my experience, makes use of the additional vote to break ties!)…

the future of conferences

Posted in Books, Kids, pictures, Travel, University life on January 22, 2019 by xi'an

The last issue of Nature for 2018 offers a stunning collection of science photographs, ten portraits of people who mattered (for the editorial board of Nature), and a collection of journalists’ entries on scientific conferences. The latter point leads to interesting questions on the future of conferences, some of which relate to earlier entries on this blog. Like attempts to give them a smaller carbon footprint, by only attending focused conferences and workshops, warning about predatory ones, creating local hives on different continents that can partake of all talks but reduce travel and size and still allow for person-to-person exchanges, multiplying the meetings and opportunities around a major conference to induce “only” one major trip (as in the past summer of British conferences, or the incoming geographical combination of BNP and O’Bayes 2019), and cutting the traditional dreary succession of short talks in parallel in favour of “unconferences” where participants communally set the themes and structure of the meeting (but beware the dangers of bias brought by language, culture, seniority!). Of course, this move towards new formats will meet opposition from several corners, including administrators who too often see conferences as a pretext for paid vacations and refuse to support costs without a “concrete” proof of work in the form of a presentation.

Another aspect of conferences was discussed there, namely the art of delivering great talks. Which is indeed more an art than a science, since the impact will not only depend on the speaker and the slides, but also on the audience and the circumstances. As years pile on, I am getting less stressed and probably too relaxed about giving talks, but still rarely feel I have reached enough of the audience. And I still fall too easily for the infodump mistake… Which reminds me of a recent column in Significance (although I cannot link to it!), complaining about “finding it hard or impossible to follow many presentations, particularly those that involved a large number of equations.” Which sounds strange to me as, on the contrary, I quickly lose track in talks with no equations. And as mathematical statistics or probability issues seem to imply the use of maths symbols and equations. (This reminded me of a short course I once gave in an undisclosed location, where a portion of the audience left after the first morning, due to my use of “too many Greek letters”.) Actually, I am always annoyed at apologies for using proper maths notations, since they are the tools of our trade.

Another entry of importance in this issue of Nature is an interview with Katherine Heller and Hal Daumé, as first chairs for diversity and inclusion at N[eur]IPS. Where they discuss the actions taken since the previous NIPS 2017 meeting to address the lack of inclusiveness and the harassment cases exposed there, first by Kristian Lum, Lead Statistician at the Human Rights Data Analysis Group (HRDAG), whose blog denunciation set the wheels turning towards a safer and better environment (in stats as well as machine-learning). This included the [last minute] move towards renaming the conference as NeurIPS to avoid sexual puns on the former acronym (which as a non-native speaker I missed until it was pointed out to me!). Judging from the feedback, it seems that the wheels have indeed turned a significant amount and hopefully will continue to do so.

statistics in Nature [a tale of the two Steves]

Posted in Books, pictures, Statistics on January 15, 2019 by xi'an

In the 29 November issue of Nature, Stephen Senn (formerly at Glasgow) wrote an article about the pitfalls of personalized medicine, arguing that the statistics behind the reasoning are flawed.

“What I take issue with is the de facto assumption that the differential response to a drug is consistent for each individual, predictable and based on some stable property, such as a yet-to-be-discovered genetic variant.” S. Senn

One (striking) reason being that the studies rest on a sort of low-level determinism that does not account for many sources of variability. Over-confidence in causality ensues. Stephen argues that improvement lies in insisting on repeated experiments on the same subjects (with an increased challenge in modelling since this requires longitudinal models with dependent observations). And to “drop the use of dichotomies”, favouring instead continuous modelling of measurements.

And in the 6 December issue, Steven Goodman calls (in the World View tribune) for probability statements to be attached as confidence indices to scientific claims. Statements that he takes great pains to distinguish from p-values and links with Bayesian analysis. (Bayesian analysis that Stephen regularly objects to.) While I applaud the call, I am quite pessimistic about the follow-up it will generate, the primary reply being that posterior probabilities can be manipulated as well as p-values. And that Bayesian probabilities are not “real” probabilities (dixit Don Fraser or Deborah Mayo).