## Elsevier in the frontline

Posted in Books, Statistics, University life on January 27, 2017 by xi'an

“Viewed this way, the logo represents, in classical symbolism, the symbiotic relationship between publisher and scholar. The addition of the Non Solus inscription reinforces the message that publishers, like the elm tree, are needed to provide sturdy support for scholars, just as surely as scholars, the vine, are needed to produce fruit. Publishers and scholars cannot do it alone. They need each other. This remains as apt a representation of the relationship between Elsevier and its authors today – neither dependent, nor independent, but interdependent.”

There were two news items related to the publishark Elsevier in the latest issue of Nature I read. One was that Germany, Peru, and Taiwan no longer had access to Elsevier journals, after negotiations or funding stopped. Meaning the scientists there have to find alternative ways to procure the papers, from the authors’ webpages [I do not get why authors fail to provide their papers through their publication webpage!] to peer-to-peer platforms like Sci-Hub. Beyond this short-term solution, I hope this pushes for the development of arXiv-based journals, like Gowers’ Discrete Analysis. Actually, we [statisticians] should start planning a Statistics version of it!

The second item is about Elsevier developing its own impact factor index, CiteScore. While I do not deem the competition any more relevant for assessing research “worth”, seeing a publishark developing its own metrics sounds about as appropriate as Breitbart News starting an ethical index for fake news. I checked the assessment of Series B on that platform, which returns the journal as ranking third, with the surprising inclusion of the Annual Review of Statistics and its Application [sic], a review journal that only started two years ago, of Annals of Mathematics, which does not seem to pertain to the category of Statistics, Probability, and Uncertainty, and of Statistics Surveys, an IMS review journal that started in 2009 (of which I was blissfully unaware). And the article in Nature points out that “scientists at the Eigenfactor project, a research group at the University of Washington, published a preliminary calculation finding that Elsevier’s portfolio of journals gains a 25% boost relative to others if CiteScore is used instead of the JIF”. Not particularly surprising, eh?!

When looking for an illustration for this post, I came upon the hilarious quote given at the top: I particularly enjoy the newspeak reversal between the tree and the vine, the parasite publishark becoming the support and the academics the (invasive) vine… Just brilliant! (As a last note, the same issue of Nature mentions New Zealand aiming at getting rid of all invasive predators: I wonder if publishing predators are also included!)

## delayed & robbed in London [CFE-CMStatistics 2015]

Posted in Kids, pictures, Statistics, Travel, University life, Wines on December 26, 2015 by xi'an

Last Sunday, I gave a talk on delayed acceptance at the 9th International Conference on Computational and Financial Econometrics (CFE 2015), joint with CMStatistics 2015, in London. This was a worthwhile session, with other talks by Matias Quiroz, on subsampling strategies for large data, David Frazier, on our joint paper about the consistency of ABC algorithms, and James Ridgway, not on Pima Indians! And with a good-sized audience, especially when considering the number of parallel sessions (36!). Earlier that day, I also attended an equally interesting session on the calibration of misspecified Bayesian models, including talks by Peter Green [with a potential answer to the difficulty of parameters on the boundaries by adding orthogonal priors on those boundaries] and Julien Stoehr, calibrating composite likelihoods on Gaussian random fields. In the evening I went to a pub I had last visited when my late friend Costas Goutis was still at UCL and later enjoyed a fiery hot rogan josh.

While I could have attended two more sessions the next morning, I took advantage of the nice café in the Gower Street Waterstones to work a few hours with co-authors (and drink a few litres of tea from real teapots). Despite this quite nice overall experience, the 36 parallel sessions and the 1600-plus attendees at the conference still make me wonder at the appeal of such a large conference and at the pertinence of giving a talk in parallel with so many other talks. And on about all aspects of statistics and econometrics. One JSM (or one NIPS) is more than enough! And given that many people only came to deliver their talk, there is very little networking between research teams or mentoring of younger colleagues, as far as I can tell. And no connection with a statistical society (it would be so nice if the RSS annual conference could only attract 1600 people!). Only a “CMStatistics working group” of which I discovered I was listed as a member [and asked for removal, so far with no answer]. Whose goals and actions are unclear, except to support Elsevier journals with special issues apparently constructed on the same pattern as this conference was organised, i.e., by asking people to take care [for free!] of gathering authors on a theme of their choice. And behind this “working group” an equally nebulous structure called ERCIM.

While the “robbed” in the title could be interpreted as wondering at the reason for paying such high registration fees (£250 for very early birds), I actually got robbed of my bicycle while away at the conference. Second bike stolen within a calendar year, quite an achievement! This was an old 1990 mountain bike I had bought at Cornell and carried back to France, in such a poor state that I could not imagine anyone stealing it. Wrong prior, obviously.

## Approximate reasoning on Bayesian nonparametrics

Posted in Books, Statistics, University life on July 7, 2015 by xi'an

[Here is a call for a special issue on Bayesian nonparametrics, edited by Alessio Benavoli, Antonio Lijoi and Antonietta Mira, for an Elsevier journal I had never heard of previously:]

The International Journal of Approximate Reasoning is pleased to announce a special issue on “Bayesian Nonparametrics”. The submission deadline is *December 1st*, 2015.

The aim of this Special Issue is twofold. First, it is to give a broad overview of the most popular models used in BNP and their application in Artificial Intelligence, by means of tutorial papers. Second, the Special Issue will focus on theoretical advances and challenging applications of BNP with special emphasis on the following aspects:

• Methodological and theoretical developments of BNP
• Treatment of imprecision and uncertainty with/in BNP methods
• Formal applications of BNP methods to novel applied problems
• New computational and simulation tools for BNP inference.

## a paradox in decision-theoretic interval estimation (solved)

Posted in pictures, Statistics, Travel, University life on October 4, 2012 by xi'an

In 1993, we wrote a paper [with George Casella and Gene/Juinn Hwang] on the paradoxical consequences of using the loss function

$\text{length}(C) - k \mathbb{I}_C(\theta)$

(published in Statistica Sinica, 3, 141-155) since it led to the following property: for the standard normal mean estimation problem, the regular confidence interval is dominated by the modified confidence interval equal to the empty set when $s^2$ is too large… This was first pointed out by Jim Berger and the most natural culprit is the artificial loss function where the first part is unbounded while the second part is bounded by $k$. Recently, Paul Kabaila (whom I met in both Adelaide, where he quite appropriately commented about the abnormal talk at the conference, and Melbourne, where we met with his students after my seminar at the University of Melbourne) published a paper (first on arXiv, then in Statistics and Probability Letters) where he demonstrates that the mere modification of the above loss into

$\dfrac{\text{length}(C)}{\sigma} - k \mathbb{I}_C(\theta)$

solves the paradox! For Jeffreys’ non-informative prior, the Bayes (optimal) estimate is the regular confidence interval. Besides doing the trick, this nice resolution explains the earlier paradox as being linked to a lack of invariance in the (earlier) loss function. This is somewhat satisfying since Jeffreys’ prior is also the invariant prior in this case.
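The domination under the first loss can be checked in a couple of lines. This is only a sketch in my own notation, not necessarily the one used in either paper: I assume the variance is unknown and the regular interval is $C(\bar{x},s)=[\bar{x}-cs,\ \bar{x}+cs]$ for a fixed constant $c>0$, so that its length is $2cs$.

```latex
% Assumed notation (not from the papers): C(\bar{x},s) = [\bar{x}-cs,\ \bar{x}+cs], c > 0 fixed.
\begin{align*}
L(\theta, C) &= \text{length}(C) - k\,\mathbb{I}_C(\theta)
              = 2cs - k\,\mathbb{I}_C(\theta) \;\ge\; 2cs - k, \\
L(\theta, \emptyset) &= 0 - k \cdot 0 = 0 .
\end{align*}
% Whenever s > k/(2c), the interval incurs a strictly positive loss whatever
% the value of \theta, while the empty set incurs none:
\[
s > \frac{k}{2c} \;\Longrightarrow\; L(\theta, C) > 0 = L(\theta, \emptyset).
\]
```

Under the rescaled loss, the length term becomes $2cs/\sigma$ in this notation, whose distribution does not depend on $\sigma$, which is presumably where the invariance enters.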

## news from Elsevier

Posted in Books, Statistics, University life on July 4, 2012 by xi'an

Here is an email I got today from Elsevier:

We are pleased to present the latest Impact Factors for Elsevier’s Mathematics and Statistics journals.

 Journal of Multivariate Analysis 0.879

So there are very few journals published by Elsevier in the statistics field, which may explain the lack of strong support for the boycott launched by Tim Gowers and others. Also, the impact factors are not that great either. Not so surprising for Statistics and Probability Letters, given that they publish a high number of papers of uneven quality, but also gets a minimal 1. So it does not make too much sense for Elsevier to flaunt such data. (Once again, impact factors should not be used for assessing the quality of a journal and even less of a paper!)

## new Elsevier journal!

Posted in Books, Statistics, University life on May 11, 2012 by xi'an

Elsevier is launching a new journal called Spatial Statistics, whose goal is…

“…to be the leading journal in the field of spatial statistics. It publishes articles at the highest scientific level concerning important and timely developments in the theory and applications of spatial and spatio-temporal statistics. It favors manuscripts that present theory generated by new applications, or where new theory is applied to an important spatial problem.”

Given the Elsevier tradition of charging absurd amounts for journals, this journal is “only” 475 euros / USD 662 for libraries and institutions. (Which is actually a lot for a new journal with no credentials. And does not mean much given the “bundling” strategy of Elsevier.) And there are caveats, like the unbelievable fee of \$3,000 for Open Access publishing (“excludes taxes and other potential author fees”…) and the prohibition on posting the final version of one’s paper on arXiv (which the journal turns into a beautifully newspeak “right”: “the right to post a pre-print version of the journal article on Internet websites”). Hence, as much as I appreciate the idea of dedicating a journal to the many issues pertaining to the specific area of spatial statistics, I stick with my support of The Cost of Knowledge pledge “not to submit a paper to an Elsevier journal, not to referee for an Elsevier journal, not to join an editorial board of an Elsevier journal”. (Elsevier has recently responded to this boycott call by making minor proposals analysed in depth by Tim Gowers.)

## 64 [kcal.]

Posted in Kids, Running, Statistics, University life on April 21, 2012 by xi'an

In the science leaflets of Le Monde this weekend, the highlighted number was 64 (along with a short tribune by Cédric Villani reminding the readers of the 3% error rate in the opinion polls inundating the news these days…) Sixty-four (64 kcal/day) for the number of calories by which a U.S. youth should reduce his/her daily intake to prevent obesity. This struck me as ridiculously small and I thus went to check further.

First, 64 calories is small in that a (boiled) egg brings over 64 calories, an apple reaches 75 calories (not that there is any call for reducing apple consumption!), a plain bagel or a muffin is 190 calories, a pint of beer amounts to 200 calories, the smallest French fries portion at McDonald’s is three times this amount, etc. And, in terms of exercising, running (ok, faster than 7 miles per hour!) or active rock-climbing for a mere 4 minutes burns more than 64 calories. (If I can put some trust in this calculator!)

The reference paper containing this figure is “Reaching the Healthy People Goals for Reducing Childhood Obesity” by Wang, Orleans, and Gortmaker, from Columbia, Harvard, and Princeton, resp., published in the American Journal of Preventive Medicine. (Incidentally, an Elsevier journal, which makes the open access to the paper all the more surprising!) The first thing in the paper is that the 64 figure is the average reduction needed to stop the increase in the weight of U.S. youth, not to eliminate obesity. The second one I read from the small print is “obesity was defined as having a BMI greater than or equal to the age- and gender-specific 95th percentile on the CDC growth charts”. This means, according to the above chart, a weight above 95 kg or 210 pounds at the age of 20… I first thought this chart was based on the current population, which would have been nonsensical as the proportion would have remained at 5%! It seems to be connected to the 2000 figures, as the paper only links to the CDC website. However, this definition of obesity is problematic, as it seems to be non-stationary, i.e. to evolve with the population and hence under-estimate the problem (unless one uses a 1970 or 1950 reference). Third, and connected to the above, the goal is not expressed in terms of an ideal CDC chart but in terms of a percentage of obesity. For instance, to reach the figure of 5% of obese youths, a reduction of 177 calories per day is necessary among teenagers (still less than a fries portion!), reaching 286 for non-Hispanic black teenagers.
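The point about a percentile-based definition can be illustrated with a small simulation. This is only a sketch: the two weight distributions and all their numbers are made up for illustration and are not taken from the paper or from the CDC charts. A cut-off recomputed on the current population flags about 5% of it by construction, while a cut-off frozen on an earlier reference population shows the shift.

```python
import random


def fraction_above(weights, threshold):
    """Share of the sample strictly above the threshold."""
    return sum(w > threshold for w in weights) / len(weights)


def percentile_95(weights):
    """Empirical 95th percentile (plain order-statistic version)."""
    ordered = sorted(weights)
    return ordered[int(0.95 * len(ordered))]


rng = random.Random(0)
n = 20_000

# Hypothetical weight distributions (kg); every number here is an assumption.
reference = [rng.gauss(70, 12) for _ in range(n)]  # earlier reference population
current = [rng.gauss(75, 12) for _ in range(n)]    # heavier current population

fixed_cut = percentile_95(reference)   # cut-off frozen at the reference year
moving_cut = percentile_95(current)    # cut-off recomputed on the current population

print(f"share above fixed cut-off:  {fraction_above(current, fixed_cut):.3f}")
print(f"share above moving cut-off: {fraction_above(current, moving_cut):.3f}")
```

The second share stays near 5% however much the population shifts, which is why the reference year of the chart matters.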

The (very limited) statistical analysis is summarized by “Average annual changes in obesity prevalence and body weight in the U.S. youth population were estimated by fitting regression models to the compiled NHANES 1971–2008 data (…) controlling for age, gender, and race/ethnicity (…) using SUDAAN, version 10.0.1”. The detail of which regressors were used does not appear in the paper, as far as I read, although the note at the bottom of Table 1 seems to hint at the calendar time as being the only regressor. This regression is only used to determine a year in the past corresponding to the goals set by the federal government (Healthy People 2020 and… Healthy People 2010, although the paper just appeared. In 2012, indeed!) of an “acceptable” percentage of obesity in the population. The “daily energy requirement [corresponding to those target weights] was estimated using published equations of basal metabolic rate and activity-related energy expenditure”, nothing more sophisticated than that…

There are similarly surprising figures in the paper: eliminating sweetened beverages from schools would only mean a 12 kcal/day gain, introducing after-school physical education another mere 25 kcal/day… Overall, I am slightly puzzled by the amount of publicity this paper received, considering its limited methodological input, from Le Monde to Science Daily (which confuses the published figure with the “64-calorie difference between consumption and expenditure”).