Archive for Elsevier

a paradox in decision-theoretic interval estimation (solved)

Posted in pictures, Statistics, Travel, University life on October 4, 2012 by xi'an

In 1993, we wrote a paper [with George Casella and Gene (Jiunn) Hwang] on the paradoxical consequences of using the loss function

\text{length}(C) - k \mathbb{I}_C(\theta)

(published in Statistica Sinica, 3, 141-155) since it led to the following property: for the standard normal mean estimation problem, the regular confidence interval is dominated by the modified confidence interval equal to the empty set when σ is too large… This was first pointed out by Jim Berger, and the most natural culprit is the artificial loss function, where the first part is unbounded while the second part is bounded by k. Recently, Paul Kabaila (whom I met both in Adelaide, where he quite appropriately commented on the abnormal talk at the conference!, and in Melbourne, where we met with his students after my seminar at the University of Melbourne) published a paper (first on arXiv, then in Statistics and Probability Letters) where he demonstrates that the mere modification of the above loss into

\dfrac{\text{length}(C)}{\sigma} - k \mathbb{I}_C(\theta)

solves the paradox: for Jeffreys’ non-informative prior, the Bayes (optimal) estimate is the regular confidence interval. Besides doing the trick, this nice resolution explains the earlier paradox as being linked to a lack of invariance in the (earlier) loss function. This is somehow satisfying since Jeffreys’ prior also is the invariant prior in this case.
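For a concrete sense of the dominance claim and of why the rescaling helps, here is a minimal numerical sketch (my own arbitrary choices of k and of the 95% level, not taken from either paper): the risk of the usual interval [x − z σ, x + z σ] is E[length] minus k times the coverage probability, constant in θ.

```python
# Back-of-the-envelope check of the paradox and of Kabaila's fix.
# k = 5 and the 95% level are arbitrary choices, not from the papers.
from statistics import NormalDist

alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided normal quantile, about 1.96
k = 5.0                                    # reward for covering theta

for sigma in (0.5, 2.0, 10.0):
    # usual interval [x - z*sigma, x + z*sigma]:
    # risk = E[length] - k * coverage, constant in theta
    risk_unscaled = 2 * z * sigma - k * (1 - alpha)
    # with length divided by sigma, the sigma dependence disappears
    risk_scaled = 2 * z - k * (1 - alpha)
    # the empty set has zero length and never covers: risk 0 under both losses
    print(f"sigma={sigma:5.1f}  unscaled risk={risk_unscaled:7.2f}  "
          f"scaled risk={risk_scaled:6.2f}")
```

The unscaled risk turns positive as soon as σ exceeds k(1−α)/2z (about 1.21 with these values), at which point the empty set, with risk 0, dominates; the scaled risk stays at 2z − k(1−α), about −0.83, for every σ, so the paradox disappears.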

news from Elsevier

Posted in Books, Statistics, University life on July 4, 2012 by xi'an

Here is an email I got today from Elsevier:

We are pleased to present the latest Impact Factors for Elsevier’s Mathematics and Statistics journals.

Statistics and Probability Letters: 0.498

Journal of Statistical Planning and Inference: 0.716

Journal of Multivariate Analysis: 0.879

Computational Statistics and Data Analysis: 1.028

So there are very few journals published by Elsevier in the statistics field, which may explain the lack of strong support among statisticians for the boycott launched by Tim Gowers and others. The impact factors are not that great either: not so surprising for Statistics and Probability Letters, given that it publishes a high number of papers of uneven quality, but even the best of them, Computational Statistics and Data Analysis, barely exceeds 1. So it does not make much sense for Elsevier to flaunt such data. (Once again, impact factors should not be used for assessing the quality of a journal, and even less that of a paper!)

new Elsevier journal!

Posted in Books, Statistics, University life on May 11, 2012 by xi'an

Elsevier is launching a new journal called Spatial Statistics, whose goal is…

“…to be the leading journal in the field of spatial statistics. It publishes articles at the highest scientific level concerning important and timely developments in the theory and applications of spatial and spatio-temporal statistics. It favors manuscripts that present theory generated by new applications, or where new theory is applied to an important spatial problem.”

Given the Elsevier tradition of charging absurd amounts for journals, this journal is “only” 475 euros / USD 662 for libraries and institutions. (Which is actually a lot for a new journal with no credentials. And does not mean much given the “bundling” strategy of Elsevier.) And there are caveats, like the unbelievable fee of $3,000 for open access publishing (“excludes taxes and other potential author fees”…) and the prohibition on posting the final version of one’s paper on arXiv (which the journal turns into a beautifully newspeak “right”: “the right to post a pre-print version of the journal article on Internet websites“). Hence, as much as I appreciate the idea of dedicating a journal to the many issues pertaining to the specific area of spatial statistics, I stick with my support of The Cost of Knowledge pledge “not to submit a paper to an Elsevier journal, not to referee for an Elsevier journal, not to join an editorial board of an Elsevier journal“. (Elsevier has recently responded to this boycott call by making minor proposals analysed in depth by Tim Gowers.)

64 [kcal.]

Posted in Kids, Running, Statistics, University life on April 21, 2012 by xi'an

In the science pages of Le Monde this weekend, the highlighted number was 64 (alongside a short tribune by Cédric Villani reminding readers of the 3% error rate in the opinion polls inundating the news these days…) Sixty-four (64 kcal/day) is the number of calories by which a U.S. youth should reduce his/her daily intake to prevent obesity. This struck me as ridiculously small and I thus went to check further.

First, 64 calories is small in that a (boiled) egg brings over 64 calories, an apple reaches 75 calories (not that there is any call for reducing apple consumption!), a plain bagel or a muffin is 190 calories, a pint of beer amounts to 200 calories, the smallest French fries portion at McDonald’s is three times this amount, etc. And, in terms of exercising, running (ok, faster than 7 miles per hour!) or active rock-climbing for a mere 4 minutes burns more than 64 calories. (If I can put some trust in this calculator!)

The reference paper containing this figure is “Reaching the Healthy People Goals for Reducing Childhood Obesity” by Wang, Orleans, and Gortmaker, from Columbia, Harvard, and Princeton, resp., published in the American Journal of Preventive Medicine. (Incidentally, an Elsevier journal, making the open access to the paper all the more surprising!) The first thing to note in the paper is that the 64 kcal figure is the average reduction needed to stop the nation’s increase in the weight of U.S. youth, not to eliminate obesity. The second, which I read in the small print, is that “obesity was defined as having a BMI greater than or equal to the age- and gender-specific 95th percentile on the CDC growth charts“. This means, according to the above chart, a weight above 95 kg or 210 pounds at the age of 20… I first thought this chart was based on the current population, which would have been nonsensical as the proportion would have remained at 5%! It seems to be connected to the 2000 figures, as the paper only links to the CDC website. However, this definition of obesity is problematic, as it seems to be non-stationary, i.e. to evolve with the population and hence to under-estimate the problem (unless one uses a 1970 or 1950 reference). Third, and connected to the above, the goal is not expressed in terms of an ideal CDC chart but in terms of a percentage of obesity. For instance, to reach the figure of 5% obesity, a reduction of 177 calories per day is necessary among teenagers (still less than a fries portion!), reaching 286 for non-Hispanic black teenagers.

The (very limited) statistical analysis is summarized by “Average annual changes in obesity prevalence and body weight in the U.S. youth population were estimated by fitting regression models to the compiled NHANES 1971–2008 data (…) controlling for age, gender, and race/ethnicity (…) using SUDAAN, version 10.0.1″. The detail of which regressors were used does not appear in the paper, as far as I read, although the note at the bottom of Table 1 seems to hint at calendar time as being the only regressor. This regression is only used to determine a year in the past corresponding to the goals set by the federal government (Healthy People 2020 and… Healthy People 2010, although the paper just appeared. In 2012, indeed!) of an “acceptable” percentage of obesity in the population. The “daily energy requirement [corresponding to those target weights] was estimated using published equations of basal metabolic rate and activity-related energy expenditure”, nothing more sophisticated than that…

There are similarly surprising figures in the paper: eliminating sweetened beverages from schools would only mean a 12 kcal/day gain, introducing after school physical education another mere 25 kcal/day… Overall, I am slightly puzzled by the amount of publicity this paper received, considering its limited methodological input, from Le Monde to Science Daily (which confuses the published figure with the “64-calorie difference between consumption and expenditure“).

Bayesian computing, methods and applications (and Elsevier)

Posted in Books, Statistics, University life on April 13, 2012 by xi'an

I received an email this weekend calling for submissions to a special issue of Computational Statistics and Data Analysis on the special topic Bayesian computing, methods and applications, edited by Cathy Chen, David Dunson, Sylvia Frühwirth-Schnatter, and Stephen Walker. The theme is

The last two decades have seen an explosion in the popularity and use of Bayesian methods, largely as a result of the advances in sampling based approaches to inference. At the same time, important advances and developments in methodology have coincided with highly sophisticated breakthroughs in computational techniques. Consequently, practitioners are increasingly turning to Bayesian methods so as to effectively tackle more complex and realistic models and problems, particularly as richer sources of data continue to become available. The primary aim of the issue is to illustrate and showcase recent advances in Bayesian computation and inferential methods, as well as highlight their application to empirical problems in a broad range of areas, including econometrics, biology, finance and medicine, amongst many others. Methodological contributions that highlight recent developments in Bayesian computing are strongly encouraged.

The papers should have a computational or advanced data analytic component in order to be considered for publication. Authors who are uncertain about the suitability of their papers should contact the special issue editors. All submissions must contain original unpublished work not being considered for publication elsewhere. Submissions will be refereed according to standard procedures for Computational Statistics and Data Analysis. The deadline for submissions is 30 June 2012.

Unfortunately, this journal is published by Elsevier, the costly, much too costly, publisher [Computational Statistics and Data Analysis costs for instance 2,763 euros per year for institutions and libraries!, Journal of Multivariate Analysis is 2,704 euros...] and, being completely in agreement with the boycott’s position, I signed the Cost of Knowledge pledge a few weeks ago [although I do not yet appear on the list], meaning I now abstain from supporting the extremely unbalanced business model of Elsevier through publishing, refereeing, or (associate) editing in any of the journals it publishes. (Which means I decline refereeing requests on this sole ground about once a week now.) Even though Elsevier published a letter to mathematicians a few weeks ago, I doubt they can modify their business model so drastically as to get down to average prices for their journals. Unless the pressure from the community is so committed and shared that the flow of submissions dries out, which I doubt will occur on a short time-scale. (The impact of a reduced submission pool on citation indices and impact factors is on another scale…) Jean-Michel pointed me to this arXiv report [to appear in the Notices of the American Mathematical Society] by Douglas Arnold and Henry Cohn on the Cost of Knowledge boycott (analysed by mathematicians, not statisticians). It seems that the boycott has not had as much impact on the statistics community, judging from the number of signatures registered so far.

As a coincidence, I read today in the Guardian that the Wellcome Trust is throwing its weight behind open access publishing, threatening to withdraw funding from researchers who do not “ensure that results are freely available to the public within six months of first publication”. Elsevier’s statement is not encouraging, though: “we will also remain committed to the subscription model. We want to be able to offer our customers choice, and we see that, in addition to new models the subscription model remains very much in demand.” (I do not see the connection between high subscription rates and choice, nor any proof that anyone demands high subscription rates!) Another coincidence is that I also got an email about The Open Statistics & Probability Journal, which is a free, peer-reviewed, on-line journal. Which shows that some publishers have found a business model compatible with open access, if not a good solution in my opinion: just charge the authors $400 per published paper…
