Archive for bibliometrics

Automated promotion

Posted in University life on December 9, 2010 by xi'an

Olivier Cappé pointed me to this reference, where Cyril Labbé explains how to achieve a high ranking on Google Scholar with (fake) automatically generated papers… Of course, the ranking does not stand up to close examination, but nonetheless…

Citation abuses

Posted in Statistics on October 21, 2009 by xi'an

“There is a belief that citation statistics are inherently more accurate because they substitute simple numbers for complex judgments, and hence overcome the possible subjectivity of peer review. But this belief is unfounded.”

A very interesting report about bibliometrics and its abuses (or “bibliometrics as an abuse per se”!) appeared in the latest issue of Statistical Science. It was commissioned by the IMS, the IMU and the ICIAM. Along with the set of comments (by Bernard Silverman, David Spiegelhalter, Peter Hall and others) also posted on arXiv, it is a must-read!

“even a casual inspection of the h-index and its variants shows that these are naïve attempts to understand complicated citation records. While they capture a small amount of information about the distribution of a scientist’s citations, they lose crucial information that is essential for the assessment of research.”
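
To see concretely how much a single number collapses, here is a minimal sketch of the standard h-index computation (the function name and the sample citation counts are invented for illustration):

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

# Two very different citation records, same crude summary:
steady = [6, 6, 6, 6, 6, 6]        # six solidly cited papers -> h = 6
skewed = [100, 90, 80, 3, 2, 1]    # three blockbusters plus minor notes -> h = 3
print(h_index(steady), h_index(skewed))
```

The skewed record totals 276 citations against the steady record's 36, yet its h-index is half as large: exactly the kind of distributional information the report says gets lost.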

The issue is not gratuitous. While having Series B ranked with a high impact factor is an indicator of the relevance of a majority of the papers it publishes, there are deeper and more important issues at stake. Our grant allocations, our promotions, and our salaries depend more and more on these “objective” summaries or “comprehensive” factors. The misuse of bibliometrics stems from government bodies and other funding agencies wishing to come up with assessments of the quality of a researcher that bypass peer review and, more to the point, are easy to come by.

The report points out the many shortcomings of journal impact factors. Their two-year horizon is very short-sighted in mathematics and statistics. As an average, the impact factor is strongly influenced by outliers, like controversial papers or broad surveys, as shown by its yearly variations. Commercial productions like Thomson's miss a large part of the journals that could cite a given paper, and this is particularly true for fields at the interface between disciplines and for emergent topics. The variation in magnitude between disciplines is enormous: based on the impact factor, I'd rather publish one paper in Bioinformatics than four in the Annals of Statistics…

A second issue is that the “quality” of the journal does not automatically extend to all the papers it publishes: weighting papers by the journal impact factor thus ignores within-journal variation to an immense extent. The report illustrates this with the fact that a paper published in a journal with half the impact factor of another journal has a 62% probability of being more cited than if it had been published in that other journal! The h-index is similarly criticised by the report. More fundamentally, the report also analyses the multicriteria nature of citations, which cannot be reflected (only) as a measure of the worth of the cited papers.
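
The outlier sensitivity mentioned above is easy to see in a toy computation. The two-year impact factor is just a ratio (citations received in a year to items published in the previous two years, over the item count); all the numbers below are invented for illustration:

```python
def impact_factor(citations_this_year, items_published_prev_two_years):
    """Two-year impact factor: total citations this year to items published
    in the previous two years, divided by the number of such items."""
    return sum(citations_this_year) / items_published_prev_two_years

# A hypothetical journal: 50 typical papers, each cited twice this year.
typical = [2] * 50
print(impact_factor(typical, 50))           # 2.0

# Add one broad survey cited 200 times: the average nearly triples,
# even though the other 50 papers are unchanged.
print(impact_factor(typical + [200], 51))   # ~5.88
```

A single heavily cited survey thus dominates the journal-level average, which is one reason the report warns against reading the impact factor as the quality of a typical paper in the journal.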