In addition to the solution to the wrong problem, last weekend's Le Monde also dedicated a full page of its Science supplement to Michael Mann's hockey stick curve of temperature increase and to the hard time climate skeptics have given him since its publication in 1998… The page includes an insert on Ed Wegman's [infamous] 2006 report for the U.S. Congress, amply documented on Andrew's blog, and mentions the May 2011 Nature editorial on the plagiarism investigation. (I reproduce it above, as it is not available on the Le Monde website.)
A few weeks ago, I was asked to act as an external referee for a PhD thesis. The thesis involved some improvement upon standard statistical methodology, with applications to another field. When I eventually received the PhD document, I discovered that it opened with a preface (written by the PhD student) claiming that the student's work had been used by co-workers, including the PhD supervisor, and published in a refereed journal without the student's name or agreement, and with some fabricated data to boot… This came as quite a shock, as I had not been made aware of this super-delicate issue a priori. And I had no information on the published piece of work, which seemed to belong to the other field (I have not been able to find it since). When I complained to the university, I was transferred to the dean of graduate studies, who almost immediately withdrew the request for a PhD evaluation [by me]…
I find the whole affair quite bizarre, and somewhat perturbing. Indeed, when I contacted the university again to voice my concerns, I got the following [edited and possibly translated] email:
As I’m sure you can appreciate, this is an unusual case. [We were] not able to alert you to this when nominating you as examiners, as it is important that we follow our University process and allow examiners to reach independent conclusions as to the value of the work before them. [We are] bound by our PhD Statute and would be prejudicing the examination process if [we] provided additional information to examiners. [We] would also be providing a route for the candidate to appeal the outcome of the examination process.
This does not make any sense to me, given that any referee of this thesis will hit the same issue when reading its first pages… Either the PhD student should remove the complaint from the PhD document (which does not seem right, given that there is a published paper containing some of the results claimed in the thesis, even though referees from statistics are very unlikely to be aware of it since, again, I could not find the corresponding paper), or the full information should be provided to the referees of the thesis so that they can judge the matter in full light… I do not see how I could pursue the matter any further, but the whole story left me feeling quite uncomfortable.
In what seems like an endless cuRse, I found this week that I had to re-grade a dozen R exams a TA had not graded properly! The grades I gave (X) are plotted below against those of my TA (Y); there is little connection between the two gradings… As if this was not enough trouble, I also found exactly duplicated R code in another R project around Introducing Monte Carlo Methods with R that was returned a few weeks ago, meaning I will have to draft a second-round exam… (As Tom commented on an earlier post, team resolution of a given problem may be a positive attitude, but in the current case one student provided an A⁺⁺ answer, while two others clearly drafted a hasty resolution from the original.) Nonetheless, do not worry, I still love [teaching] R!
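As a minimal illustration of that kind of comparison (the actual grades are obviously not reproduced here, so the numbers below are simulated stand-ins), one can plot one grading against the other and add the agreement line in a few lines of R:

```r
## illustrative only: simulated grades standing in for the real ones
set.seed(1)
ta   <- round(runif(12, min = 5, max = 20))               # TA's grades (Y)
mine <- pmin(20, pmax(0, ta + round(rnorm(12, sd = 4))))  # my re-grades (X)
plot(mine, ta, xlab = "my grade (X)", ylab = "TA's grade (Y)", pch = 19)
abline(0, 1, lty = 2)  # the line both gradings would sit on if they agreed
cor(mine, ta)          # a crude numerical summary of the (dis)agreement
```

Points scattered far from the dashed diagonal (and a low correlation) are exactly the "little connection" pattern I observed.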
I was updating my entries on HAL from my arXives and found this top ten ranking of my papers:
- Sélection bayésienne de variables en régression linéaire, with A. Guillin and J.-M. Marin inria-00077857
- Adaptive Importance Sampling in General Mixture Classes, with O. Cappé, R. Douc, A. Guillin and J.-M. Marin inria-00181474/hal-00180669
- A Bayesian reassessment of nearest-neighbour classification, with L. Cucala, J.-M. Marin and D.M. Titterington inria-00143783
- Deviance Information Criteria for Missing Data Models, with G. Celeux, F. Forbes and D.M. Titterington inria-00071724
- Minimum variance importance sampling via Population Monte Carlo, with R. Douc, A. Guillin and J.-M. Marin inria-00070316
- Computational and Inferential Difficulties with Mixture Posterior Distributions, with J.-M. Marin inria-00073049
- Are risk averse agents more optimistic? A Bayesian estimation approach, with S. Benmansour, E. Jouini, C. Napp and J.-M. Marin halshs-00163678
- Convergence of adaptive sampling schemes, with R. Douc, A. Guillin and J.-M. Marin inria-00070522
- Brownian Confidence Bands on Monte Carlo Output, with W. Kendall and J.-M. Marin inria-00070571
- Iterated importance sampling in missing data problems, with G. Celeux and J.-M. Marin inria-00070473
Nothing much to comment on, except that those are only recent papers (obviously, since HAL itself is a recent creation), a large majority of which revolve around population Monte Carlo (and almost all co-authored with Jean-Michel Marin!). The #9, with Wilfrid Kendall and Jean-Michel Marin, is clearly very popular, as someone attempted to plagiarise it! The #1 comes as a real surprise, given that it is in French and more of a survey.
Last morn, I had the surprise of receiving the following email:
This is to inform you that the following abstract has been submitted to the 3rd International Conference of the ERCIM WG on COMPUTING & STATISTICS (ERCIM’10)
Title: Goodness of Fit Via Mixtures of Beta distributions
Keywords: nonparametric estimation, posterior conditional predictive p-value.
Abstract: We consider a Bayesian approach to goodness of fit, that is, to the problem of testing whether or not a given parametric model is compatible with the data at hand. Since we are concerned with a goodness of fit problem, it is more of interest to consider a functional distance to the tested model, d(F, F_θ), as the basis of our test, rather than the corresponding Bayes factor, since the latter puts more emphasis on the parameters. It is both of high interest and of strong difficulty to come up with a satisfactory notion of a Bayesian test for goodness of fit to a distribution or to a family of distributions.
The abstract is a plagiarism of your work.
I am informing you about this in case the author has tried to plagiarize the whole paper. The same author has submitted a second abstract plagiarizing another paper. The author uses bogus affiliations and I cannot trace his institution, in case he has one.
It is somehow comforting to see that such a gross example of plagiarism can get detected, despite the fact that our paper never got published. Although I am sure there must be conferences that do not apply any filter on submissions…
This paper with Judith Rousseau was once submitted to Series B, but I never managed to complete the requested revision, for programming reasons: the task of modifying the several thousand lines of C code driving the beta mixture estimation filled me with paralysing dread! This is actually when I stopped programming in C (whether I ever really programmed in C is debatable!). This is unfortunate, as the spirit of the paper was quite nice, using an idea borrowed from Verdinelli and Wasserman to build a genuine Bayesian goodness-of-fit test… I do not think there is much to salvage at this late stage, given the explosion of Bayesian non-parametrics.
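For the curious, the gist of the idea can be conveyed in a few lines of R rather than thousands of lines of C. This is only a toy sketch of the general flavour (under the tested model, the probability integral transform of the data is uniform, and lack of fit shows up as non-uniformity that a beta component can absorb), fitted here by a crude maximum-likelihood step rather than by the paper's actual Bayesian machinery:

```r
## toy sketch: u = F_theta(x) should be Uniform(0,1) under the tested model
set.seed(42)
x <- rexp(200)                             # data, actually exponential
u <- pnorm(x, mean = mean(x), sd = sd(x))  # PIT under a (wrong) normal fit

## negative log-likelihood of w * U(0,1) + (1 - w) * Beta(a, b),
## with parameters transformed so that optim() runs unconstrained
nllk <- function(p) {
  w <- plogis(p[1]); a <- exp(p[2]); b <- exp(p[3])
  -sum(log(w + (1 - w) * dbeta(u, a, b)))
}
fit <- optim(c(0, 0, 0), nllk)
plogis(fit$par[1])  # weight of the uniform component: near 1 means good fit
```

The paper itself puts a prior on the mixture and produces a posterior assessment instead of this point estimate, but the weight of the uniform component already acts as the fit diagnostic.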