exoplanets at 99.999…%

The latest Significance has a short article providing some coverage of the growing trend in the discovery of exoplanets, including new techniques used to detect those exoplanets from their impact on the associated stars. This [presumably] comes from the recent book Cosmos: The Infographics Book of Space [a side comment: new books seem to provide material for many articles in Significance these days!] and the above graph is also from the book, not the ultimate infographic representation in my opinion given that a simple superposition of lines could do as well. Or better.

“A common approach to ruling out these sorts of false positives involves running sophisticated numerical algorithms, called Monte Carlo simulations, to explore a wide range of blend scenarios (…) A new planet discovery needs to have a confidence of (…) a one in a million chance that the result is in error.”

The above sentence is obviously of interest, first because the detection of false positives by Monte Carlo hints at a rough version of ABC to assess the likelihood of the observed phenomenon under the null [no detail provided] and second because the probability statement in the end is quite unclear as to its foundations… Reminding me of the Higgs boson controversy. The very last sentence of the article is however brilliant, albeit maybe unintentionally so:

“To date, 1900 confirmed discoveries have been made. We have certainly come a long way from 1989.”

Yes, 89 down, strictly speaking!
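Since the article gives no detail on the Monte Carlo step, here is a minimal sketch of what simulating under the null can look like. It is not the astronomers' procedure: the null model `null_signal`, the observed value 0.5, and the function names are all assumptions of mine, chosen only to make the idea concrete.

```python
import random

def monte_carlo_tail_probability(observed, simulate_null, n_sims=20_000, seed=1):
    """Estimate P(statistic >= observed) under the null by plain simulation."""
    rng = random.Random(seed)
    hits = sum(simulate_null(rng) >= observed for _ in range(n_sims))
    # Add-one correction: the estimate can never be exactly zero.
    return (hits + 1) / (n_sims + 1)

def null_signal(rng, n_obs=20):
    # Hypothetical null scenario: the average of n_obs pure-noise measurements.
    return sum(rng.gauss(0.0, 1.0) for _ in range(n_obs)) / n_obs

# Hypothetical observed statistic, for illustration only.
p_hat = monte_carlo_tail_probability(observed=0.5, simulate_null=null_signal)
print(p_hat)
```

With n_sims simulations the smallest value this estimate can take is 1/(n_sims+1), so backing a one-in-a-million statement this way requires well over a million draws from the null, or a smarter simulation scheme.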

Statistics done wrong [book review]

no starch press (!) sent me the pdf version of this incoming book, Statistics done wrong, by Alex Reinhart, towards writing a book review for CHANCE, and I read it over two flights, one from Montpellier to Paris last week, and one from Paris to B’ham this morning. The book is due to appear on March 16. It expands on a still existing website developed by Reinhart. (Discussed a year or so ago on Andrew’s blog, mostly in comments, witness Andrew’s comment below.) Reinhart is, incidentally or not, a PhD candidate in statistics at Carnegie Mellon University. After apparently a rather substantial undergraduate foray into physics. Quite an unusual level of maturity and perspective for a PhD student..!

“It’s hard for me to evaluate because I am so close to the material. But on first glance it looks pretty reasonable to me.” A. Gelman

Overall, I found myself enjoying reading the book, even though I found the overall picture of the infinitely many mis-uses of statistics rather grim and a recipe for despairing of ever setting things straight..! Somehow, this is an anti-textbook, in that it warns about many ways of applying the right statistical technique in the wrong setting, without ever describing those statistical techniques. Actually without using a single maths equation. Which should be a reason good enough for me to let all hell break loose on that book! But, no, not really, I felt no compunction about agreeing with Reinhart’s warning and if you have been reading Andrew’s blog for a while you should feel the same…

“Then again for a symptom like spontaneous human combustion you might get excited about any improvement.” A. Reinhart (p.13)

Maybe the limitation in the exercise is that statistics appears so much fraught with dangers of over-interpretation and false positives and that everyone (except physicists!) is bound to make such invalidated leaps in conclusion, willingly or not, that it sounds like the statistical side of Gödel’s impossibility theorem! Further, the book moves from recommendation at the individual level, i.e., on how one should conduct an experiment and separate data for hypothesis building from data for hypothesis testing, to a universal criticism of the poor standards of scientific publishing and the unavailability of most datasets and codes. Hence calling for universal reproducibility protocols that reminded me of the directions explored in this recent book I reviewed on that topic. (The one the rogue bird did not like.) It may be missing on the bright side of things, for instance the wonderful possibility to use statistical models to produce simulated datasets that allow for an evaluation of the performances of a given procedure in the ideal setting. Which would have helped the increasingly depressed reader in finding ways of checking how wrong things could get..! But also on the dark side, as it does not say much about the fact that a statistical model is most presumably wrong. (Maybe a physicist’s idiosyncrasy!) There is a chapter entitled Model Abuse, but all it does is criticise stepwise regression and somehow botches the description of Simpson’s paradox.
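To make the bright side concrete, here is a minimal sketch of the kind of simulation check I have in mind. It is my own toy example, not something taken from the book: data are simulated from a known Normal model and I count how often a nominal 95% interval for the mean, built with the 1.96 Normal quantile and the sample standard deviation, actually covers the truth. The function name `coverage` and all parameter values are assumptions for illustration.

```python
import random
import statistics

def coverage(n_obs, n_sims=5_000, true_mean=0.0, true_sd=1.0, seed=42):
    """Fraction of simulated datasets whose nominal 95% interval covers true_mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_sims):
        # Simulate one dataset from the (here, known) model.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n_obs)]
        centre = statistics.fmean(sample)
        # Normal-quantile interval with an estimated standard deviation.
        half_width = 1.96 * statistics.stdev(sample) / n_obs ** 0.5
        covered += (centre - half_width <= true_mean <= centre + half_width)
    return covered / n_sims

small = coverage(n_obs=10)   # small samples: expect coverage below the nominal 95%
large = coverage(n_obs=200)  # larger samples: expect coverage close to 95%
print(small, large)
```

Even in this ideal setting where the model is correct, the small-sample interval undercovers because 1.96 ignores the extra variability from estimating the standard deviation, which is exactly the sort of thing a reader can check for her own procedure before trusting it.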

“You can likely get good advice in exchange for some chocolates or a beer or perhaps coauthorship on your next paper.” A. Reinhart (p.127)

The final pages are however quite redeeming in that they acknowledge that scientists from other fields cannot afford a solid enough training in statistics and hence should hire statisticians as consultants for the data collection, analysis and interpretation of their experiments. A most reasonable recommendation!

top posts for 2014

Here are the most popular entries for 2014:

What I appreciate from that list is that (a) book reviews [of stats books] get a large chunk (50%!) of the attention and (b) my favourite topics of Bayesian testing, parallel MCMC and MCMC on zero measure sets made it to the top list. Even the demise of the Bayes factor that was only posted two weeks ago!

the dark defiles

The final and long-awaited volume of a series carries so much expectation that it more often than not ends up disappointing [me]. The Dark Defiles somewhat reluctantly falls within this category… This book is the third instalment of Richard K. Morgan’s fantasy series, A Land Fit for Heroes. Of which I liked mostly the first volume, The Steel Remains. When considering that this first book came out in January 2009, about six years ago, this may account for the somewhat desultory tone of The Dark Defiles. As well as the overwhelming amount of info-dump needed to close the many open threads about the nature of the Land Fit for Heroes.

“They went. They dug. Found nothing and came back, mostly in the rain.”

[Warning: some spoilers in the following!] The most striking imbalance in the story is the rather mundane pursuits of the three major heroes, from finding an old sword to avenging fallen friends here and there, against the threat of an unravelling of the entire Universe and of the disappearance of the current cosmology. In addition, the absolute separation maintained by Morgan between Archeth and Ringil kills some of the alchemy of the previous books and increases the tendency to boring inner monologues. The volume is much, much more borderline science-fiction than the previous ones, which obviously kills some of the magic, given that the highest powers that be sound like a sort of meta computer code that eventually gives Ringil the ultimate decision. As often, this mix between fantasy and science-fiction is not much to my taste, since it gives too much power to the foreign machines, the Helmsmen, which sound like they are driving the main human players for very long term goals. And which play too often deus ex machina to save the “heroes” from unsolvable situations. Overall a wee bit of a lengthy book, with a story coming to an unexpected end in the very final pages, leaving some threads unexplained and some feeling that style prevailed over story. But nonetheless a page turner in its second half.

amazonish thanks (& repeated warning)

As in previous years, at about this time, I want to (re)warn unaware ‘Og readers that all links to and more rarely to found on this blog are actually susceptible to earn me an advertising percentage if a purchase is made by the reader in the 24 hours following the entry on Amazon through this link, thanks to the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to Unlike last year, I did not benefit as much from the new edition of Andrew’s book, and the link he copied from my blog entry… Here are some of the most Og-unrelated purchases:

Once again, books I reviewed, positively or negatively, were among the top purchases… Like a dozen Monte Carlo simulation and resampling methods for social science, a few copies of Naked Statistics. And again a few of The Cartoon Introduction to Statistics. (Despite a most critical review.) Thanks to all of you using those links and feeding further my book addiction, with the drawback of inducing even more fantasy book reviews.

a pile of new books

I took the opportunity of my weekend trip to Gainesville to order a pile of books on amazon, thanks to my amazon associate account (and hence thanks to all Og’s readers doubling as amazon customers!). The picture above is missing two Rivers of London volumes by Ben Aaronovitch that I already read and left at the office. And reviewed in incoming posts. Among those,

(Obviously, all “locals” sharing my taste in books are welcome to borrow those in a very near future!)

3,000 posts and 1,000,000 views so far…

As the ‘Og went over its [first] million views and 3,000 posts since its first post in October 2008, here are the most popular entries (lots of book reviews, too many obituaries, and several guest posts):

