**A** few weeks ago and then some, I [as occasional blogger!] got contacted by datazar.com to write a piece on this data-sharing platform. I then went and checked what this was all about, having the vague impression this was a platform where I could store and run R codes, besides dropping collective projects, but from what I quickly read, it sounded more like being able to run R scripts from one’s machine using data and code stored on datazar.com. After reading just one more blog entry, I finally understood it is also possible to run R, SQL, NotebookJS (and LaTeX) directly on that platform, without downloading code or data to one’s machine. Which is a definite plus for this site, as users can experiment with no transfer to their computer, hence on a larger variety of platforms. While personally I do not [yet?] see how to use it for my research or [limited] teaching, it seems like an [yet another] interesting exploration of the positive uses of the Internet to collaborate and communicate on scientific issues! With no opinion on the privacy and data protection offered by the site, of course.

Just saw this nice review of R for Dummies. And thought after this afternoon's class that my students in the simulation course at Paris-Dauphine could clearly benefit from reading it! They in fact had a terrible time simulating a truncated normal distribution by accept-reject, as they could not get the notion of normalising constants… (Yes, indeed, this very truncated normal distribution!) Even the validity of simulating a normal variate until the truncation is satisfied was not obvious to them and they took forever to program the corresponding code. Anyway, I will certainly order the book to check for myself (after receiving

to make sure I use the right vocabulary, even though it is a bit too light in the end…)! And write a review for CHANCE if it generates enough interest in doing so…
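For the record, the "simulate a normal variate until the truncation is satisfied" approach fits in a few lines. What follows is only a minimal Python sketch (not the R code used in class): the function names `truncated_normal` and `acceptance_probability` and the interval [1, 2] are made up for illustration. Its validity rests on the accepted draws having a density proportional to the normal density on the interval, so the normalising constant is never needed to run the algorithm; it only shows up as the acceptance probability.

```python
import math
import random


def truncated_normal(mu, sigma, lower, upper, rng=random):
    """Draw one N(mu, sigma^2) variate conditioned on lower <= x <= upper.

    Naive accept-reject: propose from the untruncated normal and keep the
    first proposal that falls inside the interval. Returns the accepted
    value together with the number of proposals it took.
    """
    if not lower < upper:
        raise ValueError("lower must be strictly below upper")
    tries = 0
    while True:
        tries += 1
        x = rng.gauss(mu, sigma)
        if lower <= x <= upper:
            return x, tries


def acceptance_probability(mu, sigma, lower, upper):
    """P(lower <= X <= upper) for X ~ N(mu, sigma^2), i.e. the normalising constant."""
    def cdf(t):
        return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))
    return cdf(upper) - cdf(lower)


# Illustrative run: standard normal truncated to [1, 2], with a fixed seed.
rng = random.Random(42)
draws = [truncated_normal(0.0, 1.0, 1.0, 2.0, rng) for _ in range(5000)]
samples = [x for x, _ in draws]
mean_tries = sum(t for _, t in draws) / len(draws)
```

Since the number of tries is geometric, `mean_tries` should be close to `1 / acceptance_probability(0.0, 1.0, 1.0, 2.0)`, about 7.4 here, which also shows why this naive version becomes hopeless when the truncation sits far in the tail.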

I just received an email from Austria telling me that the 'Og is now part of a blog aggregator, Particle physics planet, along with an impressive list of particle physics blogs. It is certainly an honour to be associated with this blog, even though I fear my random ratiocinations very rarely reach the shore of the particle physics universe… (I think this is my third "aggregation", after becoming part of R-bloggers and of mathblogging.org.)

As the 'Og reached its 1500th post and 3000th comment at exactly the same time, here is a wee and only mildly interesting Sunday morning foray into what was posted so far and attracted the most attention (using the statistics provided by WordPress). The most visited posts:

Title | Views |
---|---|
Home page | 203,727 |
In{s}a(ne)!! | 7,422 |
“simply start over and build something better” | 6,264 |
Julien on R shortcomings | 2,676 |
Sudoku via simulated annealing | 2,402 |
About | 1,876 |
Of black swans and bleak prospects | 1,768 |
Solution manual to Bayesian Core on-line | 1,628 |
Parallel processing of independent Metropolis-Hastings algorithms | 1,625 |
Bayesian p-values | 1,595 |
Bayes’ Theorem | 1,537 |
#2 blog for the statistics geek?! | 1,526 |
Do we need an integrated Bayesian/likelihood inference? | 1,501 |
Coincidence in lotteries | 1,396 |
Solution manual for Introducing Monte Carlo Methods with R | 1,340 |
Julian Besag 1945-2010 | 1,293 |
Tornado in Central Park | 1,093 |
The Search for Certainty | 1,016 |

Hence, three R posts (incl. one by Julien and one by Ross Ihaka), three (critical) book reviews, two solution manuals, two general Bayesian posts, two computational entries, one paper (with Pierre Jacob and Murray Smith), one obituary, and one photograph news report… Altogether in line with the main purpose of the 'Og. The most commented posts:

Not exactly the same as above! In particular, the posts about ABC model choice and our PNAS paper got into the list. Finally, the top search terms:

Search | Views |
---|---|
surfers paradise | 1,050 |
benidorm | 914 |
introducing monte carlo methods with r | 514 |
andrew wyeth | 398 |
mistborn | 352 |
abele blanc | 350 |
nested sampling | 269 |
particle mcmc | 269 |
bayesian p-value | 263 |
julian besag | 257 |
rites of love and math | 249 |
millenium | 237 |
bayesian p value | 222 |
marie curie | 221 |
bonsai | 200 |

(out of which I removed the dozens of variations on xian's blog). I find it rather sad that both top entries are beach towns that are completely unrelated to my lifestyle and to my vacation places. Overall, more than half of those entries do not strongly relate to the contents of the 'Og (even though I did post at length about Sanderson's Mistborn and Larsson's Millennium trilogies).

which include links to my books on Amazon, Andrew Gelman’s, Terry Tao’s, Radford Neal’s and Romain François’s blogs, the CREST stat students collective blog, and a few arXiv papers of mine…

Thanks to a link on R-bloggers, I was introduced to Luis Apiolaza's blog, Quantum Forest, which covers data analyses and R comments he encounters in his research as a quantitative forester/geneticist. And he works at the University of Canterbury, Christchurch, where I first taught from Bayesian Core in 2006. Which may be why he chose Bayesian Core as one of the three books he is currently reading to understand Bayesian statistics better. (The other two are Jim Albert's

*, and Bill Bolstad’s*

which is not the one I reviewed recently.) Luis has just started the book but he mentions that "the book has managed to capture my interest", which is real nice, while being annoyed by the self-contained label we put on the back cover. Which is a reaction I also got from some students when teaching the book for a week in Australia, as they thought they could take it without a probability background. Hopefully, we'll manage to complete our revision before next summer!

Following a link on R-bloggers, I ended up on this page (with a completely useless graph that only contained the pieces of information 5% in 1900 and 55% in 2000). The author (Ralph Keeney) reports that "A remarkable 55 percent of deaths for people age 15 to 64 can be attributed to decisions with readily available alternatives." This sounded to me like a highly dubious finding… So I looked at the paper itself, reading that

“A personal decision is a situation where an individual can make a choice among two or more alternatives. This assumes that the individual recognizes that he or she has a choice and has control of this choice. Readily available alternatives are alternatives that the decision maker would have known about and could have chosen without investing much time or money.” Ralph Keeney

**T**his categorisation of deaths is highly debatable, in that choice is not always *that* available! So I do not see how the author can assert which percentage of the individuals truly have *control of the choice*… (For instance, can people refuse to do dangerous jobs when they desperately need a job? Or when the dangerousness is an abstract concept as, say, for a Fukushima worker? Is obesity a sheer matter of will?) Furthermore, the jump from 5% to 55% is also highly shaky: “Clearly, one should not put much credibility in this 22% for 1950 or the corresponding 5% for 1900”. In the end, it seems that the whole issue of the paper is about the amount of information: “in 1900 the knowledge about and ability to avoid many of the causes of death would seem to be much lower than in 2000”. So life has not been getting more dangerous or people sillier, simply information about the causes of deaths has become more widespread. I am thus surprised at the low level of academic input contained in the paper (look at the “life-saving decisions”!), which may actually explain the echo it found on the blogosphere. *(This post also appeared on the Statistics Forum.)*

This morning, I noticed that none of my R-related posts had appeared on R-bloggers for the past fortnight… After investigating, this was caused by… cut-and-paste! Indeed, when advertising the special issue of TOMACS that Arnaud Doucet and I are editing on Monte Carlo methods in Statistics, I copied the main parts from the pdf announcement, straight out of Acrobat, and the word "field" was used, involving a ligature between the f and the i that did not get copied in proper UTF-8:

error on line 434 at column 330: Input is not proper UTF-8, indicate encoding ! Bytes: 0x0C 0x65 0x6C 0x64
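The bytes in this message decode to a form feed (0x0C) followed by "eld", which suggests the ligature came out of the pdf as a control character rather than as the Unicode ligature U+FB01. As a hedged illustration of how such pasted text could be checked before posting, here is a minimal Python sketch; the helper names `find_bad_controls` and `expand_ligatures` and the two sample strings are hypothetical, and I am assuming the feed parser choked on the control character, since XML 1.0 only allows tab, line feed and carriage return below U+0020.

```python
import unicodedata

# Control characters that XML 1.0 accepts below U+0020.
XML_ALLOWED_CONTROLS = {"\t", "\n", "\r"}


def find_bad_controls(text):
    """Return (index, code point) pairs for control characters XML 1.0 rejects."""
    return [
        (i, hex(ord(ch)))
        for i, ch in enumerate(text)
        if ord(ch) < 0x20 and ch not in XML_ALLOWED_CONTROLS
    ]


def expand_ligatures(text):
    """Expand compatibility ligatures (e.g. U+FB01) into plain letters via NFKC."""
    return unicodedata.normalize("NFKC", text)


pasted_ok = "a new \ufb01eld"   # ligature copied as a genuine Unicode character
pasted_bad = "a new \x0celd"    # ligature mangled into a form feed, as in the error

clean = expand_ligatures(pasted_ok)       # the ligature becomes the letters f and i
problems = find_bad_controls(pasted_bad)  # locates the stray form feed
```

Note that NFKC normalisation rewrites other compatibility characters as well (superscripts, full-width letters and the like), so it is best applied to the pasted fragment only, and a mangled control character still has to be retyped by hand once located.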

So here are the entries in the ‘Og for the past 16 days that could have been of interest for R-bloggers readers: