## If the Dickey-Savage ratio does not hold…

Posted in Statistics on September 30, 2009 by xi'an

or rather is arbitrary, then this should cause problems for methods, papers, and books that do use it, no…?! This is the question I got on Monday from an Og reader.

Actually, not necessarily: when looking for instance at O’Hagan and Forster’s Advanced Theory of Statistics 2B (2004, 7.16), the Bayes factor they construct in the ${\mathcal N}(\mu,\sigma^2)$ model, when testing for $\sigma=\sigma_0$, is correct for the same reason the “proof” of the Dickey-Savage ratio was accepted, namely the use of the “right” version of the conditional density. Similarly, when Chen, Shao and Ibrahim (2000, Section 5.10.3) introduce the ratio, and the Verdinelli-Wasserman (1996) generalisation, they implement the Monte Carlo approximation using a specific version of the full conditionals.

The advantage of our perspective is therefore to give a general representation that does not involve unnatural constraints on the priors. The example below is taken from a probit example in the forthcoming note with Jean-Michel Marin, comparing the Dickey-Savage-like approximation to the Bayes factor with a harmonic mean version and the unbeatable Chib’s solution.

I nonetheless found [by Googling] one case where the Dickey-Savage ratio was leading to a definition problem, namely in a preprint posted by Wetzels, Grasman, and Wagenmakers, where the authors derive the Bayes factor as the Dickey-Savage ratio under an encompassing prior constructed earlier by Klugkist et al. (2005), using a limiting argument via L’Hospital’s rule that seems equally contradictory with measure-theoretic principles. Paradoxically, the authors mention the Borel-Kolmogorov paradox, i.e. the dependence on the conditioning σ-algebra, as a possible issue with their prior construction but, while their appendix A clearly concludes that the limit is arbitrary, they still brush aside the issue of the choice of the version of the conditional density.
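For readers who have not met the ratio before, here is a minimal sketch of what it computes in the simplest possible setting. Everything in it is an assumption made for illustration and is not taken from the note or the papers above: a sample of size `n` from ${\mathcal N}(\mu,1)$ with known unit variance, a ${\mathcal N}(0,\tau^2)$ prior on $\mu$ under the alternative, the point null $\mu=0$, and the hypothetical function names `dickey_savage_bf01` and `marginal_bf01`. With no nuisance parameter there is no ambiguity about the version of the conditional density, so the ratio of posterior to prior density at $\mu=0$ should agree with the Bayes factor obtained directly from the two marginal likelihoods.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def dickey_savage_bf01(xbar, n, tau2):
    """Posterior density of mu at 0 divided by its prior density at 0."""
    # Conjugate update for a N(mu, 1) sample with a N(0, tau2) prior on mu.
    post_var = 1.0 / (n + 1.0 / tau2)
    post_mean = n * xbar * post_var
    return normal_pdf(0.0, post_mean, post_var) / normal_pdf(0.0, 0.0, tau2)


def marginal_bf01(xbar, n, tau2):
    """Bayes factor of H0: mu = 0 against H1, from the marginals of the sample mean."""
    # The sample mean is sufficient: N(0, 1/n) under H0 and N(0, tau2 + 1/n) under H1.
    return normal_pdf(xbar, 0.0, 1.0 / n) / normal_pdf(xbar, 0.0, tau2 + 1.0 / n)


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    xbar, n, tau2 = 0.3, 20, 2.0
    print(dickey_savage_bf01(xbar, n, tau2), marginal_bf01(xbar, n, tau2))
```

Both functions should return the same number up to floating-point error. The measure-theoretic difficulty discussed in this post only appears once nuisance parameters enter and a particular version of the conditional prior has to be picked.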

## Introduction to Bayesian analysis

Posted in Statistics, University life on September 29, 2009 by xi'an

Yesterday, I started my standard master course on Bayesian analysis, based on my book The Bayesian Choice. Here are the slides (in English)

They definitely need to be upgraded into Beamer and enlarged to some extent (the test part is separate) but not yet this year, I am afraid! The course is completed in about six weeks, implying the students have to work real hard between classes, and to ensure this, they must hand back solved problems from the chapter(s) covered in the previous class at the beginning of the next class.

## Happy B’day, ‘Og!

Posted in Books, Kids, Linux, Mountains, Running, Statistics, Travel, University life, Wines on September 28, 2009 by xi'an

This blog started a year ago with the definitely poor prediction that “I will most likely not update it fairly often”… I have been spending an increasing amount of time on the Og, in reply to an increasing number of connections, and I am not sure this is an efficient strategy, but I do like this medium as a way to immediately discuss the papers and tech’ reports I have read. (I wish journals like Bayesian Analysis would allow for a bloggin’ section, even though I understand this is a severe commitment because of moderation issues.) As pointed out by Andrew Gelman, bloggin’ is dangerously addictive! Here is, by popular demand, the top of the hits over the past year:

| Title | Views |
|---|---|
| Of black swans and bleak prospects | 898 |
| Bayes’ Theorem | 595 |
| ABC in Paris, June 26 | 383 |
| Sequential Monte Carlo without likelihood | 279 |
| Reference prior for logistic regression | 257 |
| Bayesian p-values | 229 |
| Anomalies in the Iranian election | 212 |

Interestingly, and reassuringly!, besides the first two entries, which address highly popular topics, the other hits are about statistical topics! And the list includes Bayes’ Theorem, ABC and SMC.

## New typos in Monte Carlo Statistical Methods

Posted in Books, Statistics on September 28, 2009 by xi'an

Three weeks ago, I got this email from Amir Alipour, an Iranian student, about typos in Monte Carlo Statistical Methods:

“I found some typos in the book which were not reported at your website. I list them below, I would appreciate if you let me know if I’m right.
1. Page 4, line 9, $(\theta_1,\ldots,\theta_n,p_1,\ldots,p_n)$, the index should not be $k$ instead of $n$?
2. Page 4, example 1.3, last line, $n>q$, should be $n\ge q$ (as we have $x_0$).
3. Page 5, the likelihood of MA(q), it seems $\sigma^{-(n+q)}$ should change to $\sigma^{-(n+q+1)}$.
4. Page 8, formula (1.10). The gradient symbol $\nabla$ is used for the first time without introducing, while it is used for the second time on page 19 with introducing.
5. Page 8, Example 1.6, the log part in $\psi(\theta)$, should change to $\log(-1/(2\theta_2))$.
6. Page 10, in modified Bessel function, $z$ should change to $t$.
7. Page 10, Example 1.9, in the likelihood function, the power of $\sigma$ should be $-n$, and the power of the function under product should be $-\frac{p+1}{2}$. (Even Figure 1.1 is not consistent with likelihood)”

and I have posted those new typos on the associated webpage. Amir Alipour has thus managed to find seven as yet undiscovered typos in the first ten pages of the book! I am quite grateful to Amir Alipour for signaling those typos. Especially the final one, which is due to an intended presentation of the $t$ density as a polynomial, with a poor wording: the likelihood of a $t$ sample is proportional to a power of a polynomial in the location parameter. (And there still is a typo since $\sigma^{n(p+1)/2}$ should be $\sigma^{2n/(p+1)}$…) Now I can only hope Amir Alipour can proceed through the whole book with the same amount of dedication!
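To spell out the polynomial representation behind typo 7, here is a short worked identity. The notation is an assumption chosen for illustration and may differ from the book's: a sample $x_1,\ldots,x_n$ from a $t$ distribution with $p$ degrees of freedom, location $\theta$ and scale $\sigma$.

```latex
L(\theta,\sigma \mid x_1,\ldots,x_n)
  \propto \sigma^{-n} \prod_{i=1}^{n}
          \left[ 1 + \frac{(x_i-\theta)^2}{p\,\sigma^2} \right]^{-\frac{p+1}{2}}
  = \left\{ \sigma^{\frac{2n}{p+1}} \prod_{i=1}^{n}
          \left[ 1 + \frac{(x_i-\theta)^2}{p\,\sigma^2} \right] \right\}^{-\frac{p+1}{2}}
```

The product inside the braces is a polynomial of degree $2n$ in $\theta$, and raising $\sigma^{2n/(p+1)}$ to the power $-(p+1)/2$ recovers the factor $\sigma^{-n}$, which is why the exponent inside the braces has to be $2n/(p+1)$.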

## Spook Country

Posted in Books, Travel on September 27, 2009 by xi'an

“Say,” he said to Brown who was looking at his phone as if he wished he knew a way to torture it, “this NSA data-mining thing…”

When I attended MaxEnt 2009 in Oxford, I bought William Gibson’s Spook Country at the university bookstore as it was on sale for \$5… I read it during the past week, finishing it this morning on the 5:30 bus to Helsinki airport, and I am quite disappointed. (Incidentally, I visited the Akateeminen Kirjakauppa bookstore in Helsinki yesterday and found there an incredibly well-stocked fantasy section, in English, that beats by far the major chains in England!) I love Gibson’s early cyber-punk books and I can still remember the excitement of reading Neuromancer for the first time, while I was completing my thesis.

“We have been buying into data mining at Blue Ant.”

The style was very innovative, sharp and tense, with this then-novel use of existing brands to shorten the descriptions, and the story was gripping, with insights into what would become cyberspace. Even the later Virtual Light had fascinating findings, like its delivery cyclist and its recycling of the Golden Gate Bridge into a squatter community.

“What does Chombo…do?” “It implements finite difference methods for the solution of partial differential equations, on block structured, adaptively refined rectangular grids.”

In my opinion, Spook Country over-exploits the same stylistic lines as those earlier books, with very short chapters, an abundance of brands (Apple at the forefront!), a central role for technology and virtual reality, and three interwoven character threads. However, the story does not click. There are too many improbable coincidences and the characters are definitely caricatures, while reminding me of the previous books: the female artist drawn into investigation for lack of money, the geek computer genius, the voodoo-inspired ninja-like thug, the tough CIA spook, the media executive with unlimited wealth… Without going into spoilers, the plot is fairly thin, with those three different groups chasing after the same container, and obviously ending up together. The technological inventiveness of the previous novels has disappeared as well (the above quote about Chombo is taken verbatim from the Berkeley Lab website!), which may explain why William Gibson does not intend to continue writing sci’-fi’ novels.

## Runnin’ no more?!

Posted in Running on September 26, 2009 by xi'an

Even though I was beginning to feel the benefits of my summer training during the split sessions I started four weeks ago, my knee tendinitis struck again during the first series of 1000 meters at 3:30… Following the advice of my fellow Insee Paris Club runners, I then stopped the training and gave up the idea of running Argentan this year. I also checked with a sports doctor to understand this recurring tendinitis and he diagnosed a sciatica and a lower spine disc protrusion, confirmed by a scan… Although it seems to be very common among runners, this is a major bummer, obviously, because I do not know how long this is going to take to heal and whether or not I will be able to do long distance runs any more.

## PhD thesis proposals

Posted in Statistics, University life on September 25, 2009 by xi'an

I just received two emails with PhD thesis proposals, in Copenhagen

The Informatics and Mathematical Modeling Department of the Technical University of Denmark in Copenhagen funds every year 5 to 10 PhD students. Several PhD subjects are currently available in the following areas: Spatial statistics, Bayesian modeling, Computational statistical methods, R packages development, Statistical genetics and molecular ecology, Environmental statistics, Statistical analysis of microarray data.

Salary: varies between 4100 and 5300 euros depending on experience. (net salary = gross salary minus 17% pension contribution minus 33-46% taxes).

Deadline: October 14 2009. Students interested by one of these topics should contact asap Gilles Guillot (gigu ã imm.dtu.dk) with a detailed cv (with grades obtained in the last 2-3 years) as soon as possible.

and in Avignon

A PhD position is available for candidates interested in building models and statistical methods for biology. The candidate will be funded by the French National Institute for Agricultural Research (INRA), he will be hosted in the research unit Biostatistics and Spatial Processes at Avignon and associated to the new ANR funded project EMILE concerning inference methods and softwares for evolution. The position is for three years and the student will defend his PhD thesis at the University of Montpellier II or Avignon. The application is open to students from any country with a master degree in statistics, applied mathematics, biomathematics or with a master in population biology or epidemiology and a strong interest for modelling and statistics. Speaking languages required: French or English.
Starting Date: 1 November 2009 or as soon as possible thereafter (until the beginning of January 2010).
Salary: 16,800 euros per year. In addition, the student will be covered by the French health care system.
A letter of application, a CV detailing the skills in statistics, mathematics or biology, and the report of the master internship if applicable should be sent to E. Klein (Etienne.Klein ã avignon.inra.fr) and S.Soubeyrand (Samuel.Soubeyrand ã avignon.inra.fr) as soon as possible.

Of course, salary-wise, the position at INRA does not stand comparison! But then it is in Avignon, Provence! And more seriously, the topic is quite focussed on the quickly emerging ABC methodology, in connection with our ANR grant EMILE.