Archive for September, 2010

Busy Mondays

Posted in Statistics, University life with tags , , , , on September 30, 2010 by xi'an

There is now a combination of no less than three generic statistics seminars alternating on Monday afternoons in Paris:

Fortunately, Pierre Alquier has made a google calendar that summarises where and when to go!

Le Monde puzzle [38]

Posted in R, University life with tags , , , on September 30, 2010 by xi'an

Since I have resumed my R class, I will restart my resolution of Le Monde mathematical puzzles…as they make good exercises for the class. The puzzle this week is not that exciting:

Find the four non-zero different digits a,b,c,d such that abcd is equal to the sum of all two digit numbers made by picking without replacement two digits from {a,b,c,d}.

The (my) dumb solution is to proceed by enumeration

for (a in 1:9){
for (b in (1:9)[-a]){
for (c in (1:9)[-c(a,b)]){
for (d in (1:9)[-c(a,b,c)]){
 if (231*sum(c(a,b,c,d))==sum(10^(0:3)*c(a,b,c,d)))
    print(c(a,b,c,d))
}}}}

taking advantage of the fact that the sum of all two-digit numbers is (30+4-1) times the sum a+b+c+d, but there is certainly a cleverer way to solve the puzzle (even though past experience has shown that this was not always the case!)

Incoherent phyleogeographic inference [reply]

Posted in Statistics, University life with tags , , , , , , , , , on September 29, 2010 by xi'an

“Logical overlap is the norm for the complex models analyzed with ABC, so many ABC posterior model probabilities published to date are wrong.” Alan R. Templeton, PNAS, doi:10.1073/pnas.1009012107

Our letter in PNAS about Templeton’s surprising diatribe on Bayesian inference is now appeared in the early edition, along with Templeton’s reply. This reply is unfortunately missing any novelty element compared with the original paper. First, he maintains that the critcism is about ABC (which is, in case you do not know, a computational technique and not a specific statistical methodology!). Second, he insists on the inappropriate Venn diagram analogy by reproducing the basic identity

P(A\cup B\cup C) = P(A)+P(B)+P(C)-P(A\cap B)-P(B\cap C)-P(C\cap A)+P(A\cap B\cap C)

(presumably in case we had lost sight of it!) to argue that using instead

P(A)+P(B)+P(C)

is incoherent (hence rejecting Bayes factors, Bayesian model averaging and so on). I am not particularly surprised by this immutable stance, but it means that there is little point in debate when starting from such positions… Our main goal in publishing this letter was actually to stress that the earlier tribune had no statistical ground and I think we achieved this goal.

Publication from the frontier

Posted in Books, Statistics, Travel with tags , , , on September 29, 2010 by xi'an

In conjunction with the conference in San Antonio last March, I have received the book Frontiers of Statistical Decision Making and Bayesian Analysis: In Honor of James O. Berger edited by Ming-Hui Chen (University of Connecticut), Dipak K. Dey (University of Connecticut), Peter Müller (University of Texas M. D. Anderson Cancer Center), Dongchu Sun (University of Missouri- Columbia) and Keying Ye (University of Texas at San Antonio), who, incidentally, were are PhD students of Jim Berger at the time I visited Purdue University. The book has been edited in depth and so it reads very well, with contributions regrouped by chapters. Here is the table of contents:

  1. Introduction.
  2. Objective Bayesian inference with applications.
  3. Bayesian decision based estimation and predictive inference.
  4. Bayesian model selection and hypothesis tests.
  5. Bayesian computer models.
  6. Bayesian nonparametrics and semi-parametrics.
  7. Bayesian case influence and frequentist interface.
  8. Bayesian clinical trials.
  9. Bayesian methods for genomics, molecular, and systems biology.
  10. Bayesian data mining and machine learning.
  11. Bayesian inference in political and social sciences, finance, and marketing.
  12. Bayesian categorical data analysis.
  13. Bayesian geophysical, spatial, and temporal statistics.
  14. Posterior simulation and Monte Carlo methods.

whose final chapter (the only one missing Bayesian from the title!) contains our contribution with Jean-Michel Marin.

Galton & simulation

Posted in Books, R, Statistics with tags , , , , , , , , on September 28, 2010 by xi'an

Stephen Stigler has written a paper in the Journal of the Royal Statistical Society Series A on Francis Galton’s analysis of (his cousin) Charles Darwin’ Origin of Species, leading to nothing less than Bayesian analysis and accept-reject algorithms!

“On September 10th, 1885, Francis Galton ushered in a new era of Statistical Enlightenment with an address to the British Association for the Advancement of Science in Aberdeen. In the process of solving a puzzle that had lain dormant in Darwin’s Origin of Species, Galton introduced multivariate analysis and paved the way towards modern Bayesian statistics. The background to this work is recounted, including the recognition of a failed attempt by Galton in 1877 as providing the first use of a rejection sampling algorithm for the simulation of a posterior distribution, and the first appearance of a proper Bayesian analysis for the normal distribution.”

The point of interest is that Galton proposes through his (multi-stage) quincunx apparatus a way to simulate from the posterior of a normal mean (here is an R link to the original quincunx). This quincunx has a vertical screen at the second level that acts as a way to physically incorporate the likelihood (it also translates the fact that the likelihood is in another “orthogonal” space, compared  with the prior!):

“Take another look at Galton’s discarded 1877 model for natural selection (Fig. 6). It is nothing less that a workable simulation algorithm for taking a normal prior (the top level) and a normal likelihood (the natural selection vertical screen) and finding a normal posterior (the lower level, including the rescaling as a probability density with the thin front compartment of uniform thickness).”

Besides a simulation machinery (steampunk Monte Carlo?!), it also offers the enormous appeal of proposing the derivation of the normal-normal posterior for the very first time:

“Galton was not thinking in explicit Bayesian terms, of course, but mathematically he has posterior \mathcal{N}(0,C_2)\propto\mathcal{N}(0,A_2)\times f(x=0|y). This may be the earliest appearance of this calculation; the now standard derivation of a posterior distribution in a normal setting with a proper normal prior. Galton gave the general version of this result as part of his 1885 development, but the 1877 version can be seen as an algorithm employing rejection sampling that could be used for the generation of values from a posterior distribution. If we replace f(x) above by the density \mathcal{N}(a,B_2), his algorithm would generate the posterior distribution of Y given X=a, namely \mathcal{N}(aC_2/B_2, C_2). The assumption of normality is of course needed for the particular formulae here, but as an algorithm the normality is not essential; posterior values for any prior and any location parameter likelihood could in principle be generated by extending this algorithm.” Continue reading

Follow

Get every new post delivered to your Inbox.

Join 598 other followers