Archive for Statistical Science

back from down under

Posted in Books, pictures, R, Statistics, Travel, University life on August 30, 2012 by xi'an

After a sunny weekend to unpack and unwind, I am now back to my normal schedule, on my way to Paris-Dauphine for an R (second-chance) exam. Except for confusing my turn signal with my wiper (thanks to two weeks of intensive driving in four Australian states!), things are thus back to “normal”, meaning that I have enough control of my time to handle both daily chores like the R exam and long-term projects. Including the special issues of Statistical Science, TOMACS, and CHANCE (reviewing all books of George Casella in memoriam). And the organisation of MCMSki 4, definitely taking place in Chamonix on January 6-8, 2014, hopefully under the sponsorship of the newly created BayesComp section of ISBA. And enough broadband to check my usual sites and to blog ad nauseam.

This trip to Australia, along with the AMSI Lectures as well as the longer visits to Monash and QUT, has been quite an exciting time, with many people met and ideas discussed. I came back with a (highly positive) impression of Australian universities as very active places, in line with my impression of Australia being a very dynamic and thriving country, far far away from the European recession. I was particularly impressed by the number of students within Kerrie Mengersen’s BRAG group, as we held discussions in classrooms that felt as full as a regular undergrad class! Those discussions and meetings set me towards a few new projects along the themes of mixture estimation and model choice, as well as convergence assessment. During this trip, however, I also felt the lack of the long “free times” I have gotten used to, thanks to the IUF chair support, where I can pursue a given problem for a few hours without interruption. This means that I did not work as much as I wanted to during this tour and will certainly avoid such multiple-stop trips in the near future. Nonetheless, overall, the “down under” experience was quite worth it! (Even without considering the two weeks of vacation I squeezed in the middle.)

Back to “normal” also means I already had two long delays caused by suicides on my train line…

ASC 2012 (#3, also available by mind-reading)

Posted in Running, Statistics, University life on July 13, 2012 by xi'an

This final morning at the ASC 2012 conference in Adelaide, I attended a keynote lecture by Sophia Rabe-Hesketh on GLMs that I particularly appreciated, as I am quite fond of those polymorphous and highly adaptable models (witness the rich variety of applications at the INLA conference in Trondheim last month). I then gave my talk on ABC model choice, trying to cover the three episodes in the series within the allocated 40 minutes (and learned from Terry Speed the trivia that Renfrey Potts, father of the Potts model, spent most of his life in Adelaide, where he died in 2005! Terry added that Potts was a dedicated marathon runner who used to run along the Torrens river. This makes Adelaide the death place of both R.A. Fisher and R. Potts.)

Later in the morning, Christl Donnelly gave a fascinating talk on her experiences with government bodies during the BSE and foot-and-mouth epidemics in Britain in the past decades. It was followed by a frankly puzzling [OZCOTS keynote] talk delivered by Jessica Utts on the issue of parapsychology tests, i.e. the analysis of experiments testing for “psychic powers”. Nothing less. Actually, I first thought this was a pedagogical trick to capture the attention of students before debunking the claims, but Utts’ focus on exhibiting such “powers” was definitely dead serious and she concluded that “psychic functioning appears to be a real effect”. So it came as a shock that she truly believes in psychic paranormal abilities! I had been under the wrong impression that her 2005 Statistical Science paper demonstrated the opposite, but it clearly belongs to the tradition of controversial Statistical Science papers that started with the Bible code paper… I also found it flabbergasting to learn that the U.S. Army is/was funding research in this area and is/was actually employing “psychics”, as well as that the University of Edinburgh has a parapsychology unit within its department of psychology. (But, after all, UK universities have also long had schools of Divinity, so they let the irrational in a while ago!)

A Tribute to Charles Stein

Posted in Statistics, University life on March 28, 2012 by xi'an

Statistical Science just ran a special issue (Feb. 2012) as a tribute to Charles Stein, focused on shrinkage estimation. Shrinkage and the Stein effect have been my entries to the (wonderful) Bayesian world, so I read through this series of papers edited by Ed George and Bill Strawderman with fond remembrance. The more because most of the authors are good friends! Jim Berger, Bill Jefferys, and Peter Müller consider shrinkage estimation for wavelet coefficients and apply it to Cepheid variable stars. The paper by Ann Brandwein and Bill Strawderman is a survey of shrinkage estimation and the Stein effect for spherically symmetric distributions, precisely my PhD thesis topic and main result! Larry Brown and Linda Shao give a geometric interpretation of the original Stein (1956) paper. Tony Cai discusses the concepts of minimaxity and shrinkage estimators in functional spaces. George Casella and Juinn Gene Hwang recall the impact of shrinkage estimation on confidence sets. Dominique Fourdrinier and Marty Wells give an expository development of loss estimation using shrinkage estimators. Ed George, Feng Liang and Xinyi Xu recall how shrinkage estimation was recently extended to prediction using Kullback-Leibler losses. Carl Morris and Martin Lysy detail the reversed shrinkage defect and Model-II minimaxity in the normal case. Gauri Datta and Malay Ghosh explain how shrinkage estimators are paramount in small area estimation, providing a synthesis between the Bayesian and the frequentist points of view. At last, Michael Perlman and Sanjay Chaudhuri reflect on the reversed shrinkage effect, providing us with several pages of Star Trek dialogues on this issue, and more seriously voicing a valid Bayesian reservation!

Improving convergence properties of the Data Augmentation algorithm

Posted in Statistics, University life on February 7, 2012 by xi'an

Our paper Improving the Convergence Properties of the Data Augmentation Algorithm with an Application to Bayesian Mixture Modeling, written with Jim Hobert and Vivek Roy during their latest visit to Paris a few years ago (!), has now appeared on arXiv, in Statistical Science, and on Project Euclid. I already mentioned in a previous post why this is an important paper for me. (There is nothing new here, compared with this earlier post, except that the paper was re-posted on arXiv!)

Improving convergence of Data Augmentation [published]

Posted in Statistics on November 4, 2011 by xi'an

Our paper with Jim Hobert and Vivek Roy, Improving the Convergence Properties of the Data Augmentation Algorithm with an Application to Bayesian Mixture Modeling, has now appeared in Statistical Science and is available on Project Euclid. (For IMS members, at least.) Personally, this is an important paper, not only for providing an exact convergence evaluation for mixtures, not only for sharing exciting research days with my friends Jim and Vivek, but also for finalising a line of research somehow started in 1993 when Richard Tweedie visited me in Paris and when I visited him in Fort Collins… Coincidentally, my discussion of Don Fraser’s provocative Is Bayes Posterior just Quick and Dirty Confidence? also appeared in this issue of Statistical Science.

Don Fraser’s rejoinder

Posted in Books, Statistics, University life on August 24, 2011 by xi'an

“How can a discipline, central to science and to critical thinking, have two methodologies, two logics, two approaches that frequently give substantially different answers to the same problems. Any astute person from outside would say “Why don’t they put their house in order?”” Don Fraser

Following the discussions of his Statistical Science paper Is Bayes posterior just quick and dirty confidence?, by Kesar Singh and Minge Xie, Larry Wasserman (who coined the neologism Frasian for the occasion), Tong Zhang, and myself, Don Fraser has written his rejoinder to the discussion (although written in Biometrika style, it is for Statistical Science!). His conclusion that “no one argued that the use of the conditional probability lemma with an imaginary input had powers beyond confidence, supernatural powers” is difficult to escape, as I would not dream of promoting a super-Bayes jumping to the rescue of bystanders misled by evil frequentists!!! More seriously, this rejoinder makes me reflect on readings from the past years, from those on the diverse notions of probability (Jeffreys, Keynes, von Mises, and Burdzy) to those on scientific discovery (mostly Seber‘s, and the promising Error and Inference by Mayo and Spanos I just received).

Bayes factors and martingales

Posted in R, Statistics on August 11, 2011 by xi'an

A surprising paper came out in the last issue of Statistical Science, linking martingales and Bayes factors. In the historical part, the authors (Shafer, Shen, Vereshchagin and Vovk) recall that martingales were popularised by Martin-Löf, who is also influential in the theory of algorithmic randomness. A property of test martingales (i.e., non-negative martingales with expectation one) is that

\mathbb{P}(X^*_t \ge c) = \mathbb{P}(\sup_{s\le t}X_s \ge c) \le 1/c

which makes the reciprocals of their sequential maxima p-values of sorts, since \mathbb{P}(1/X^*_t \le \alpha) = \mathbb{P}(X^*_t \ge 1/\alpha) \le \alpha. I had never thought about likelihood ratios this way, but it is true that a (reciprocal) likelihood ratio

\prod_{i=1}^n \dfrac{q(x_i)}{p(x_i)}

is a martingale when the observations are distributed from p. The authors define a Bayes factor (for P) as satisfying (Section 3.2)

\int (1/B) \text{d}P \le 1

which I find hard to relate to my understanding of Bayes factors because there is neither prior nor parameter involved. I first thought there was a restriction to simple null hypotheses. However, there is a composite versus composite example (Section 8.5, Binomial probability being less than or larger than 1/2). So P would then be the marginal likelihood. In this case the test martingale is

X_t = \dfrac{P(B_{t+1}\le S_t)}{P(B_{t+1}\ge S_t)}\,, \quad B_t \sim \mathcal{B}(t,1/2)\,,\, S_t\sim \mathcal{B}(t,\theta)\,.

Simulating the martingale is straightforward; however, I do not recover the picture they obtain (Fig. 6):

theta=0.45 #example value: theta must be set before running (not fixed in the original snippet)
x=sample(0:1,10^4,rep=TRUE,prob=c(1-theta,theta)) #simulate the Bernoulli(theta) sequence
s=cumsum(x) #running number of successes S_t
#log of the martingale: log P(B_t<=S_t) - log P(B_t>=S_t), with B_t~B(t,1/2)
ma=pbinom(s,1:10^4,.5,log.p=TRUE)-pbinom(s-1,1:10^4,.5,log.p=TRUE,lower.tail=FALSE)
plot(ma,type="l")
lines(cummin(ma),lty=2) #OR lines(cummax(ma),lty=2)
lines(log(0.1)+0.9*cummin(ma),lty=2,col="steelblue") #OR cummax

When theta is not 1/2, the sequence goes down almost linearly to -infinity, but when theta is 1/2, I more often get a picture where the maximum and the minimum are reached within the first few steps.

Obviously, I have not read the paper with the attention it deserved, so there may be features I missed that could be relevant for the Bayesian analysis of the behaviour of Bayes factors. However, at this stage, I fail to see the point of the “Puzzle for Bayesians” (Section 8.6), since the conclusion that “it is legitimate to collect data until a point has been disproven but not legitimate to interpret this data as proof of an alternative hypothesis within the model” is not at odds with a Bayesian interpretation of the test outcome: when the Bayes factor favours a model, it means this model is the most likely of the two given the data, not that this model is true.
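
As a quick sanity check, the test martingale inequality quoted above can be verified by simulation on a plain likelihood-ratio martingale. Here is a minimal sketch, with the arbitrary choices p=N(0,1) and q=N(1/2,1), data generated from p, and a count of how often the running maximum of the martingale ever reaches a level lev:

#sanity check of P(sup_t X_t >= lev) <= 1/lev for the likelihood-ratio martingale
#X_t = prod_{i<=t} q(x_i)/p(x_i), with p=N(0,1) and q=N(1/2,1) picked arbitrarily
nrep=10^3 #number of replications
n=10^3    #length of each sequence
lev=20    #threshold in the inequality
hit=0
for (r in 1:nrep){
  x=rnorm(n) #data generated under p
  lma=cumsum(dnorm(x,mean=.5,log=TRUE)-dnorm(x,log=TRUE)) #log X_t, t=1,...,n
  hit=hit+(max(lma)>=log(lev)) #does the running maximum ever reach lev?
}
hit/nrep #empirical frequency, to be compared with the bound 1/lev=0.05

Up to Monte Carlo noise, the frequency should stay below 1/lev, in agreement with the inequality and with the reciprocal of the running maximum acting as a p-value of sorts.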