Archive for Bayesian data analysis

big Bayes stories

Posted in Books, Statistics, University life on July 29, 2013 by xi'an

(The following is our preface to the incoming “Big Bayes stories” special issue of Statistical Science, edited by Sharon McGrayne, Kerrie Mengersen and myself.)

Bayesian statistics is now endemic in many areas of scientific, business and social research. Founded a quarter of a millennium ago, the enabling theory, models and computational tools have expanded exponentially in the past thirty years. So what is it that makes this approach so popular in practice? Now that Bayesian statistics has “grown up”, what has it got to show for itself? In particular, what real-life problems has it really solved? A number of events motivated us to ask these questions: a conference in honour of Adrian Smith, one of the founders of modern Bayesian Statistics, which showcased a range of research emanating from his seminal work in the field, and the impressive book by Sharon McGrayne, the theory that would not die. At a café in Paris in 2011, we conceived the idea of gathering a similar collection of “Big Bayes stories” that would demonstrate the appeal of adopting a Bayesian modelling approach in practice. That is, we wanted to collect real cases in which a Bayesian approach had made a significant difference, either in addressing problems that could not be analysed otherwise, or in generating a new or deeper understanding of the data and the associated real-life problem.

After submitting this proposal to Jon Wellner, editor of Statistical Science, and obtaining his encouragement and support, we made a call for proposals. We received around 30 submissions (for which authors are to be warmly thanked!) and after a regular review process by both Bayesian and non-Bayesian referees (who are also deeply thanked), we ended up with 17 papers that reflected the type of stories we had hoped to hear. Sharon McGrayne then read each paper with the utmost attention and provided helpful and encouraging comments on all. Sharon became part of the editorial team in acknowledgement of this substantial editing contribution, which has made the stories much more enjoyable. In addition, referees who handled several submissions were asked to contribute discussions about the stories and some of them managed to find additional time for this task, providing yet another perspective on the stories.

Bayesian Estimation of Population-Level Trends in Measures of Health Status, by Mariel M. Finucane, Christopher J. Paciorek, Goodarz Danaei, and Majid Ezzati
Galaxy Formation: Bayesian History Matching for the Observable Universe, by Ian Vernon, Michael Goldstein, and Richard G Bower
Estimating the Distribution of Dietary Consumption Patterns, by Raymond James Carroll
Bayesian Population Projections for the United Nations, by Adrian E. Raftery, Leontine Alkema, and Patrick Gerland
From Science to Management: Using Bayesian Networks to Learn about Lyngbya, by Sandra Johnson, Eva Abal, Kathleen Ahern, and Grant Hamilton
Search for the Wreckage of Air France Flight AF 447, by Lawrence D Stone, Colleen M. Keller, Thomas M Kratzke, and Johan P Strumpfer
Finding the most distant quasars using Bayesian selection methods, by Daniel Mortlock
Estimation of HIV burden through Bayesian evidence synthesis, by Daniela De Angelis, Anne M Presanis, Stefano Conti, and A E Ades
Experiences in Bayesian Inference in Baltic Salmon Management, by Sakari Kuikka, Jarno Vanhatalo, Henni Pulkkinen, Samu Mäntyniemi, and Jukka Corander

As can be gathered from the table of contents, the spectrum of applications ranges across astronomy, epidemiology, ecology and demography, with the special case of the Air France wreckage story also reported in the paperback edition of the theory that would not die. What made those cases so well suited for a Bayesian solution? In some situations, the prior or the expert opinion was crucial; in others, the complexity of the data model called for a hierarchical decomposition naturally provided in a Bayesian framework; and others involved many actors, perspectives and data sources that only Bayesian networks could aggregate.

Now, before or (better) after reading those stories, one may wonder whether or not the “plus” brought by the Bayesian paradigm was truly significant. We think it was, at one level or another of the statistical analysis, while we acknowledge that in several cases other statistical perspectives or even other disciplines could have provided another solution, but presumably at a higher cost. We think this collection of papers constitutes a worthy tribute to the maturity of the Bayesian paradigm, appropriate for commemorating the 250th anniversary of the publication of Bayes’ Essay towards solving a Problem in the Doctrine of Chances. We thus hope you will enjoy those stories, whether or not Bayesiana is your statistical republic.

the signal and the noise

Posted in Books, Statistics on February 27, 2013 by xi'an

It took me a while to get Nate Silver’s the signal and the noise: why so many predictions fail – but some don’t (hereafter s&n) and another while to read it (blame A Memory of Light!).

“Bayes and Price are telling Hume, don’t blame nature because you are too daft to understand it.” s&n, p.242

I find s&n highly interesting and it is rather refreshing to see the Bayesian approach so passionately promoted by a former poker player, as betting and Dutch book arguments have often been used as arguments in favour of this approach. While this works well for some illustrations in the book, like poker and the stock market, as well as political polls and sports, I prefer more decision-theoretic motivations for topics like weather prediction, sudden epidemics, global warming or terrorism. Of course, this passionate aspect makes s&n open to criticisms, like this one by Marcus and Davis in The New Yorker about seeing everything through a Bayesian lens. The chapter on Bayes and Bayes’ theorem (Chapter 8) is a wee caricaturesque in this regard. Indeed, Silver sees too much in Bayes’ Essay, to the point of mistakenly attributing to Bayes a discussion of Hume’s sunrise problem. (The only remark is made in the Appendix, which was written by Price, as possibly was the whole of the Essay! P.S. Laplace is the one who applied Bayesian reasoning to the problem, leading to Laplace’s succession rule.) The criticisms of frequentism are also slightly over-the-levee: they are mostly directed at inadequate models that a Bayesian analysis would similarly process in the wrong way. (Some critics argue, on the contrary, that Bayesian analysis is too dependent on the model being “right”! Or on the availability of a fully-specified model.) Seeing frequentism as restricted to “collecting data among just a sample of the population rather than the whole population” (p.252) is certainly not presenting a broad coverage of frequentism.
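For reference, Laplace’s succession rule follows from a uniform prior on the unknown success probability: after s successes in n trials, the predictive probability of one more success is (s+1)/(n+2). Here is a minimal sketch in Python (the function names are mine, nothing here comes from s&n) that checks the closed form against the ratio of Beta integrals it derives from:

```python
from fractions import Fraction
from math import factorial


def beta_integral(a: int, b: int) -> Fraction:
    """Exact integral of p**a * (1 - p)**b over [0, 1], for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def succession_rule(successes: int, trials: int) -> Fraction:
    """Closed form (s + 1) / (n + 2) under a uniform prior on the success probability."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


def predictive_by_integration(successes: int, trials: int) -> Fraction:
    """Predictive probability of one more success, as a ratio of Beta integrals."""
    failures = trials - successes
    # Numerator: likelihood times one extra success; denominator: likelihood alone.
    return beta_integral(successes + 1, failures) / beta_integral(successes, failures)
```

With ten sunrises out of ten mornings, `succession_rule(10, 10)` returns `Fraction(11, 12)`, and the two functions agree for any admissible pair of counts.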

“Prediction serves a very central role in hypothesis testing, for instance, and therefore in all of science.” s&n, p.230

The book is written in a fairly enjoyable style, highly personal (no harm in that) and, apart from superlativising (!) everyone making a relevant appearance (which seems the highest common denominator of all those pop’sci’ books I end up reviewing so very often! Maybe this is something like Rule #1 in Scientific Writing 101 courses: “make the scientists sound real, turn’em into real people”), I find it rather well-organised as it brings the reader from facts (prediction usually does poorly) to the possibility of higher quality prediction (by acknowledging prior information, accepting uncertainty, using all items of information available, further accepting uncertainty, &tc.). I am not sure the reader is the wiser by the end of the book on how one should improve one’s prediction tools, but there is at least a warning about the low quality of most predictions and predictive tools that should linger in the reader’s ears… I enjoyed very much the chapter on chess, esp. the core about Kasparov’s misreading the computer’s reasons for a poor move (no further spoiler!), although I felt it was not much connected to the rest of the book.

In his review, Larry Wasserman argues that the defence Silver makes of his procedure is more frequentist than Bayesian, because he uses calibration and long-term performance. Well… Having good calibration properties does not mean the procedure is not Bayesian or frequentist, simply that it is making efficient use of the available information. Anyway, I agree (!) with Larry on the point that Silver somehow “confuses ‘Bayesian inference’ with ‘using Bayes’ theorem’”. Or puts too much meaning in the use of Bayes’ theorem, not unlike the editors of Science & Vie a few months ago. To push Larry’s controversial statement a wee further, I would even wonder whether the book has anything to do with inference. Indeed, in the end, I find s&n rather uninformative about statistical modelling and even more (or less!) about model checking. The only “statistical” model that is truly discussed over the book is the power law distribution, applied to earthquakes and terrorist attack fatalities. This is not a helpful model in that (a) it does not explain anything, as it does not make use of covariates or side information, and (b) it has no predictive power, especially in the tails. On the first point, concluding that Israel’s approach to counter-terrorism is successful because it “is the only country that has been able to bend” the power-law curve (p.442) sounds rather hasty. I’d like to see the same picture for Iraq, say. Actually, I found one in this arXiv paper. And it looks about the same for Afghanistan (Fig.4). On the second point, the modelling is poor in handling extreme values (which are the ones of interest in both cases) and cannot face change-points or lack of stationarity, an issue not sufficiently covered in s&n in my opinion. The difficulty with modelling volatile concepts like the stock market, the next presidential election or the moves of your poker opponents is that there is no physical, immutable law at play. Things can change from one instant to the next. Unpredictably. Esp. in the tails.

There are plenty of graphs in s&n, which is great, but not all of them are at the Tufte quality level. For instance, Figure 11-1 about the “average time U.S. common stock was held” contains six pie charts corresponding to six decades, each with the average time and a percentage which could be how long a stock was held compared with the 1950s. The graph is not mentioned in the text. (I will not mention Figure 8-2!) I also spotted a minuscule typo (‘probabalistic’) on Figure 10-2A.

Maybe one last and highly personal remark about the chapter on poker (feel free to skip!): while I am a very poor card player, I do not mind playing cards (and losing) with my kids. However, I simply do not understand the rationale of playing poker. If there is no money at stake, the game does not seem to make sense since every player can keep bluffing until the end of time. And if there is money at stake, I find the whole notion unethical. This is a zero-sum game, so money comes from someone else’s pocket (or more likely someone else’s retirement plan or someone else’s kids’ college savings plan). Not much difference with the way the stock market behaves nowadays… (Incidentally, this chapter did not discuss at all the performances of computer poker programs, unexpectedly, as the number of possibilities is very small and they should thus be fairly efficient.)

the BUGS Book [guest post]

Posted in Books, R, Statistics on February 25, 2013 by xi'an

(My colleague Jean-Louis Foulley, now at I3M, Montpellier, kindly agreed to write a review on the BUGS book for CHANCE. Here is the review, en avant-première! Watch out, it is fairly long and exhaustive! References will be available in the published version. The additions of book covers with BUGS in the title and of the corresponding Amazon links are mine!)

If a book has ever been so much desired in the world of statistics, it is for sure this one. Many people have been expecting it for more than 20 years ever since the WinBUGS software has been in use. Therefore, the tens of thousands of users of WinBUGS are indebted to the leading team of the BUGS project (D Lunn, C Jackson, N Best, A Thomas and D Spiegelhalter) for having eventually succeeded in finalizing the writing of this book and for making sure that the long-held expectations are not dashed.

As well explained in the Preface, the BUGS project initiated at Cambridge was a very ambitious one and at the forefront of the MCMC movement that revolutionized the development of Bayesian statistics in the early 90’s after the pioneering publication of Gelfand and Smith on Gibbs sampling.

This book comes out after several textbooks have already been published in the area of computational Bayesian statistics using BUGS and/or R (Gelman and Hill, 2007; Marin and Robert, 2007; Ntzoufras, 2009; Congdon, 2003, 2005, 2006, 2010; Kéry, 2010; Kéry and Schaub, 2011; and others). It is neither a theoretical book on foundations of Bayesian statistics (e.g., Bernardo and Smith, 1994; Robert, 2001) nor an academic textbook on Bayesian inference (Gelman et al., 2004; Carlin and Louis, 2008). Instead, it reflects very well the aims and spirit of the BUGS project and is meant to be a manual “for anyone who would like to apply Bayesian methods to real-world problems”.

In spite of its appearance, the book is not elementary. On the contrary, it addresses most of the critical issues faced by statisticians who want to apply Bayesian statistics in a clever and autonomous manner. Although very dense, its typical fluid British style of exposition based on real examples and simple arguments helps the reader to digest without too much pain such ingredients as regression and hierarchical models, model checking and comparison and all kinds of more sophisticated modelling approaches (spatial, mixture, time series, non linear with differential equations, non parametric, etc…).

The book consists of twelve chapters and three appendices specifically devoted to BUGS (A: syntax; B: functions and C: distributions) which are very helpful for practitioners. The book is illustrated with numerous examples. The exercises are well presented and explained, and the corresponding code is made available on a web site.

on using the data twice…

Posted in Books, Statistics, University life on January 13, 2012 by xi'an

As I was writing my next column for CHANCE, I decided to include a methodology box about “using the data twice”. Here is the draft. (The second part is reproduced verbatim from an earlier post on Error and Inference.)

Several aspects of the books covered in this CHANCE review [i.e., Bayesian ideas and data analysis, and Bayesian modeling using WinBUGS] face the problem of “using the data twice”. What does that mean? Nothing really precise, actually. The accusation of “using the data twice” found in the Bayesian literature can be thrown at most procedures exploiting the Bayesian machinery without actually being Bayesian, i.e., which cannot be derived from the posterior distribution. For instance, the integrated likelihood approach in Murray Aitkin’s Statistical Inference avoids the difficulties related to improper priors πᵢ by first using the data x to construct (proper) posteriors πᵢ(θᵢ|x) and then secondly using the data in a Bayes factor

\int_{\Theta_1}f_1(x|\theta_1) \pi_1(\theta_1|x)\,\text{d}\theta_1\bigg/ \int_{\Theta_2}f_2(x|\theta_2)\pi_2(\theta_2|x)\,\text{d}\theta_2

as if the posteriors were priors. This obviously solves the impropriety difficulty (see, e.g., The Bayesian Choice), but it creates a statistical procedure outside the Bayesian domain, hence requiring a separate validation since the usual properties of Bayesian procedures do not apply. Similarly, the whole empirical Bayes approach falls under this category, even though some empirical Bayes procedures are asymptotically convergent. The pseudo-marginal likelihood of Geisser and Eddy (1979), used in Bayesian ideas and data analysis, is defined by

\hat m(x) = \prod_{i=1}^n f_i(x_i|x_{-i})

through the marginal posterior likelihoods. While it also allows for improper priors, it does use the same data in each term of the product and, again, it is not a Bayesian procedure.
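To make the contrast concrete, here is a minimal sketch in Python under a toy conjugate model of my own choosing, not one taken from either book: observations xᵢ ~ N(θ, σ²) with σ² known and a normal prior on θ with mean `prior_mean` and precision `prior_precision`, where a precision of zero stands for the improper flat prior. All function names are illustrative. The genuine marginal likelihood conditions each observation on the earlier ones only, while the pseudo-marginal conditions each observation on all the others, so every observation is recycled n−1 times as conditioning data.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_marginal(xs, sigma2, prior_mean, prior_precision):
    """Genuine log marginal likelihood: each x is predicted from earlier data only."""
    if prior_precision <= 0.0:
        raise ValueError("the marginal likelihood is undefined under an improper prior")
    mean, precision = prior_mean, prior_precision
    total = 0.0
    for x in xs:
        # One-step-ahead predictive: N(current mean, sigma2 + current variance of theta).
        total += normal_logpdf(x, mean, sigma2 + 1.0 / precision)
        # Conjugate update of the distribution of theta with this single observation.
        new_precision = precision + 1.0 / sigma2
        mean = (precision * mean + x / sigma2) / new_precision
        precision = new_precision
    return total


def log_pseudo_marginal(xs, sigma2=1.0, prior_mean=0.0, prior_precision=0.0):
    """Log of the product over i of f_i(x_i | x_{-i}); usable with a flat prior."""
    n = len(xs)
    if n < 2 and prior_precision <= 0.0:
        raise ValueError("a flat prior needs at least two observations")
    data_sum = sum(xs)
    total = 0.0
    for x_i in xs:
        # Posterior of theta given all observations except x_i.
        precision = prior_precision + (n - 1) / sigma2
        mean = (prior_precision * prior_mean + (data_sum - x_i) / sigma2) / precision
        total += normal_logpdf(x_i, mean, sigma2 + 1.0 / precision)
    return total
```

Under the flat prior, `log_marginal` refuses to return a value while `log_pseudo_marginal` happily does, which is exactly the convenience (and the departure from the Bayesian paradigm) discussed above.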

Once again, from first principles, a Bayesian approach should use the data only once, namely when constructing the posterior distribution on every unknown component of the model(s). Based on this all-encompassing posterior, all inferential aspects should be the consequences of a sequence of decision-theoretic steps in order to select optimal procedures. This is the ideal setting while, in practice, relying on a sequence of posterior distributions is often necessary, each posterior being a consequence of earlier decisions, which makes it the result of a multiple (improper) use of the data… For instance, the process of Bayesian variable selection is in principle clean from the sin of “using the data twice”: one simply computes the posterior probability of each of the variable subsets and that is it. However, in a case involving many (many) variables, there are two difficulties: one is about building the prior distributions for all possible models, a task that needs to be automatised to some extent; another is about exploring the set of potential models. First, resorting to projection priors as in the intrinsic solution of Pérez and Berger (2002, Biometrika, a most valuable article!), while unavoidable and a “least worst” solution, means switching priors/posteriors based on earlier acceptances/rejections, i.e. on the data. Second, the path of models truly explored by a computational algorithm [which will be a minuscule subset of the set of all models] will depend on the models rejected so far, either when relying on a stepwise exploration or when using a random walk MCMC algorithm. Although this is not crystal clear (there is actually plenty of room for supporting the opposite view!), it could be argued that the data is thus used several times in this process…
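As an illustration of the “clean” version of variable selection, here is a minimal sketch in Python for the idealised case where all 2^p subsets can be enumerated. It assumes a user-supplied function `log_marginal` (hypothetical, to be derived from whatever model and priors are in use) that returns the log marginal likelihood of a subset, together with a uniform prior over subsets, which is my choice for the example rather than a recommendation:

```python
import math
from itertools import combinations


def all_subsets(variables):
    """Yield every subset of the candidate variables, from the empty model up."""
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            yield frozenset(subset)


def posterior_model_probabilities(log_marginal, variables):
    """Posterior probability of each subset, the data entering once via log_marginal."""
    models = list(all_subsets(variables))
    # Uniform prior over the 2^p models (an assumption made for this sketch).
    log_prior = -len(variables) * math.log(2.0)
    log_post = [log_marginal(model) + log_prior for model in models]
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    shift = max(log_post)
    weights = [math.exp(value - shift) for value in log_post]
    total = sum(weights)
    return {model: weight / total for model, weight in zip(models, weights)}
```

The loop runs over 2^p models, which is why this direct computation stops being an option with many (many) variables and why the exploration path then starts depending on the data.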

Another handbook chapter

Posted in Books, Statistics on February 11, 2010 by xi'an

As I have received over the past semester half a dozen requests for contributing chapters in different handbooks, I wrote several rather similar introductions to Bayesian statistics and/or to computational statistics. Here is one for a Handbook of Statistical Systems Biology edited by D. Balding, M. Stumpf, and M. Girolami, to be published by Wiley. It is mostly inspired by the second chapter of Bayesian Core so it is not particularly novel. If I find some extra time within the coming months, I will also include a section on nonparametric Bayes… Before that, I also have to write a revised edition of my chapter Bayesian Computational Methods in the Handbook of Computational Statistics (selling at an outrageous price, like most handbooks!), edited by J. Gentle, W. Härdle and Y. Mori.