Archive for Bayesian data analysis

Bayesian Data Analysis [BDA3 – part #2]

Posted in Books, Kids, R, Statistics, University life on March 31, 2014 by xi'an

Here is the second part of my review of Gelman et al.’s Bayesian Data Analysis (third edition):

“When an iterative simulation algorithm is “tuned” (…) the iterations will not in general converge to the target distribution.” (p.297)

Part III covers advanced computation, obviously including MCMC but also model approximations like variational Bayes and expectation propagation (EP), with even a few words on ABC. The novelties in this part are centred on Stan, the language Andrew is developing around Hamiltonian Monte Carlo techniques, a sort of BUGS of the 10’s! (And of course on Hamiltonian Monte Carlo techniques themselves.) A few (nit)pickings: the book advises importance resampling without replacement (p.266), which makes some sense when using a poor importance function but ruins the fundamentals of importance sampling. Plus, no trace of infinite variance importance sampling? of harmonic means and their dangers? In the Metropolis-Hastings algorithm, the proposal is called the jumping rule and denoted by Jt, which, besides giving the impression of a Jacobian, seems to allow for time-varying proposals and hence time-inhomogeneous Markov chains, whose convergence properties are much hairier. (The warning comes much later, as exemplified in the above quote.) Moving from “burn-in” to “warm-up” to describe the beginning of an MCMC simulation. Being somewhat 90’s about convergence diagnoses (as shown by the references in Section 11.7), although the book also proposes new diagnoses and relies much more on effective sample sizes. Particle filters are evacuated in hardly half-a-page. Maybe because Stan does not handle particle filters. A lack of intuition about the Hamiltonian Monte Carlo algorithms, as the book plunges immediately into a two-page pseudo-code description. Still using physics vocabulary that puts me (and maybe only me) off. Although I appreciated the advice to check analytical gradients against their numerical counterparts.
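The gradient check mentioned at the end of the paragraph above can be sketched as follows. This is only an illustration in Python, not code from the book or from Stan: the toy log-density, the function names and the step size eps are all assumptions made for the example.

```python
def log_density(theta):
    """Unnormalised log-density of a toy two-parameter target (hypothetical example)."""
    x, y = theta
    return -0.5 * (x ** 2 + 4.0 * y ** 2) - 0.25 * x ** 4


def analytic_gradient(theta):
    """Hand-derived gradient of log_density, the thing we want to verify."""
    x, y = theta
    return [-x - x ** 3, -4.0 * y]


def numerical_gradient(f, theta, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad


def max_gradient_error(theta):
    """Largest absolute gap between the analytic and numerical gradients."""
    analytic = analytic_gradient(theta)
    numeric = numerical_gradient(log_density, theta)
    return max(abs(a - n) for a, n in zip(analytic, numeric))
```

Calling max_gradient_error at a few arbitrary points should return values close to rounding error; a gap of the same order as the gradient itself would point to a mistake in the hand-derived version.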

“In principle there is no limit to the number of levels of variation that can be handled in this way. Bayesian methods provide ready guidance in handling the estimation of the unknown parameters.” (p.381)

I also enjoyed reading the part about modes that stand at the boundary of the parameter space (Section 13.2), even though I do not think modes are great summaries in Bayesian frameworks and while I do not see how picking the prior to avoid modes at the boundary avoids the data impacting the prior, in fine. The variational Bayes section (13.7) is equally enjoyable, with a proper spelled-out illustration, introducing an unusual feature for Bayesian textbooks.  (Except that sampling without replacement is back!) Same comments for the Expectation Propagation (EP) section (13.8) that covers brand new notions. (Will they stand the test of time?!)

“Geometrically, if β-space is thought of as a room, the model implied by classical model selection claims that the true β has certain prior probabilities of being in the room, on the floor, on the walls, in the edge of the room, or in a corner.” (p.368)

Part IV is a series of five chapters about regression(s). This is somewhat of a classic, nonetheless  Chapter 14 surprised me with an elaborate election example that dabbles in advanced topics like causality and counterfactuals. I did not spot any reference to the g-prior or to its intuitive justifications and the chapter mentions the lasso as a regularisation technique, but without any proper definition of this “popular non-Bayesian form of regularisation” (p.368). In French: with not a single equation! Additional novelty may lie in the numerical prior information about the correlations. What is rather crucially (cruelly?) missing though is a clearer processing of variable selection in regression models. I know Andrew opposes any notion of a coefficient being exactly equal to zero, as ridiculed through the above quote, but the book does not reject model selection, so why not in this context?! Chapter 15 on hierarchical extensions stresses the link with exchangeability, once again. With another neat election example justifying the progressive complexification of the model and the cranks and toggles of model building. (I am not certain the reparameterisation advice on p.394 is easily ingested by a newcomer.) The chapters on robustness (Chap. 17) and missing data (Chap. 18) sound slightly less convincing to me, esp. the one about robustness as I never got how to make robustness agree with my Bayesian perspective. The book states “we do not have to abandon Bayesian principles to handle outliers” (p.436), but I would object that the Bayesian paradigm compels us to define an alternative model for those outliers and the way they are produced. One can always resort to a drudging exploration of which subsample of the dataset is at odds with the model but this may be unrealistic for large datasets and further tells us nothing about how to handle those datapoints. 
The missing data chapter is certainly relevant to such a comprehensive textbook and I liked the survey illustration where the missing data was in fact made of missing questions. However, I felt the multiple imputation part was not well-presented, fearing readers would not understand how to handle it…

“You can use MCMC, normal approximation, variational Bayes, expectation propagation, Stan, or any other method. But your fit must be Bayesian.” (p.517)

Part V concentrates the most advanced material, with Chapter 19 being mostly an illustration of a few complex models, slightly superfluous in my opinion, Chapter 20 a very short introduction to functional bases, including a basis selection section (20.2) that implements the “zero coefficient” variable selection principle refuted in the regression chapter(s), and does not go beyond splines (what about wavelets?), Chapter 21 a (quick) coverage of Gaussian processes with the motivating birth-date example (and two mixture datasets I used eons ago…), Chapter 22 a more (too much?) detailed study of finite mixture models, with no coverage of reversible-jump MCMC, and Chapter 23 an entry on Bayesian non-parametrics through Dirichlet processes.

“In practice, for well separated components, it is common to remain stuck in one labelling across all the samples that are collected. One could argue that the Gibbs sampler has failed in such a case.” (p.535)

To get back to mixtures, I liked the quote about the label switching issue above, as I was “one” who argued that the Gibbs sampler fails to converge! The corresponding section seems to favour providing a density estimate for mixture models, rather than component-wise evaluations, but it nonetheless mentions the relabelling by permutation approach (if missing our 2000 JASA paper). The section about inferring on the unknown number of components suggests conducting a regular Gibbs sampler on a model with an upper bound on the number of components and then checking for empty components, an idea I (briefly) considered in the mid-1990’s before the occurrence of RJMCMC. Of course, the prior on the components matters and the book suggests using a Dirichlet with fixed sum like 1 on the coefficients for all numbers of components.

“14. Objectivity and subjectivity: discuss the statement ‘People tend to believe results that support their preconceptions and disbelieve results that surprise them. Bayesian methods tend to encourage this undisciplined mode of thinking.’” (p.100)

Obviously, this being a third edition begets the question, what’s up, doc?!, i.e., what’s new [when compared with the second edition]? Quite a lot, even though I am not enough of a Gelmanian exegete to produce a comparison table. Well, for a starter, David Dunson and Aki Vehtari joined the authorship, mostly contributing to the advanced section on non-parametrics, Gaussian processes, EP algorithms. Then the Hamiltonian Monte Carlo methodology and Stan of course, which is now central to Andrew’s interests. The book does include a short Appendix on running computations in R and in Stan. Further novelties were mentioned above, like the vision of weakly informative priors taking over noninformative priors, but I think this edition of Bayesian Data Analysis puts more stress on clever and critical model construction and on the fact that it can be done in a Bayesian manner. Hence the insistence on predictive and cross-validation tools. The book may be deemed somewhat short on exercises, providing between 3 and 20 mostly well-developed problems per chapter, often associated with datasets, rather than the less exciting counter-example above. Even though Andrew disagrees and his students at ENSAE this year certainly did not complain, I personally feel a total of 220 exercises is not enough for instructors and self-study readers. (At least, this reduces the number of email requests for solutions! Esp. when 50 of those are solved on the book website.) But this aspect is a minor quibble: overall this is truly the reference book for a graduate course on Bayesian statistics and not only Bayesian data analysis.

Bayesian Data Analysis [BDA3]

Posted in Books, Kids, R, Statistics, University life on March 28, 2014 by xi'an

Andrew Gelman and his coauthors, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin, have now published the latest edition of their book Bayesian Data Analysis. David and Aki are newcomers to the authors’ list, with an extended section on non-linear and non-parametric models. I have been asked by Sam Behseta to write a review of this new edition for JASA (since Sam is now the JASA book review editor). After wondering about my ability to produce an objective review (on the one hand, this is The Competition to Bayesian Essentials!, on the other hand Andrew is a good friend spending the year with me in Paris), I decided to jump for it and write a most subjective review, with the help of Clara Grazian who was Andrew’s teaching assistant this year in Paris and maybe some of my Master students who took Andrew’s course. The second edition was reviewed in the September 2004 issue of JASA and we now stand ten years later with an even more impressive textbook. Which is truly what Bayesian data analysis should be.

This edition has five parts, Fundamentals of Bayesian Inference, Fundamentals of Bayesian Data Analysis, Advanced Computation, Regression Models, and Non-linear and Non-parametric Models, plus three appendices. For a total of xiv+662 pages. And a weight of 2.9 pounds (1395g on my kitchen scale!) that makes it hard to carry around in the metro…. I took it to Warwick (and then Nottingham and Oxford and back to Paris) instead.

“We could avoid the mathematical effort of checking the integrability of the posterior density (…) The result would clearly show the posterior contour drifting off toward infinity.” (p.111)

While I cannot go into a detailed reading of those 662 pages (!), I want to highlight a few gems. (I already wrote a detailed and critical analysis of Chapter 6 on model checking in that post.) The very first chapter provides all the necessary items for understanding Bayesian Data Analysis without getting bogged down in propaganda or pseudo-philosophy. Then the other chapters of the first part unroll in a smooth way, cruising on the B highway… With the unique feature of introducing weakly informative priors (Sections 2.9 and 5.7), like the half-Cauchy distribution on scale parameters. It may not be completely clear how weak a weakly informative prior is, but this novel notion is worth including in a textbook. Maybe a mild reproach at this stage: Chapter 5 on hierarchical models is too verbose for my taste, as it essentially focuses on the hierarchical linear model. Of course, this is an essential chapter as it links exchangeability, the “atom” of Bayesian reasoning used by de Finetti, with hierarchical models. Still. Another comment on that chapter: it broaches the topic of improper posteriors by suggesting running a Markov chain that can exhibit improperness by enjoying an improper behaviour. When it happens as in the quote above, fine!, but there is no guarantee this is always the case! For instance, improperness may be due to regions near zero rather than infinity. And a last barb: there is a dense table (Table 5.4, p.124) that seems to run contrariwise to Andrew’s avowed dislike of tables. I could also object to the idea of a “true prior distribution” (p.128), or comment on the trivia that hierarchical chapters seem to attract rats (as I also included a rat example in the hierarchical Bayes chapter of Bayesian Choice and so does the BUGS Book! Hence, a conclusion that Bayesian textbooks are better avoided by muriphobiacs…)

“Bayes factors do not work well for models that are inherently continuous (…) Because we emphasize continuous families of models rather than discrete choices, Bayes factors are rarely relevant in our approach to Bayesian statistics.” (p.183 & p.193)

Part II is about “the creative choices that are required, first to set up a Bayesian model in a complex problem, then to perform the model checking and confidence building that is typically necessary to make posterior inferences scientifically defensible” (p.139). It is certainly one of the strengths of the book that it allows for a critical look at models and tools that are rarely discussed in more theoretical Bayesian books. As detailed in my earlier post on Chapter 6, model checking is strongly advocated, via posterior predictive checks and… posterior predictive p-values, which are at best empirical indicators that something could be wrong, definitely not that everything’s all right! Chapter 7 is the model comparison equivalent of Chapter 6, starting with the predictive density (aka the evidence or the marginal likelihood), but completely bypassing the Bayes factor for information criteria like the Watanabe-Akaike or widely applicable information criterion (WAIC), and advocating cross-validation, which is empirically satisfying but formally hard to integrate within a full Bayesian perspective. Chapter 8 is about data collection, sample surveys, randomization and related topics, another entry that is missing from most Bayesian textbooks, maybe not that surprising given the research topics of some of the authors. And Chapter 9 is the symmetric in that it focuses on the post-modelling step of decision making.
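For readers who have not met WAIC before, here is a rough sketch of one common way to compute it from posterior draws, assuming the variance-based penalty: the log pointwise predictive density minus the summed posterior variances of the pointwise log-likelihoods, reported on the deviance scale. The function name and the layout of log_lik are assumptions for the example, not the book's code.

```python
import math
import statistics


def waic(log_lik):
    """Return (waic, lppd, p_waic) from a table of pointwise log-likelihoods.

    log_lik[s][i] is assumed to hold log p(y_i | theta_s) for posterior
    draw s and observation i (at least two draws are needed).
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0
    p_waic = 0.0
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # log of the posterior mean of the likelihood, computed stably
        m = max(column)
        lppd += m + math.log(sum(math.exp(v - m) for v in column) / n_draws)
        # penalty: sample variance of the log-likelihood across draws
        p_waic += statistics.variance(column)
    return -2.0 * (lppd - p_waic), lppd, p_waic
```

When the log-likelihood does not vary across draws the penalty vanishes and WAIC reduces to minus twice the log-likelihood, which gives a quick sanity check on the implementation.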

(Second part of the review to appear on Monday, leaving readers the weekend to recover!)

posterior predictive p-values

Posted in Books, Statistics, Travel, University life on February 4, 2014 by xi'an

Bayesian Data Analysis advocates in Chapter 6 using posterior predictive checks as a way of evaluating the fit of a potential model to the observed data. There is a no-nonsense feeling to it:

“If the model fits, then replicated data generated under the model should look similar to observed data. To put it another way, the observed data should look plausible under the posterior predictive distribution.”

And it aims at providing an answer to the frustrating (frustrating to me, at least) issue of Bayesian goodness-of-fit tests. There are however issues with the implementation, from deciding on which aspect of the data or of the model is to be examined, to the “use of the data twice” sin. Obviously, this is an exploratory tool with little decisional backup and it should be understood as a qualitative rather than quantitative assessment. As mentioned in my tutorial on Sunday (I wrote this post in Duke during O’Bayes 2013), it reminded me of Ratmann et al.’s ABCμ in that they both give reference distributions against which to calibrate the observed data. Most likely with a multidimensional representation. And the “use of the data twice” can be argued for or against, once a data-dependent loss function is built.

“One might worry about interpreting the significance levels of multiple tests or of tests chosen by inspection of the data (…) We do not make [a multiple test] adjustment, because we use predictive checks to see how particular aspects of the data would be expected to appear in replications. If we examine several test variables, we would not be surprised for some of them not to be fitted by the model-but if we are planning to apply the model, we might be interested in those aspects of the data that do not appear typical.”

The natural objection that having a multivariate measure of discrepancy runs into multiple testing is answered within the book with the reply that the idea is not to run formal tests. I still wonder how one should behave when faced with a vector of posterior predictive p-values (ppp).

[Figure: pospred]

The above picture is based on a normal mean/normal prior experiment I ran where the ratio prior-to-sampling variance increases from 100 to 10⁴. The ppp is based on the Bayes factor against a zero mean as a discrepancy. It thus grows away from zero very quickly and then levels off around 0.5, reaching only values close to 1 for very large values of x (i.e. never in practice). I find the graph interesting because if instead of the Bayes factor I use the marginal (numerator of the Bayes factor) then the picture is the exact opposite. Which, I presume, does not make a difference for Bayesian Data Analysis, since both extremes are considered as equally toxic… Still, still, still, we are in the same quandary as when using any kind of p-value: what is extreme? what is significant? Do we have again to select the dreaded 0.05?! To see how things are going, I then simulated the behaviour of the ppp under the “true” model for the pair (θ,x). And ended up with the histograms below:

[Figure: truepospred]

which show that under the true model the ppp does concentrate around .5 (surprisingly the range of ppp’s hardly exceeds .5 and I have no explanation for this). While the corresponding ppp does not necessarily pick any wrong model, discrepancies may be spotted by getting away from 0.5…
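A possible reconstruction of this experiment, for readers who want to play with it. The setting assumed here is x ~ N(θ,1) with θ ~ N(0,τ²), so that a replicate given x is N(τ²x/(1+τ²), 1+τ²/(1+τ²)). Since the Bayes factor in favour of a zero mean is a decreasing function of |x|, the event "the replicate's Bayes factor is at least as large as the observed one" is the same as |x_rep| ≤ |x|, which has a closed form. Whether this orientation of the discrepancy is the one behind the original pictures is an assumption, as are all the names below.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ppp(x, tau2):
    """P(|x_rep| <= |x| given x) under x ~ N(theta, 1), theta ~ N(0, tau2)."""
    shrink = tau2 / (1.0 + tau2)
    mean_rep = shrink * x              # posterior predictive mean
    sd_rep = math.sqrt(1.0 + shrink)   # posterior predictive standard deviation
    upper = normal_cdf((abs(x) - mean_rep) / sd_rep)
    lower = normal_cdf((-abs(x) - mean_rep) / sd_rep)
    return upper - lower


def simulate_ppp(tau2, n_sims=2000, seed=1):
    """Draw (theta, x) from the joint model and return the resulting ppp values."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_sims):
        theta = rng.gauss(0.0, math.sqrt(tau2))
        x = rng.gauss(theta, 1.0)
        values.append(ppp(x, tau2))
    return values
```

Under these assumptions ppp(0.0, tau2) is exactly zero and ppp(3.0, 1e4) is already very close to 0.5, which matches the qualitative description of the first picture; a histogram of simulate_ppp(100.0) can be compared with the second one.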

“The p-value is to the u-value as the posterior interval is to the confidence interval. Just as posterior intervals are not, in general, classical confidence intervals, Bayesian p-values are not generally u-values.”

Now, Bayesian Data Analysis also has this warning about ppp’s not being uniform under the true model (u-values), which is just as well considering the above example, but I cannot help wondering if the authors had intended a sort of subliminal message that they were not that far from uniform. And this brings back to the forefront the difficult interpretation of the numerical value of a ppp. That is, of its calibration. For evaluation of the fit of a model. Or for decision-making…

Gelman’s course in Paris, next term!

Posted in Books, Kids, Statistics, University life on August 2, 2013 by xi'an

Andrew Gelman will be visiting Paris-Dauphine and CREST next academic year (with support from those institutions as well as CNRS and Ville de Paris). Which is why he is learning how to pronounce Le loup est revenu. (Maybe not why, as this is not the most useful sentence in downtown Paris…) Very exciting news for all of us local Bayesians (or bayésiens). In addition, Andrew will teach from the latest edition of his book Bayesian Data Analysis, co-authored by John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin. He will actually start teaching mid-October, which means the book will not be out yet: so the students at Paris-Dauphine and ENSAE will get a true avant-première of Bayesian Data Analysis. Of course, this item of information will be sadistically tantalising to ‘Og’s readers who cannot spend the semester in Paris. For those who can, I presume there is a way to register for the course as auditeur libre at either Paris-Dauphine or ENSAE.

Note that the cover links with an earlier post of Aki on Andrew’s blog about the holiday effect. (Also mentioned earlier on the ‘Og…)

big Bayes stories

Posted in Books, Statistics, University life on July 29, 2013 by xi'an

(The following is our preface to the incoming “Big Bayes stories” special issue of Statistical Science, edited by Sharon McGrayne, Kerrie Mengersen and myself.)

Bayesian statistics is now endemic in many areas of scientific, business and social research. Founded a quarter of a millennium ago, the enabling theory, models and computational tools have expanded exponentially in the past thirty years. So what is it that makes this approach so popular in practice? Now that Bayesian statistics has “grown up”, what has it got to show for itself? In particular, what real-life problems has it really solved? A number of events motivated us to ask these questions: a conference in honour of Adrian Smith, one of the founders of modern Bayesian Statistics, which showcased a range of research emanating from his seminal work in the field, and the impressive book by Sharon McGrayne, the theory that would not die. At a café in Paris in 2011, we conceived the idea of gathering a similar collection of “Big Bayes stories”, that would demonstrate the appeal of adopting a Bayesian modelling approach in practice. That is, we wanted to collect real cases in which a Bayesian approach had made a significant difference, either in addressing problems that could not be analysed otherwise, or in generating a new or deeper understanding of the data and the associated real-life problem.

After submitting this proposal to Jon Wellner, editor of Statistical Science, and obtaining his encouragement and support, we made a call for proposals. We received around 30 submissions (for which authors are to be warmly thanked!) and after a regular review process by both Bayesian and non-Bayesian referees (who are also deeply thanked), we ended up with 17 papers that reflected the type of stories we had hoped to hear. Sharon McGrayne then read each paper with the utmost attention and provided helpful and encouraging comments on all. Sharon became part of the editorial team in acknowledgement of this substantial editing contribution, which has made the stories much more enjoyable. In addition, referees who handled several submissions were asked to contribute discussions about the stories and some of them managed to find additional time for this task, providing yet another perspective on the stories.

Bayesian Estimation of Population-Level Trends in Measures of Health Status, by Mariel M. Finucane, Christopher J. Paciorek, Goodarz Danaei, and Majid Ezzati
Galaxy Formation: Bayesian History Matching for the Observable Universe, by Ian Vernon, Michael Goldstein, and Richard G Bower
Estimating the Distribution of Dietary Consumption Patterns, by Raymond James Carroll
Bayesian Population Projections for the United Nations, by Adrian E. Raftery, Leontine Alkema, and Patrick Gerland
From Science to Management: Using Bayesian Networks to Learn about Lyngbya, by Sandra Johnson, Eva Abal, Kathleen Ahern, and Grant Hamilton
Search for the Wreckage of Air France Flight AF 447, by Lawrence D Stone, Colleen M. Keller, Thomas M Kratzke, and Johan P Strumpfer
Finding the most distant quasars using Bayesian selection methods, by Daniel Mortlock
Estimation of HIV burden through Bayesian evidence synthesis, by Daniela De Angelis, Anne M Presanis, Stefano Conti, and A E Ades
Experiences in Bayesian Inference in Baltic Salmon Management, by Sakari Kuikka, Jarno Vanhatalo, Henni Pulkkinen, Samu Mäntyniemi, and Jukka Corander

As can be gathered from the table of contents, the spectrum of applications ranges across astronomy, epidemiology, ecology and demography, with the special case of the Air France wreckage story also reported in the paperback edition of the theory that would not die. What made those cases so well suited for a Bayesian solution? In some situations, the prior or the expert opinion was crucial; in others, the complexity of the data model called for a hierarchical decomposition naturally provided in a Bayesian framework; and others involved many actors, perspectives and data sources that only Bayesian networks could aggregate.

Now, before or (better) after reading those stories, one may wonder whether or not the “plus” brought by the Bayesian paradigm was truly significant. We think it did, at one level or another of the statistical analysis, while we acknowledge that in several cases other statistical perspectives or even other disciplines could have provided another solution, but presumably at a higher cost. We think this collection of papers constitutes a worthy tribute to the maturity of the Bayesian paradigm, appropriate for commemorating the 250th anniversary of the publication of Bayes’ Essay towards solving a Problem in the Doctrine of Chances. We thus hope you will enjoy those stories, whether or not Bayesiana is your statistical republic.

the signal and the noise

Posted in Books, Statistics on February 27, 2013 by xi'an

It took me a while to get Nate Silver’s the signal and the noise: why so many predictions fail – but some don’t (hereafter s&n) and another while to read it (blame A Memory of Light!).

“Bayes and Price are telling Hume, don’t blame nature because you are too daft to understand it.” s&n, p.242

I find s&n highly interesting and it is rather refreshing to see the Bayesian approach so passionately promoted by a former poker player, as betting and Dutch book arguments have often been used as arguments in favour of this approach. While it works well for some illustrations in the book, like poker and the stock market, as well as political polls and sports, I prefer more decision theoretic motivations for topics like weather prediction, sudden epidemics, global warming or terrorism. Of course, this passionate aspect makes s&n open to criticisms, like this one by Marcus and Davis in The New Yorker about seeing everything through the Bayesian lenses. The chapter on Bayes and Bayes’ theorem (Chapter 8) is a wee caricaturesque in this regard. Indeed, Silver sees too much in Bayes’ Essay, to the point of mistakenly attributing to Bayes a discussion of Hume’s sunrise problem. (The only remark is made in the Appendix, which was written by Price, like possibly the whole of the Essay!, and P.S. Laplace is the one who applied Bayesian reasoning to the problem, leading to Laplace’s succession rule.) The criticisms of frequentism are also slightly over-the-levee: they are mostly directed at inadequate models that a Bayesian analysis would similarly process in the wrong way. (Some critics argue on the opposite that Bayesian analysis is too dependent on the model being “right”! Or on the availability of a fully-specified model.) Seeing frequentism as restricted to “collecting data among just a sample of the population rather than the whole population” (p.252) is certainly not presenting a broad coverage of frequentism.
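As a reminder of what Laplace’s succession rule states: under a uniform prior on the unknown probability of an event, after s occurrences in n trials the predictive probability of a further occurrence is (s+1)/(n+2). A tiny sketch, with a hypothetical function name:

```python
from fractions import Fraction


def rule_of_succession(successes, trials):
    """Predictive probability of one more success under a uniform prior."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)
```

For the sunrise problem, every one of the n past mornings counts as a success, so the probability of another sunrise is (n+1)/(n+2), which gets closer to one as n grows but never reaches it.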

“Prediction serves a very central role in hypothesis testing, for instance, and therefore in all of science.” s&n, p.230

The book is written in a fairly enjoyable style, highly personal (no harm with that) and apart from superlativising (!) everyone making a relevant appearance (which seems the highest common denominator of all those pop’sci’ books I end up reviewing so very often!, maybe this is something like Rule #1 in Scientific Writing 101 courses: “make the scientists sound real, turn’em into real people”), I find it rather well-organised as it brings the reader from facts (prediction usually does poorly) to the possibility of higher quality prediction (by acknowledging prior information, accepting uncertainty, using all items of information available, further accepting uncertainty, &tc.). I am not sure the reader is the wiser by the end of the book on how one should improve one’s prediction tools, but there is at least a warning about the low quality of most predictions and predictive tools that should linger in the reader’s ears… I enjoyed very much the chapter on chess, esp. the core about Kasparov’s misreading of the computer’s reasons for a poor move (no further spoiler!), although I felt it was not much connected to the rest of the book.

In his review, Larry Wasserman argues that the defence Silver makes of his procedure is more frequentist than Bayesian. Because he uses calibration and long-term performances. Well… Having good calibration properties does not mean the procedure is not Bayesian or frequentist, simply that it is making efficient use of the available information. Anyway, I agree (!) with Larry on the point that Silver somehow “confuses ‘Bayesian inference’ with ‘using Bayes’ theorem’”. Or puts too much meaning in the use of Bayes’ theorem, not unlike the editors of Science & Vie a few months ago. To push Larry’s controversial statement a wee further, I would even wonder whether the book has anything to do with inference. Indeed, in the end, I find s&n rather uninformative about statistical modelling and even more (or less!) about model checking. The only “statistical” model that is truly discussed over the book is the power law distribution, applied to earthquakes and terrorist attack fatalities. This is not a helpful model in that (a) it does not explain anything, as it does not make use of covariates or side information, and (b) it has no predictive power, especially in the tails. On the first point, concluding that Israel’s approach to counter-terrorism is successful because it “is the only country that has been able to bend” the power-law curve (p.442) sounds rather hasty. I’d like to see the same picture for Iraq, say. Actually, I found one in this arXiv paper. And it looks about the same for Afghanistan (Fig.4). On the second point, the modelling is poor in handling extreme values (which are the ones of interest in both cases) and cannot face change-points or lack of stationarity, an issue not sufficiently covered in s&n in my opinion. The difficulty with modelling volatile concepts like the stock market, the next presidential election or the moves of your poker opponents is that there is no physical, immutable, law at play. Things can change from one instant to the next. Unpredictably. Esp. in the tails.

There are plenty of graphs in s&n, which is great, but not all of them are at the Tufte quality level. For instance, Figure 11-1 about the “average time U.S. common stock was held” contains six pie charts corresponding to six decades with the average time and a percentage which could be how long compared with the 1950s a stock was held. The graph is not mentioned in the text. (I will not mention Figure 8-2!) I also spotted a minuscule typo (‘probabalistic’) on Figure 10-2A.

Maybe one last and highly personal remark about the chapter on poker (feel free to skip!): while I am a very poor card player, I do not mind playing cards (and losing) with my kids. However, I simply do not understand the rationale of playing poker. If there is no money at stake, the game does not seem to make sense since every player can keep bluffing until the end of time. And if there is money at stake, I find the whole notion unethical. This is a zero sum game, so money comes from someone else’s pocket (or more likely someone else’s retirement plan or someone else’s kids’ college savings plan). Not much difference with the way the stock market behaves nowadays… (Incidentally, this chapter did not discuss at all the performances of computer poker programs, unexpectedly, as the number of possibilities is very small and they should thus be fairly efficient.)

the BUGS Book [guest post]

Posted in Books, R, Statistics on February 25, 2013 by xi'an

(My colleague Jean-Louis Foulley, now at I3M, Montpellier, kindly agreed to write a review on the BUGS book for CHANCE. Here is the review, en avant-première! Watch out, it is fairly long and exhaustive! References will be available in the published version. The additions of book covers with BUGS in the title and of the corresponding Amazon links are mine!)

If a book has ever been so much desired in the world of statistics, it is for sure this one. Many people have been expecting it for more than 20 years ever since the WinBUGS software has been in use. Therefore, the tens of thousands of users of WinBUGS are indebted to the leading team of the BUGS project (D Lunn, C Jackson, N Best, A Thomas and D Spiegelhalter) for having eventually succeeded in finalizing the writing of this book and for making sure that the long-held expectations are not dashed.

As well explained in the Preface, the BUGS project initiated at Cambridge was a very ambitious one and at the forefront of the MCMC movement that revolutionized the development of Bayesian statistics in the early 90’s after the pioneering publication of Gelfand and Smith on Gibbs sampling.

This book comes out after several textbooks have already been published in the area of computational Bayesian statistics using BUGS and/or R (Gelman and Hill, 2007; Marin and Robert, 2007; Ntzoufras, 2009; Congdon, 2003, 2005, 2006, 2010; Kéry, 2010; Kéry and Schaub, 2011 and others). It is neither a theoretical book on foundations of Bayesian statistics (e.g. Bernardo and Smith, 1994; Robert, 2001) nor an academic textbook on Bayesian inference (Gelman et al, 2004, Carlin and Louis, 2008). Instead, it reflects very well the aims and spirit of the BUGS project and is meant to be a manual “for anyone who would like to apply Bayesian methods to real-world problems”.

In spite of its appearance, the book is not elementary. On the contrary, it addresses most of the critical issues faced by statisticians who want to apply Bayesian statistics in a clever and autonomous manner. Although very dense, its typical fluid British style of exposition based on real examples and simple arguments helps the reader to digest without too much pain such ingredients as regression and hierarchical models, model checking and comparison and all kinds of more sophisticated modelling approaches (spatial, mixture, time series, non linear with differential equations, non parametric, etc…).

The book consists of twelve chapters and three appendices specifically devoted to BUGS (A: syntax; B: functions and C: distributions) which are very helpful for practitioners. The book is illustrated with numerous examples. The exercises are well presented and explained, and the corresponding code is made available on a web site.
