## Archive for introductory textbooks

## anti-séche

Posted in Kids, pictures, University life with tags All of Statistics, central limit theorem, introductory textbooks, t-test, Université Paris Dauphine on December 21, 2014 by xi'an

## Bayes’ Rule [book review]

Posted in Books, Statistics, University life with tags Amazon, Bayes formula, Bayes rule, Bayes theorem, Bayesian Analysis, England, introductory textbooks, publishing, short course, Thomas Bayes' portrait, tutorial on July 10, 2014 by xi'an

**T**his introduction to Bayesian Analysis, Bayes’ Rule, was written by James Stone from the University of Sheffield, who contacted CHANCE suggesting a review of his book. I thus bought it from amazon to check the contents. And write a review.

**F**irst, the format of the book. It is a short paper of 127 pages, plus 40 pages of glossary, appendices, references and index. I eventually found the name of the publisher, Sebtel Press, but for a while thought the book was self-produced. While the LaTeX output is fine and the (Matlab) graphs readable, pictures are not of the best quality and the display editing is minimal in that there are several huge white spaces between pages. Nothing major there, obviously, it simply makes the book look like course notes, but this is in no way detrimental to its potential appeal. (I will not comment on the numerous appearances of Bayes’ alleged portrait in the book.)

“… (on average) the adjusted value θ^{MAP} is more accurate than θ^{MLE}.” (p.82)

Bayes’ Rule has the interesting feature that, in the very first chapter, after spending a rather long time on Bayes’ formula, it introduces Bayes factors (p.15). With the somewhat confusing choice of calling the *prior* probabilities of hypotheses *marginal* probabilities. Even though they are indeed *marginal* given the joint, *marginal* is usually reserved for the sample, as in *marginal* likelihood. Before returning to more (binary) applications of Bayes’ formula for the rest of the chapter. The second chapter is about probability theory, which means here introducing the three axioms of probability and discussing geometric interpretations of those axioms and Bayes’ rule. Chapter 3 moves to the case of discrete random variables with more than two values, i.e. contingency tables, on which the range of probability distributions is (re-)defined and produces a new entry to Bayes’ rule. And to the MAP. Given this pattern, it is not surprising that Chapter 4 does the same for continuous parameters. The parameter of a coin flip. This allows for discussion of uniform and reference priors. Including maximum entropy priors à la Jaynes. And bootstrap samples presented as approximating the posterior distribution under the “fairest prior”. And even two pages on standard loss functions. This chapter is followed by a short chapter dedicated to estimating a normal mean, then another short one on exploring the notion of a continuous joint (Gaussian) density.
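The terminological point about *prior* versus *marginal* can be made concrete for two hypotheses. The sketch below uses standard notation; the symbols $H_0$, $H_1$, $x$ and $m(x)$ are chosen here for illustration and are not taken from the book.

```latex
\underbrace{\frac{P(H_1 \mid x)}{P(H_0 \mid x)}}_{\text{posterior odds}}
  = \underbrace{\frac{p(x \mid H_1)}{p(x \mid H_0)}}_{\text{Bayes factor}}
    \times
    \underbrace{\frac{P(H_1)}{P(H_0)}}_{\text{prior odds}},
\qquad
m(x) = p(x \mid H_0)\,P(H_0) + p(x \mid H_1)\,P(H_1).
```

Here $P(H_i)$ is what is usually called the prior probability of hypothesis $H_i$. It is a marginal of the joint distribution on (hypothesis, data), but the word *marginal* is more commonly attached to $m(x)$, the marginal likelihood of the sample.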

“To some people the word *Bayesian* is like a red rag to a bull.” (p.119)

Bayes’ Rule concludes with a chapter entitled *Bayesian wars*. A rather surprising choice, given the intended audience. Which is rather bound to confuse this audience… The first part is about probabilistic ways of representing information, leading to subjective probability. The discussion goes on for a few pages to justify the use of priors but I find completely unfair the argument that because Bayes’ rule is a mathematical theorem, it “has been proven to be true”. It is indeed a maths theorem, however that does not imply that any inference based on this theorem is correct! (A surprising parallel is Kadane’s Principles of Uncertainty with its anti-objective final chapter.)

**A**ll in all, I remain puzzled after reading Bayes’ Rule. Puzzled by the intended audience, as contrary to other books I recently reviewed, the author does not shy away from mathematical notations and concepts, even though he proceeds quite gently through the basics of probability. Therefore, potential readers need some modicum of mathematical background that some students may miss (although it actually corresponds to what my kids would have learned in high school). It could thus constitute a soft entry to Bayesian concepts, before taking a formal course on Bayesian analysis. Hence doing no harm to the perception of the field.

## straightforward statistics [book review]

Posted in Books, Kids, Statistics, University life with tags hypothesis testing, introductory textbooks, multiple tests, Oxford University Press, p-values, power, psychology, tests on July 3, 2014 by xi'an

“I took two different statistics courses as an undergraduate psychology major [and] four different advanced statistics classes as a PhD student.” G. Geher

*Straightforward Statistics: Understanding the Tools of Research* by Glenn Geher and Sara Hall is an introductory textbook for psychology and other social science students. (That Oxford University Press sent me for review in CHANCE. Nice cover, by the way!) I can spot the purpose behind the title, a purpose heavily stressed anew in the preface and the first chapter, but it nonetheless irks me as conveying the message that one semester of reasonable diligence in class will suffice for any college student to *“not only understanding research findings from psychology, but also to uncovering new truths about the world and our place in it”* (p.9). Nothing less. While, in essence, it covers the basics found in all introductory textbooks, from descriptive statistics to ANOVA models. The inclusion of “real research examples” in the chapters of the book rather demonstrates how far from real research a reader of the book would stand…

## Statistical modeling and computation [apologies]

Posted in Books, R, Statistics, University life with tags apologies, Australia, Bayesian statistics, Dirk Kroese, introductory textbooks, Joshua Chan, Monte Carlo methods, Monte Carlo Statistical Methods, R, state space model, Statistical Modeling, typo on June 11, 2014 by xi'an

**I**n my book review of the recent book by Dirk Kroese and Joshua Chan, *Statistical Modeling and Computation*, I mistakenly and persistently typed the name of the second author as Joshua Chen. This typo alas made it to the printed and on-line versions of the subsequent CHANCE **27**(2) column. I am thus very much sorry for this mistake of mine and most sincerely apologise to the authors. Indeed, it always annoys me to have my name mistyped (usually as Roberts!) in references. *[If nothing else, this typo signals it is high time for a change of my prescription glasses.]*

## Statistical modeling and computation [book review]

Posted in Books, R, Statistics, University life with tags ANU, Australia, Bayesian Essentials with R, Bayesian statistics, Brisbane, Dirk Kroese, introductory textbooks, Joshua Chan, Matlab, maximum likelihood estimation, Monte Carlo methods, Monte Carlo Statistical Methods, R, state space model on January 22, 2014 by xi'an

**D**irk Kroese (from UQ, Brisbane) and Joshua Chan (from ANU, Canberra) just published a book entitled *Statistical Modeling and Computation*, distributed by Springer-Verlag (I cannot tell which series it is part of from the cover or frontpages…) The book is intended mostly for an undergrad audience (or for graduate students with no probability or statistics background). Given that prerequisite, *Statistical Modeling and Computation* is fairly standard in that it recalls probability basics, the principles of statistical inference, and classical parametric models. In a third part, the authors cover “advanced models” like generalised linear models, time series and state-space models. The specificity of the book lies in the inclusion of simulation methods, in particular MCMC methods, and illustrations by Matlab code boxes. (Codes that are available on the companion website, along with R translations.) It thus has a lot in common with our *Bayesian Essentials with R*, meaning that I am not the most appropriate or least ~~un~~biased reviewer for this book.

## the cartoon introduction to statistics

Posted in Books, Kids, Statistics, University life with tags book review, cartoon, CHANCE, introductory textbooks, Statistics, textbooks on May 16, 2013 by xi'an

**A** few weeks ago, I received a copy of The Cartoon Introduction to Statistics by Grady Klein and Alan Dabney, sent by their publisher, Farrar, Straus and Giroux from New York City. (Never heard of this publisher previously, but I must admit the aggregation of those three names sounds great!) As this was an unpublished version of the book, to appear in July 2013, I first assumed my copy was a draft version, with black and white drawings using limited precision graphics. However, when checking the already published Cartoon Introduction to Economics, I realised this was the style of Grady Klein (as reflected below).

**T**hus, I have to assume this is what The Cartoon Introduction to Statistics will look like when published in July… Actually, I later received a second copy of the definitive version, so I can guarantee this is the case. (Funnily enough, there is a supportive quote from the author of Naked Statistics on the back-cover!) I am quite perplexed by the whole project. First, I do not see how a newcomer to the field can learn better from a cartoon with an average of four sentences per page than from a regular introductory textbook. Cartoons introduce an element of fun into the explanation, with jokes and (irrelevant) side stories, but they are also distracting as readers are not always in a position to know what matters and what does not. Second, as the drawings are done in a rough style, I find this increases the potential for confusion. For instance, the above cover reproduces an example linking the histogram of a sample of averages and the normal distribution. If a reader has never heard of histograms, I do not see how he or she could gather how they are constructed in practice. The width of the bags is related to the number of persons in each bag (50 random Americans) in the story, while it should be related to the inverse of the square root of this number in the theory. Similarly, I find the explanation about confidence intervals lacking: when trying to reassure the readers about the fact that any given random sample from a population might be misleading, the authors state that “in the long run most cans [of worms] have averages in the clump under the hump [of the normal pdf]”. This is not reassuring at all: when using confidence intervals based on 10 or on 10⁵ normal observations, the corresponding 95% confidence intervals on their mean both have a 95% chance to contain the true mean. The long run aspect refers to the repeated use of those intervals. (I am not even mentioning the classical fallacy of stating that “we are 99.7% confident that the population average is somewhere between -1.73 and -0.27”…)
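The coverage remark can be checked by simulation. The following is only a sketch under stated assumptions, not anything taken from the book: it assumes normal data with a known standard deviation, uses the z-interval (mean ± 1.96 σ/√n), and draws the sample mean directly from its exact N(μ, σ²/n) distribution rather than averaging n observations. The function name `ci_coverage` and its parameters are hypothetical.

```python
import math
import random


def ci_coverage(n, reps=20000, mu=0.0, sigma=1.0, z=1.96, seed=1):
    """Return (fraction of z-intervals containing mu, interval half-width).

    Assumes normal observations with known sigma (an assumption made
    for this illustration only).
    """
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)   # standard error of the sample mean
    half_width = z * se         # half-width of the nominal 95% interval
    hits = 0
    for _ in range(reps):
        # The mean of n iid N(mu, sigma^2) draws is exactly N(mu, sigma^2 / n),
        # so it is simulated directly instead of averaging n draws.
        xbar = rng.gauss(mu, se)
        if abs(xbar - mu) <= half_width:
            hits += 1
    return hits / reps, half_width


for n in (10, 10**5):
    coverage, half_width = ci_coverage(n)
    print(f"n={n:>6}: coverage ~ {coverage:.3f}, half-width = {half_width:.4f}")
```

Both sample sizes should give a coverage close to 95%. What changes with n is the half-width, which shrinks like 1/√n. The "long run" is the repetition of the interval construction, not the size of any single sample.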

**I**n conclusion, I remember buying an illustrated entry to Marx’ Das Kapital when I started economics in graduate school (as a minor). This gave me a very quick idea of the purpose of the book. However, I read through the whole book to understand (or try to understand) Marx’ analysis of the economy. And the introduction did not help much in this regard. In the present setting, we are dealing with statistics, not economics, not philosophy. Having read a cartoon about the average length of worms within a can of worms is not going to help much in understanding the Central Limit Theorem and the subsequent derivation of confidence intervals. The validation of statistical methods is done through mathematics, which provides a formal language cartoons cannot reproduce.

## a brief on naked statistics

Posted in Books, R, Statistics, University life with tags book review, general public, How to Lie with Statistics, India, introductory textbooks, masala chai, Naked Economics, Naked Statistics, Zen, Zeno's paradox on April 3, 2013 by xi'an

**O**ver the last Sunday breakfast I went through *Naked Statistics: Stripping the Dread from the Data*. The first two pages managed to put me in a prejudiced mood for the rest of the book. To wit: the author starts with some math bashing (like, no one ever bothers to tell us about the uses of high school calculus!) either because he really feels like this or because it pays with the intended audience (like, we are on the same side, pal!), he then shows how he outsmarted his high school math teacher by spotting the exam was not possibly designed for his class and then another math teacher by just… re-inventing the steps leading to Zeno’s paradox (said Zeno of Elea not appearing in the credits of the book, to be sure) and sums it up with an NRA argument: *“statistics is like a high-caliber weapon: helpful when used correctly”* (p.xiv). Add to that a highly ethnocentric perspective that makes the book hardly readable for anyone outside the US, due to its absolute focus on all things American (exaggerating just a wee bit: *who are LeBron James, Kim Kardashian, and Dan Rather?! what is Netflix?! why’s this Donald Rumsfeld guy quoted throughout the book?! how do they play baseball?! What do NBA, NHL, and SAT stand for?!* *&tc.*), as best illustrated by the facts that it took Charles Wheelan three months to realise a (golf) laser measuring instrument he had received could be in another unit than *feet*, namely *meters*!, and that he considers paying 100 rupees for a chai (मसाला चाय) in India a cheap price when this amount roughly corresponds to the average daily salary there… Top the whole thing with the fact that the author has already written a *Naked Economics* and seemingly found gold. (I am desperate for the incoming *Naked Paleopathology* tome in the series!) And there you get me stuck with such a highly negative *a priori* about *Naked Statistics* that I could not shake it off for the rest of the book.

“This book will not make you a statistical expert (…) This book is not a textbook.” (p.xv)

**W**ith this warning in mind about my bias, let’s get on with what’s in this book. The above tells us what isn’t. To quote further from the author, the book “*has been designed to introduce the statistical concepts with the most relevance to everyday life*” (p.xv). *Naked Statistics* goes over the basic notions of statistics (mean, standard deviation, correlation, linear regression, testing, design, polling), gives a sprinkle of probability background (counting models and the central limit theorem, which Wheelan considers as part of statistics), and spends the remaining chapters warning the reader(s) about the possible misuses of models and statistical tools if implemented in the wrong situations or with the wrong type of data. (There are a few graphs, but they are not particularly inspiring.) All this done with the minimum amount of maths formulae, mostly hidden in footnotes and appendices. (But then why add an extra formula for σ when one is given just before for σ²?!) Sometimes, the minimum is not enough, as demonstrated by the “formula for calculating the correlation coefficient” (p.61) which takes a whole page of text to get around this absurdity of not using maths symbols like Σ and concludes with the lame *“I’ll wave my hands and let the computer do the work”* (p.61)! Somewhat surprisingly, given the low-key nature of the book, it includes a final appendix on statistical software. From Excel, to SAS, Stata, and …R! While I am pleased at this inclusion, it sounds very much orthogonal to the purpose and the intended audience of *Naked Statistics.* I cannot fathom anyone reading the book and then immediately embarking upon writing an R code without stopping by a statistics textbook or formal training. (Incidentally, the author reproduces the usual confusion between free and open source, p.259.)
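For reference, the two formulas mentioned above fit on a line each once Σ is allowed. The notation below is the usual textbook one for a sample of n pairs $(x_i, y_i)$ and is not copied from the book, which may use a different normalisation (for instance n − 1 in place of n for the variance).

```latex
\sigma = \sqrt{\sigma^2},
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}.
```

The first identity shows why a separate formula for σ is redundant once σ² has been given. The last expression is the standard sample (Pearson) correlation coefficient.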