straightforward statistics [book review]
“I took two different statistics courses as an undergraduate psychology major [and] four different advanced statistics classes as a PhD student.” G. Geher
Straightforward Statistics: Understanding the Tools of Research by Glenn Geher and Sara Hall is an introductory textbook for psychology and other social science students. (That Oxford University Press sent me for review in CHANCE. Nice cover, by the way!) I can spot the purpose behind the title, a purpose heavily stressed anew in the preface and the first chapter, but it nonetheless irks me as conveying the message that one semester of reasonable diligence in class will suffice to bring any college student to “not only understanding research findings from psychology, but also to uncovering new truths about the world and our place in it” (p.9). Nothing less. Yet, in essence, the book covers no more than the basics found in all introductory textbooks, from descriptive statistics to ANOVA models. The inclusion of “real research examples” in the chapters of the book rather demonstrates how far from real research a reader of the book would stand…
“However, as often happen in life, we can be wrong…” (p.66)
The book aims at teaching basic statistics to “undergraduate students who are afraid of math” (p.xiii), by using “an accessible, accurate, coherent, and engaging presentation of statistics” (p.xiv) and by reducing the maths expressions to a bare minimum. Unfortunately, the very first formula (p.19) is meaningless (skipping the individual indices in sums is the rule throughout the book)
and the second one (Table 2.7, p.22 and again Tables 2.19 and 2.20, p.43)
is (a) missing both the indices and the summation symbol and (b) dividing the sum of “squared deviation scores” by N rather than the customary N-1. I also fail to see the point of providing histograms for categorical variables with only two modalities, like “Hungry” and “Not Hungry” (Fig. 2.11, p.47)…
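To make point (b) concrete, here is a minimal sketch comparing the two divisors on a small sample. The scores are made up for illustration and do not come from the book; the variable names are mine.

```python
import statistics

# Hypothetical scores, invented for this illustration (not from the book).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean = sum(scores) / n
# Sum of squared deviation scores, with the index made explicit by the loop.
ss = sum((x - mean) ** 2 for x in scores)

var_n = ss / n         # divisor N, the convention the review objects to
var_n1 = ss / (n - 1)  # divisor N-1, the customary sample variance

print(var_n, var_n1)
# The standard library exposes both conventions under different names.
print(statistics.pvariance(scores), statistics.variance(scores))
```

The gap between the two versions shrinks as N grows, but on the small samples typical of textbook exercises it is far from negligible.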
“Statisticians never prove anything-thereby making prove something of a dirty word.” (p.116)
As I only teach math students, I cannot judge how adequate the textbook is for psychology or other social science students. It nonetheless sounds highly verbose to me in its attempts to bypass maths formulas. For instance, the 15 pages of the chapter on standardised scores are about moving back and forth between the raw data and its standardised version
Or the two pages (pp.71-72) of motivations on the r coefficient before the (again meaningless) formula
which even skips indices of the z-scores to avoid frightening the students. (The book also asserts that a correlation of zero “corresponds to no mathematical relationship between [the] two variables whatsoever”, p.70.) Or yet the formula for (raw-score) regression (p.97) given as
without defining B. Which is apparently a typo as the standardised regression used β… I could keep going with such examples but the point I want to make is that, if the authors want to reach students that have fundamental problems with a formula like
which does not appear in the book, they could expose them to the analysis and understanding of the output of statistical software rather than spending a large part of the book on computing elementary quantities like the coefficients of a simple linear regression by hand. Instead, a fundamental notion like multivariate regression is relegated to an appendix (E) as “Advanced statistics to be aware of”. Plus a two-page discussion (pp.104-105) of a study conducted by the first author on predicting preference for vaginal sex. (To put things into context, the first author also wrote Mating Intelligence Unleashed. Which explains some unusual “real research examples”.)
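As for the claim quoted above that a zero correlation means “no mathematical relationship … whatsoever”, a small counter-example may help. The sketch below uses made-up data where y is an exact function of x and yet the r coefficient vanishes; the helper `pearson_r` is my own illustrative function, not something taken from the book.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient, written with explicit sums."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y is fully determined by x through y = x**2.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # zero: no *linear* association, despite perfect dependence
```

The r coefficient only measures linear association, so a symmetric relationship like this one goes entirely undetected.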
Another illustration of what I consider as the wrong focus of the book is provided by the introduction to (probability and) the normal distribution in Chapter 6, which dedicates most of its pages to reading the area under the normal density from a normal table without even providing the formula of this density. (And with an interesting typo in Figure 6.4.) Indeed, as in last-century textbooks, the book does include probability tables for standard tests. Rather than relying on software and pocket calculators to move on to the probabilistic interpretation of p-values. And to the multi-layered caution that is necessary when handling hypotheses labelled as significant. (A caution pushed to its paroxysm in The Cult of Significance I reviewed a while ago.) The book includes a chapter on power but, besides handling coordinate axes in a weird manner (check from Fig. 9.5 onwards) and repeating everything twice for left- and right-one-sided hypotheses!, it makes the computation of power appear like the main difficulty when it is its interpretation that is most delicate and fraught with danger. Were I to teach (classical) testing to math-averse undergrads, and I may actually have to next year!, I would skip the technicalities and pile up cases and counter-cases explaining why p-values and power are not the end of the analysis. (Using Andrew’s blog as a good reservoir for such cases, as illustrated by his talk in Chamonix last January!) But I did not see any warning in that book on the dangers of manipulating data, formulating hypotheses to test out of the data, running multiple tests with no correction and so on.
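To give an idea of what is at stake with uncorrected multiple tests, here is a minimal sketch of the probability of at least one false positive among m tests when all null hypotheses are true. It assumes the tests are independent, which is my simplifying assumption and rarely holds exactly in practice; the function name `familywise_error` is mine.

```python
def familywise_error(alpha, m):
    """Probability of at least one false rejection among m independent
    tests run at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** m

alpha, m = 0.05, 20

uncorrected = familywise_error(alpha, m)     # each test at level alpha
bonferroni = familywise_error(alpha / m, m)  # each test at level alpha/m

print(f"uncorrected: {uncorrected:.3f}")
print(f"Bonferroni:  {bonferroni:.3f}")
```

With twenty tests at the 5% level, a “significant” finding is more likely than not under the null alone, while the (conservative) Bonferroni correction brings the overall error rate back below 5%.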
To conclude this extensive review, I, as an outsider, fail to see redeeming features that would single out Straightforward Statistics: Understanding the Tools of Research as a particularly enticing textbook. The authors have clearly put a lot of effort into their book, adopted what they think is the most appropriate tone to reach the students, and added very detailed homework problems and their solutions. Still, this view makes statistics sound too straightforward and leads to the far too common perception of p-values as the ultimate assessment of statistical significance, without opening to alternative explanations such as outliers and model misspecification.
[Warning: this review has been published in a slightly edited version in CHANCE, Nov. 2014.]