Archive for John Snow

The Effect [book review]

Posted in Books, R, Running, Statistics, University life on March 10, 2023 by xi'an

While it sounds like the title of a science-fiction catastrophe novel or of a (of course) convoluted nouveau roman, this book by Nick Huntington-Klein is a massive initiation to econometrics and causality, as explained by the subtitle, An Introduction to Research Design and Causality.

This is a hüûüge book, actually made of two parts that could have been books (volumes?). And covering three languages, R, Stata, and Python, which should have led to three independent books. (Seriously, why print three versions when you need at most one?!) I carried it with me during my vacations in Central Québec, but managed to lose my notes on the first part, which means missing the opportunity for biased quotes! It was mostly written during the COVID lockdown(s), which may account for a certain amount of verbosity and rambling around.

“My mom loved the first part of the book and she is allergic to statistics.”

The first half (which is in fact a third!) is conceptual (and chatty) and almost formula free, based on the postulate that “it’s a pretty slim portion of students who understand a method because of an equation” (p.xxii). For this reader (or rather reviewer), the reliance on explanations through example makes the reading much harder, as spotting the main point becomes more difficult (and requires reading most sentences!). It also makes for a very slow start, since notations and mathematical notions have to be introduced with an excess of caution (as in the distinction between Latin and Greek symbols, p.36). This part moves through single variable models, conditional distributions, with a lengthy explanation of how OLS estimates are derived, data generating processes and identification (of causes), causal diagrams, back and front doors (a recurrent notion within the book), treatment effects, and a conclusion chapter.

“Unlike statistical research, which is completely made of things that are at least slightly false, statistics itself is almost entirely true.” (p.327)

The second part, called the Toolbox, is closer to a classical introduction to econometrics, albeit with a shortage of mathematics (and no proof whatsoever), although [warning!] logarithms, polynomials, partial derivatives and matrices are used. Along with a sizeable (3x) chunk allocated to printed code, the density of the footnotes significantly increases in this section. It includes an extensive chapter on regression (including testing practice, non-linear and generalised linear models, as well as basic bootstrap without much warning about its use in… regression settings, and LASSO), one on matching (with propensity scores, kernel weighting, Mahalanobis weighting), one on simulation, yes simulation! in the sense of producing pseudo-data from known generating processes to check methods, as well as bootstrap (with resampling residuals making at last an appearance!), and one on fixed and random effects (where the author “feels the presence of Andrew Gelman reaching through time and space to disagree”, p.405). The chapter on event studies is about time dependent data with a bit of ARIMA prediction (but nothing on non-stationary series and unit root issues). The more exotic chapters cover (18) difference-in-differences models (control vs treated groups, with John Snow pumping his way in), (19) instrumental variables (aka the minor bane of my 1980s econometrics courses), with two-stage least squares and generalised method of moments (if not the simulated version), (20) discontinuity (i.e., changepoints), with the limitation of having a single variable explaining the change, rather than an unknown combination of them, and a rather pedestrian approach to the issue, (21) other methods (including the first mention of machine learning regression/prediction and some causal forests), concluding with an “Under the rug” portmanteau.
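
For readers unfamiliar with the simulation idea mentioned above (producing pseudo-data from a known generating process to check a method), here is a minimal Python sketch. It is my own illustration rather than code from the book: the generating process (a confounder w driving both x and y), the coefficient values, and the function name simulate_ols_slopes are all assumptions made for the example. It compares the OLS slope on x with and without w included as a control.

```python
import numpy as np


def simulate_ols_slopes(n_obs=500, n_reps=200, true_effect=2.0, seed=0):
    """Repeatedly draw pseudo-data and estimate the slope on x by OLS.

    Assumed (hypothetical) generating process, not taken from the book:
        w ~ N(0, 1)                               confounder
        x = w + noise                             treatment
        y = true_effect * x + 3 * w + noise       outcome
    Returns two arrays of slope estimates: without and with w as a control.
    """
    rng = np.random.default_rng(seed)
    naive, adjusted = [], []
    for _ in range(n_reps):
        w = rng.normal(size=n_obs)
        x = w + rng.normal(size=n_obs)
        y = true_effect * x + 3.0 * w + rng.normal(size=n_obs)
        # Design matrices, each with an intercept column
        X_naive = np.column_stack([np.ones(n_obs), x])
        X_adj = np.column_stack([np.ones(n_obs), x, w])
        beta_naive, *_ = np.linalg.lstsq(X_naive, y, rcond=None)
        beta_adj, *_ = np.linalg.lstsq(X_adj, y, rcond=None)
        naive.append(beta_naive[1])
        adjusted.append(beta_adj[1])
    return np.array(naive), np.array(adjusted)


naive, adjusted = simulate_ols_slopes()
print("true effect:         2.00")
print(f"mean naive slope:    {naive.mean():.2f}")
print(f"mean adjusted slope: {adjusted.mean():.2f}")
```

Under these assumptions the regression omitting w should drift towards 2 + 3 × cov(x, w)/var(x) = 3.5, while the regression controlling for w should centre on the true value of 2. Since the truth is known by construction, any gap between estimate and truth is attributable to the method rather than to the data.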

Nothing (afaict) on multivariate regressed variates and simultaneous equations. Hardly an occurrence of Bayesian modelling (p.581), vague enough to remind me of my first course in statistics and its one-line annihilation of the notion.

Duh cover, but nice edition, except for the huge margins that could have been cut to reduce the 622 pages by a third (and reined in the tendency of the author towards excessive footnotes!). And an unintentional white line on p.238! Cute and vaguely connected little drawings at the head of every chapter (like the head above). A rather terse subject index (except for the entry “The first reader to spot this wins ten bucks”!), which should have been complemented with an acronym index.

“Calculus-heads will recognize all of this as taking integrals of the density curve. Did you know there’s calculus hidden inside statistics? The things your professor won’t tell you until it’s too late to drop the class.”

Obviously I am biased in that I cannot negatively comment on an author running 5:37 a mile as, by now, I could only just compete, being far from the 5:15 of decades past! I am just a wee bit suspicious of the reported time, however, given that it happens exactly on page 537… (And I could have clearly taken issue with his 2014 paper, Is Robert anti-teacher? Or with the populist catering to anti-math attitudes, such as the above quote, found in a footnote!) But I enjoyed reading the conceptual chapter on causality as well as the (more) technical chapter on instrumental variables (a notion I have consistently found confusing all the [long] way from graduate school). And while repeated references are made to Scott Cunningham’s Causal Inference: The Mixtape, I think I will stop there with 500⁺ page introductory econometrics books!

[Disclaimer about potential self-plagiarism: this post or an edited version will potentially appear in my Books Review section in CHANCE.]

ravencry [book review]

Posted in Books, Kids, Travel on November 2, 2019 by xi'an

After enjoying Ed McDonald’s Blackwing this summer, I ordered the second volume, Ravencry, which I read in a couple of days between Warwick and Edinburgh.

“Valya had marked all of the impact sites, then numbered them according to the night they had struck. The first night was more widely distributed, the second slightly more clustered. As the nights passed, the clusters drew together with fewer and fewer outliers.”

Since this is a sequel, the fantasy universe in which the story takes place has not changed much, but gains in consistency and depth. Especially the wastelands created by the wizard controlling the central character. The characters are mostly the same, with the same limited ethics for the surviving ones!, albeit with unexpected twists (no spoiler!). The perils of a second volume, namely the sudden occurrence of a completely new and obviously deadly threat to the entire world, are mostly avoided by connecting quite closely with the first volume. Even the arch-exploited theme of a new religious cult fits the new plot rather nicely. Despite the urgency of the menace (as usual) to their world, the core characters do not do much in the first part of the book, engaged in a kind of detective work that is rather unusual for fantasy books, but the second part sees a lot of both action and explanation, which is why it became a page-turner for me. And while there are far fewer allusions to magical mathematics in this volume, a John Snow moment occurs near the above quote.

the Frankenstein chronicles

Posted in Statistics on March 31, 2019 by xi'an

Over a lazy weekend, I watched the TV series The Frankenstein Chronicles, which I found quite remarkable (if definitely Gothic and possibly too gory for some!). Connections with celebrities of (roughly) the time abound: While Mary Shelley makes an appearance in the first season of the series, not only as the writer of the famous novel (already famous in the novel as well) but also as a participant in a deadly experiment that would succeed in the novel (and eventually in the series), Charles Dickens is a constant witness to the unraveling of scary events as Boz the journalist, somewhat running after the facts, William Blake dies in one of the early episodes after painting a series of tarot-like cards that eventually explains it all, Ada Lovelace works on the robotic dual of Frankenstein, Robert Peel creates the first police force (which will be called the Bobbies after him!), John Snow’s uncovering of the cholera source as the pump of Broad Street is reinvented with more nefarious reasons, and possibly others. Besides these historical landmarks (!), the story revolves around the corpse trafficking that fed medical schools and plots for many a novel. The (true) Anatomy Act is about to pass to regulate body supply for anatomical purposes, and a debate ensues on the end of God that permeates mostly the first season and just a little bit the second season, which is more about State versus Church… The series is not without shortcomings, in particular a rather disconnected plot (which has the appeal of being unpredictable, jumping from one genre to the next) and a repeated proneness of the main character to being a scapegoat, but the reconstitution of London at the time is definitely impressive (although I cannot vouch for its authenticity!). Only the last episode of Season 2 falls a bit short in its delivery, by too conveniently tying up all loose threads.

in the time of cholera

Posted in Books, Kids, pictures, Travel on April 6, 2014 by xi'an
