## Bayes Rules! [book review]

Posted in Books, Kids, Mountains, pictures, R, Running, Statistics, University life on July 5, 2022 by xi'an

Bayes Rules! is a new introductory textbook on Applied Bayesian Model(l)ing, written by Alicia Johnson (Macalester College), Miles Ott (Johnson & Johnson), and Mine Dogucu (University of California Irvine). Textbook sent to me by CRC Press for review. It is available (free) online as a website and has a github site, as well as a bayesrules R package. (Which reminds me that both our own book R packages, bayess and mcsm, have gone obsolete on CRAN! And that I should find time to figure out the issue and upgrade them…)

As far as I can tell [from abroad and from only teaching students with a math background], Bayes Rules! seems to be catering to early (US) undergraduate students with very little exposure to mathematical statistics or probability, as it introduces basic probability notions like pmf, joint distribution, and Bayes’ theorem (as well as Greek letters!) and shies away from integration or algebra (a covariance matrix occurs on page 437). For instance, the Normal-Normal conjugacy derivation is considered a “mouthful” (page 113). The exposition is somewhat stretched along the 500⁺ pages as a result, imho, which is presumably a feature shared with most textbooks at this level, and, accordingly, the exercises and quizzes are more about intuition and reproducing the contents of the chapter than about technical skills. In fact, I did not spot there a mention of sufficiency, consistency, posterior concentration (almost made on page 113), improper priors, ergodicity, irreducibility, etc., while other notions are not precisely defined, like ESS, weakly informative (page 234) or vague priors (page 77), prior information (which makes the negative answer to the quiz “All priors are informative” (page 90) rather confusing), R-hat, density plot, scaled likelihood, and more.

As an alternative to “technical derivations”, Bayes Rules! centres on intuition and simulation (yay!) via its bayesrules R package, itself relying on rstan. Learning from example (as R code is always provided), the book proceeds through conjugate priors, MCMC (Metropolis-Hastings) methods, regression models, and hierarchical regression models. Quite impressive given the limited prerequisites set by the authors. (I appreciated the representations of the prior-likelihood-posterior, especially in the sequential case.)

Regarding the “hot tip” (page 108) that the posterior mean always stands between the prior mean and the data mean, this should be made conditional on a conjugate setting and a mean parameterisation. Defining MCMC as a method that produces a sequence of realisations that are not from the target makes a point, except of course that there are settings where the realisations are from the target, for instance after a renewal event. Tuning MCMC should remain a partial mystery to readers after reading Chapter 7 as the Goldilocks principle is quite vague. Similarly, the derivation of the hyperparameters in a novel setting (not covered by the book) should prove a challenge, even though the readers are encouraged to “go forth and do some Bayes things” (page 509).
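To make the conjugate caveat about the “hot tip” concrete, here is a sketch in the Beta-Binomial case (a standard illustration of my choosing, not taken from the book), with a Beta(a,b) prior on a proportion and y successes out of n trials. The posterior is Beta(a+y,b+n-y) and its mean decomposes as

$\frac{a+y}{a+b+n} = \frac{a+b}{a+b+n}\,\frac{a}{a+b} + \frac{n}{a+b+n}\,\frac{y}{n}$

that is, as a convex combination of the prior mean a/(a+b) and of the data mean y/n, hence necessarily standing in between. Nothing of the sort is guaranteed outside a conjugate setting with a mean parameterisation.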

While Bayes factors are supported for some hypothesis testing (with no point null), model comparison follows more exploratory methods like cross-validation and expected log-predictive comparison.

The examples and exercises are diverse (if mostly US-centric), modern (including cultural references that completely escape me), and often reflect on the authors’ societal concerns. In particular, their concern about a fair use of the inferred models is preeminent, even though a quantitative assessment of the degree of fairness would require a much more advanced perspective than the book allows… (In that respect, Exercise 18.2 and the following ones are about book banning (in the US). Given the progressive tone of the book, and the recent ban of math textbooks in the US, I wonder if some conservative boards would consider banning it!) Concerning the Himalaya summiting running example (Chapters 18 & 19), where the probability to summit is conditional on the age of the climber and the use of additional oxygen, I am somewhat surprised that the altitude of the targeted peak is not included as a covariate. For instance, Ama Dablam (6848 m) is compared with Annapurna I (8091 m), which has the highest fatality-to-summit ratio (38%) of all. This should matter more than age: the Aosta guide Abele Blanc climbed Annapurna without oxygen at age 57! More to the point, the (practical) detailed examples do not bring unexpected conclusions, as for instance the fact that runners [thrice alas!] tend to slow down with age.

A geographical comment: Uluru (page 267) is not a city! It is an impressive sandstone monolith in the heart of Australia, a five-hour drive away from Alice Springs. And historical mentions: Alan Turing (page 10) and the team at Bletchley Park indeed used Bayes factors (and sequential analysis) in cracking the Enigma, but this remained classified information for quite a while. Arianna Rosenbluth (page 10, but missing on page 165) was indeed a major contributor to Metropolis et al. (1953, not cited), but would not qualify as a Bayesian statistician as the goal of their algorithm was a characterisation of the Boltzmann (or Gibbs) distribution, not statistical inference. And David Blackwell’s (page 10) Basic Statistics is possibly the earliest instance of an introductory Bayesian and decision-theory textbook, but it never mentions Bayes or Bayesianism.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Book Review section in CHANCE.]

## a journal of the plague year [October reviews]

Posted in Books, Kids, Mountains, pictures, Travel on October 31, 2020 by xi'an

Read two more “little red” books from Éditions Guérin/Paulsen, the fantastic Chamonix editor, namely, Lénine à Chamonix by François Garde, a former Secretary-General of the Government of New-Caledonia, and Les Hallucinés (Un voyage dans les délires d’altitude), by Thomas Venin. The first book is a collection of short stories related to mountains, ranging from the realistic to the fantastic, and from good to terrible. I think in particular of the 1447 mètres story that involves a Holtanna-like big wall in Iceland [good start then!], possibly the Latrabjarg cliff (although it stands at 1447 feet, not meters!), and the absurd impact of prime numbers on the failure of the climbing team. Lénine à Chamonix muses on the supposed day Vladimir Illitch “Lenin” Ulyanov spent in Chamonix in 1903, almost losing his life but adopting his alias there [which clashes with its 1902 first occurrence in publications!]. The second book is about high altitude hallucinations as told by survivors from the “death zone”. Induced by hypoxia, they lead himalayists to see imaginary things or persons, sometimes to act against their own interest and often to die as a result. The stories are about those who survived and told about their visions. They reminded me of Abele Blanc telling us of facing the simultaneous hallucinations of two (!) partners during an attempt at Annapurna and managing to bring down one of the climbers, with the other managing on his own after a minor fall reset his brain to the real world. Touching the limits of human abilities and the mysterious working of the brain…

Cooked several dishes suggested by the New York Times (!), including a spinach risotto [good], orecchiette with fennel and sausages [great], and malai broccoli [not so great], as well as some of Yotam Ottolenghi’s recipes from the Guardian, like a yummy spinach-potato pie. As Fall is seeping in, went back to old classics like red cabbage Flemish style. And butternut soups, starting with our own. And a pumpkin biryani!

Read Peter Hamilton’s Salvation, with a certain reluctance to proceed as I found the stories within mostly disconnected and of limited interest. (This obviously came as a disappointment, having enjoyed Great North Road a lot.) I am unlikely to read the following volumes in the series. On the side, I heard that fantasy writer Terry Goodkind died on Sept. 17. He had written “The Sword of Truth” series, of which I read the first three volumes. (Out of 21 total!!!) While there were some qualities in the story, the setting was quite naïve (in the usual trope of an evil powerful character that needs to be fought at all costs) and the books carry a strong component of political conservatism as well as extensive sections of sadistic scenes.

Watched Tim Burton’s 2012 Dark Shadows (terrible!) and a Taiwanese 2018 dark comedy entitled Dear Ex (誰先愛上他的) which I found rather interesting and quite original, despite the overdone antics of the mother. I even tried Tim Burton’s Sweeney Todd for a few minutes, being completely unaware this was a musical!

## first 8000

Posted in Mountains on June 3, 2020 by xi'an

## Ueli Steck dies on Nuptse [Ueli Steck tödlich verunglückt]

Posted in Books, Mountains, Running on April 30, 2017 by xi'an

Ueli Steck was a Swiss climber renowned for breaking speed records on the hardest routes of the Alps, including the legendary Eigerwand. He had also been evacuated under death threats from the Everest base camp two years ago. I have been following on Instagram his preparation for another speed attempt at Everest over the past weeks and it is a huge shock to learn he fell to his death on Nuptse yesterday. Total respect to this immense Extrembergsteiger, who has now joined the sad cenacle of top climbers who did not make it back…

## optimal Bernoulli factory

Posted in Statistics on January 17, 2017 by xi'an

One of the last arXivals of the year was this paper by Luis Mendo on an optimal algorithm for Bernoulli factory (or Lovász’s or yet Basu’s) problems, i.e., for producing an unbiased estimate of f(p), 0<p<1, from an unrestricted number of Bernoulli trials with probability p of heads. (See, e.g., Mark Huber’s recent book for background.) This paper drove me to read an older 1999 unpublished document by Wästlund, unpublished because of the overlap with Keane and O’Brien (1994). One interesting gem in this document is that Wästlund produces a Bernoulli factory for the function f(p)=√p, which is not of considerable interest per se, but which was proposed to me as a puzzle by Professor Sinha during my visit to the Department of Statistics at the University of Calcutta, based on his 1979 paper with P.K. Banerjee. The algorithm is based on a stopping rule N: throw a fair coin until the number of heads n+1 is greater than the number of tails n. The event N=2n+1 occurs with probability

$\frac{1}{n+1}{2n \choose n} \big/ 2^{2n+1}$

[Using a biased coin with probability p to simulate a fair coin is straightforward.] Then flip the original coin n+1 times and produce a result of 1 if at least one toss gives heads. This happens with probability √p.
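As a sanity check on this √p factory, here is a minimal Python sketch of mine (not Wästlund's code). The function names are hypothetical, the fair coin is obtained by von Neumann's trick, and, as an assumption made purely for speed, the tosses of the original coin are interleaved with the fair ones: one original toss is made per fair head and the run stops at the first original head. Since every fair head is matched by exactly one original toss, the output distribution is the same as when first waiting for N and then tossing n+1 times, while avoiding the infinite expected duration of the fair-coin walk.

```python
import random


def biased_coin(p, rng):
    """One toss of the original coin: 1 (heads) with probability p."""
    return 1 if rng.random() < p else 0


def fair_coin(p, rng):
    """Von Neumann's trick: a fair bit out of pairs of biased tosses."""
    while True:
        first, second = biased_coin(p, rng), biased_coin(p, rng)
        if first != second:
            return first


def sqrt_factory(p, rng):
    """Return 1 with probability sqrt(p), using only tosses of the p-coin."""
    lead = 0  # number of fair heads minus number of fair tails so far
    while lead < 1:
        if fair_coin(p, rng):
            lead += 1
            # Lazy evaluation (assumption of this sketch): this fair head is
            # matched by one of the n+1 original tosses, made right away.
            if biased_coin(p, rng):
                return 1
        else:
            lead -= 1
    # Heads overtook tails and all n+1 original tosses were tails.
    return 0


def estimate_sqrt(p, n_runs=20000, seed=1):
    """Monte Carlo frequency of ones, to be compared with sqrt(p)."""
    rng = random.Random(seed)
    return sum(sqrt_factory(p, rng) for _ in range(n_runs)) / n_runs
```

Calling `estimate_sqrt(0.25)` should return a frequency close to 0.5, up to Monte Carlo error.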

Mendo generalises Wästlund’s algorithm to functions expressed as a power series in (1-p)

$f(p)=1-\sum_{i=1}^\infty c_i(1-p)^i$

with the sum of the weights being equal to one. This means proceeding through Bernoulli B(p) generations until one realisation is one or a probability

$c_i\big/\big(1-\sum_{j=1}^{i-1}c_j\big)$

event occurs [which can be derived from a Bernoulli B(p) sequence]. Furthermore, this version achieves asymptotic optimality in the number of tosses, thanks to a form of Cramér-Rao lower bound. (Which makes yet another connection with Kolkata!)
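The sequential scheme can be sketched in a few lines of Python, as a rough illustration under stated assumptions rather than Mendo's actual implementation. The function names are hypothetical and the auxiliary event with probability c_i/(1-c_1-…-c_{i-1}) is drawn from a uniform variate for simplicity, instead of being derived from the Bernoulli B(p) sequence. The coefficients are those of f(p)=√p, namely c_1=1/2 and c_{i+1}=c_i(2i-1)/(2i+2), which I worked out from the binomial series of 1-√(1-x).

```python
import math
import random


def sqrt_coefficients():
    """Yield c_1, c_2, ... such that sqrt(p) = 1 - sum_i c_i (1 - p)^i."""
    c, i = 0.5, 1
    while True:
        yield c
        c *= (2 * i - 1) / (2 * i + 2)  # ratio c_{i+1} / c_i
        i += 1


def series_factory(p, coefficients, rng):
    """Return 1 with probability 1 - sum_i c_i (1 - p)^i."""
    remaining = 1.0  # 1 - (c_1 + ... + c_{i-1})
    for c in coefficients():
        if rng.random() < p:  # the Bernoulli B(p) realisation is one
            return 1
        # Auxiliary event with probability c_i / (1 - sum_{j<i} c_j),
        # drawn here from a uniform variate (assumption of this sketch).
        if rng.random() < c / remaining:
            return 0
        remaining -= c


def estimate_series(p, n_runs=20000, seed=2):
    """Monte Carlo frequency of ones, to be compared with f(p)."""
    rng = random.Random(seed)
    hits = sum(series_factory(p, sqrt_coefficients, rng) for _ in range(n_runs))
    return hits / n_runs


def truncated_series(p, n_terms=200):
    """Deterministic check: 1 minus the first n_terms of the power series."""
    gen = sqrt_coefficients()
    return 1.0 - sum(next(gen) * (1 - p) ** i for i in range(1, n_terms + 1))
```

Both `estimate_series(0.25)` and `truncated_series(0.25)` should be close to `math.sqrt(0.25)`, the former up to Monte Carlo error. Each run stops as soon as a Bernoulli draw equals one, so the expected number of draws is finite for any p>0.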