Archive for CHANCE

Casanova’s Lottery [book review]

Posted in Books, Statistics, University life on January 12, 2023 by xi'an

This “history of a revolutionary game of chance” is the latest book by Stephen Stigler and is indeed of an historical nature, following the French Lottery from its inception as Loterie royale in 1758 to the Loterie Nationale in 1836 (with the intermediate names of Loterie de France, Loterie Nationale, Loterie impériale, Loterie royale reflecting the agitated history of the turn of that Century!).

The incentive for following this State lottery is that it was exceptional in its mathematical foundations. Contrary to other lotteries of the time, it was indeed grounded in the averaging of losses and gains in the long run (for the State). The French (Royal) State thus accepted the possibility of huge losses on some draws, since they would be compensated by even larger gains. The reasoning proved most correct, since the Loterie went on to provide as much as 4% of the overall State budget, despite the running costs of maintaining a network of betting places and employees, who had to be mathematically savvy in order to compute the exact gains of the winners. This is rather amazing, as the understanding of the Law of Large Numbers was quite fresh (on an historical scale), thanks to the considerable advances made by Pascal, Fermat, (Jakob) Bernoulli and a few others. (The book mentions the Encyclopedist and mathematician Jean d'Alembert as being present at the meeting that decided on the creation of the Loterie in 1757.)
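
To make the averaging argument concrete, here is a minimal R sketch of the State's long-run margin, assuming the commonly reported terms of the simplest bet (the extrait: a stake on a single number, paid 15 times the stake when that number is among the 5 drawn out of 90, against true odds of 18 to 1 — the payoff figure is the one usually cited, not a quote from this post):

```r
## Long-run economics of the Loterie, under the assumed extrait terms:
## win 15x the stake if your number is among the 5 drawn out of 90
set.seed(1758)
n_draws <- 1e5                                      # simulated draws
wins <- replicate(n_draws, 3 %in% sample(1:90, 5))  # always betting on 3
payout <- ifelse(wins, 15, 0)                       # return per unit staked
1 - mean(payout)                # State margin, converging to 1 - 15/18 = 1/6
```

Individual draws may thus cost the State dearly after a run of winners, but the one-sixth expected margin dominates in the long run.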

One may wonder why Casanova gets the credit for this lottery. In true agreement with Stigler's Law, it is directly connected with the Genoan lottery and subsequent avatars in some Italian cities, including Casanova's Venezia. But jack-of-all-trades Casanova was instrumental in selling the notion to the French State, having landed in Paris after a daring flight from the Serenissima's jails. After succeeding in convincing the King's officers to launch the scheme crafted by a certain Ranieri (de') Calzabigi—not to be confused with the much maligned Salieri!—who would later collaborate with Gluck on Orfeo ed Euridice and Alceste, Casanova received a salary from the Loterie administration and further ran several betting offices. Until he left Paris for further adventures! Including an attempt to reproduce the lottery in Berlin, where Frederick II proved less receptive than Louis XV. (Possibly due to Euler's cautionary advice.) The final sentence of the book stands by its title: “It was indeed Casanova's lottery” (p.210).

Unsurprisingly, given Stephen's fascination with Pierre-Simon Laplace, the great man plays a role in the history, first by writing in 1774 one of his earliest papers on a lottery problem, namely the distribution of the number of draws needed for all 90 numbers to appear. His (correct) solution is an alternating sum whose numerical evaluation proved a challenge. Thirty years later, Laplace came up with a good and manageable approximation (see Appendix Two). Laplace also contributed to the end of the Loterie by arguing on moral grounds against this “voluntary” tax, alongside Talleyrand, a fellow expert at perpetually adapting to changing political regimes. It is a bit of a surprise to read that this rather profitable venture ended in 1836, more under bankers' than moralists' pressure. (A new national lottery—based on printed tickets rather than bets on results—was created a century later, in 1933, and survived the Second World War, with the French Loto appearing in 1974 as a direct successor to Casanova's lottery.)
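
For the record, the alternating sum follows from inclusion-exclusion: the probability that some number is still missing after n draws is Σₖ (−1)^(k+1) C(90,k) [C(90−k,5)/C(90,5)]ⁿ, whose enormous binomial coefficients cancel catastrophically in finite-precision arithmetic—presumably the very challenge Laplace faced. A short simulation (my own sketch, not the book's) sidesteps the issue:

```r
## Laplace's 1774 problem by simulation: number of 5-out-of-90 draws
## needed before every one of the 90 numbers has appeared at least once
set.seed(1774)
one_collection <- function() {
  seen <- logical(90); n <- 0
  while (!all(seen)) {
    seen[sample(1:90, 5)] <- TRUE   # mark the 5 numbers of this draw
    n <- n + 1
  }
  n
}
draws <- replicate(1e4, one_collection())
mean(draws)                        # Monte Carlo estimate of the expectation
quantile(draws, c(.05, .5, .95))   # and of the spread
```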

The book covers many fascinating aspects, from the daily run of the Loterie, to the various measures (successfully) taken against fraud, to the survival during the Révolution and its extension through the (Napoleonic) Empire, to tests of fairness based on the numerous data found in almanacs, to the behaviour of bettors and the sale of “helping” books, to (Daniel) Bernoulli, Buffon, Condorcet, and Laplace modelling rewards and supporting decreasing marginal utility. Note that there are hardly any mathematical formulas, except for an appendix on the probabilities of wins and the returns, as well as Laplace's (and Legendre's) derivations. Which makes the book eminently suited for a large audience, all the more thanks to Stephen Stigler's perfect style.

This (paperback) book is also very pleasantly designed by the University of Chicago Press, with an agreeable font (Adobe Caslon Pro) and a very nice cover involving Laplace undercover, taken from a painting owned by the author. The many reproductions of epoch documents are well done and easily readable. And, needless to say given Stephen's scholarship, the reference list is impressive.

The book is a testament to the remarkable skills of Stephen, who searched for material over thirty years, from Parisian specialised booksellers to French, English, and American archives. He manages to bring into the story a wealth of connections and characters, as for instance Voltaire's scheme to take advantage of an earlier French State lottery aimed at reimbursing State debtors. (Voltaire actually made a fortune of several million francs out of this poorly designed lottery.) For my personal instruction, the book also put life into the names of several Métro stations, like Pereire and Duverney. But the book's contents will prove fascinating well beyond Parisian locals and francophiles. Enjoy!

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE. As appropriate for a book about capitalising on chance beliefs!]

Bayes Factors for Forensic Decision Analyses with R [book review]

Posted in Books, R, Statistics on November 28, 2022 by xi'an

My friend EJ Wagenmakers pointed me towards an entire book on the BF by Bozza (from Ca' Foscari, Venezia), Taroni and Biedermann. It provides a sort of blueprint for using Bayes factors in forensics, for both investigative and evaluative purposes, with R code and free access. I am of course unable to judge the relevance of the approach for forensic science (I was under the impression that Bayesian arguments were usually not well-received in the courtroom), but I find that overall the approach is rather one of repositioning the standard Bayesian tools within a forensic framework.

“The [evaluative] purpose is to assign a value to the result of a comparison between an item of unknown source and an item from a known source.”

And thus I found nothing shocking or striking in this standard presentation of Bayes factors, including the call to loss functions, if a bit overly expansive in its exposition. The style is also classical, with a choice of grey-background vignettes for R coding parts that we also picked in our own R books! If anything, I would have expected more realistic discussions and illustrations of prior specification across the hypotheses (see e.g. page 34), while the authors mostly centre on conjugate priors and the (de Finetti) trick of the equivalent prior sample size. Bayes factors are mostly assessed using a conservative version of Jeffreys' “scale of evidence”. The computational section of the book introduces MCMC (briefly) and mentions importance sampling, the harmonic mean (with a minimalist warning), and Chib's formula (with no warning whatsoever).
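
As an aside on that minimalist warning, a toy Normal experiment of my own (not from the book) shows how badly the harmonic mean estimator of the evidence can behave, even when fed exact posterior draws and compared with the closed-form marginal likelihood:

```r
## Harmonic mean estimator of the evidence for x_i ~ N(theta,1), theta ~ N(0,1)
set.seed(101)
n <- 20; x <- rnorm(n, mean = 1)
loglik <- function(theta)
  sapply(theta, function(t) sum(dnorm(x, t, 1, log = TRUE)))
## exact log-evidence: marginally, x ~ N_n(0, I + 11')
logm <- -n/2*log(2*pi) - log(1 + n)/2 - (sum(x^2) - sum(x)^2/(1 + n))/2
lme <- function(v) max(v) + log(mean(exp(v - max(v))))   # log-mean-exp
hm <- replicate(50, {
  theta <- rnorm(1e4, sum(x)/(n + 1), sqrt(1/(n + 1)))   # exact posterior draws
  -lme(-loglik(theta))                     # log of the harmonic mean estimate
})
c(exact = logm, estim = mean(hm), sd = sd(hm))   # biased and wildly unstable
```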

“The [investigative] purpose is to provide information in investigative proceedings (…) The scientist (…) uses the findings to generate hypotheses and suggestions for explanations of observations, in order to give guidance to investigators or litigants.”

Chapter 2 is about standard models: inference on a proportion, with some Monte Carlo illustration and the complication of background elements, and on a normal mean, with an improper prior making an appearance [on p.69] without any mention of the general prohibition of such generalised priors when using Bayes factors, or even of the Lindley-Jeffreys paradox. Again, the main difference with Bayesian textbooks stands with the chosen examples.

Chapter 3 focuses on evidence evaluation [not in the computational sense] but, again, the coverage is about standard models: processing the binomial, multinomial, and Poisson models, again through conjugates. (With the side remark that Fig 3.2 is rather unhelpful: when moving the prior probability of the null from zero to one, its posterior probability also moves from zero to one!) We are back to the Normal mean case, with the model variance being known, then unknown. (Plus an unintentionally funny remark (p.96) about the dependence between mean and variance being seen as too restrictive and replaced with… independence!) At last (for me!), the book points out [p.99] that the BF is highly sensitive to the choice of the prior variance (Lindley-Jeffreys, where art thou?!), but with a return of the improper prior (on said variance, p.102) with no debate on the ensuing validity of the BF. Multivariate Normals are also presented, with Wishart priors on the precision matrix, and more details about Chib's estimate of the evidence. This chapter also contains illustrations of the so-called score-based BF, which is simply (?) a Bayes factor using a distribution on a distance summary (between an hypothetical population and the data) and an approximation of the distributions of these summaries, provided enough data is available… I also spotted a potentially interesting foray into BF variability (Section 3.4.2), although not reaching all the way to a notion of BF posterior distributions.
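
Since the point deserves a demonstration, here is a two-line check of my own (not from the book) of that sensitivity, testing θ = 0 against θ ~ N(0, τ²) based on x̄ ~ N(θ, 1/n):

```r
## Lindley-Jeffreys in action: the same mildly "significant" xbar
## increasingly supports the null as the prior variance tau^2 grows
n <- 50; xbar <- 0.3            # z = 2.12, p = 0.034 frequentist-wise
tau <- 10^(0:4)
bf01 <- dnorm(xbar, 0, sqrt(1/n)) / dnorm(xbar, 0, sqrt(tau^2 + 1/n))
rbind(tau = tau, BF01 = round(bf01, 2))  # BF for the null diverges with tau
```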

Chapter 4 turns to Bayes factors for investigation, where the alternative(s) is (are) less specified, as in testing, e.g., Basmati rice vs. non-Basmati rice. But there is no non-parametric alternative considered in the book. Otherwise, it looks to me rather similar to Chapter 3, i.e., back to binomial and multinomial models, with more discussion on prior specification; more normal, or non-normal, models, where the prior distribution is puzzlingly estimated by a kernel density estimator; a portmanteau alternative (p.157); more multivariate Normals with Wishart priors; and an entry on classification & discrimination.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE. As appropriate for a book about Chance!]

Bayes Rules! [book review]

Posted in Books, Kids, Mountains, pictures, R, Running, Statistics, University life on July 5, 2022 by xi'an

Bayes Rules! is a new introductory textbook on Applied Bayesian Model(l)ing, written by Alicia Johnson (Macalester College), Miles Ott (Johnson & Johnson), and Mine Dogucu (University of California Irvine), and sent to me by CRC Press for review. It is available (free) online as a website and has a GitHub site, as well as a bayesrules R package. (Which reminds me that both of our own book R packages, bayess and mcsm, have gone obsolete on CRAN! And that I should find time to figure out the issue for an upgrade…)

As far as I can tell [from abroad and from only teaching students with a math background], Bayes Rules! seems to be catering to early (US) undergraduate students with very little exposure to mathematical statistics or probability, as it introduces basic probability notions like pmf, joint distribution, and Bayes' theorem (as well as Greek letters!) and shies away from integration or algebra (a covariance matrix only occurs on page 437). For instance, the Normal-Normal conjugacy derivation is considered a “mouthful” (page 113). The exposition is somewhat stretched along the 500⁺ pages as a result, imho, which is presumably a feature shared with most textbooks at this level, and, accordingly, the exercises and quizzes are more about intuition and reproducing the contents of the chapter than about technique. In fact, I did not spot a mention of sufficiency, consistency, posterior concentration (almost made on page 113), improper priors, ergodicity, irreducibility, &tc., while other notions are not precisely defined, like ESS, weakly informative (page 234) or vague priors (page 77), prior information—which makes the negative answer to the quiz “All priors are informative” (page 90) rather confusing—, R-hat, density plot, scaled likelihood, and more.

As an alternative to “technical derivations”, Bayes Rules! centres on intuition and simulation (yay!) via its bayesrules R package. Itself relying on rstan. Learning from examples (as R code is always provided), the book proceeds through conjugate priors, MCMC (Metropolis-Hastings) methods, regression models, and hierarchical regression models. Quite impressive given the limited prerequisites set by the authors. (I appreciated the representations of the prior-likelihood-posterior, especially in the sequential case.)

Regarding the “hot tip” (page 108) that the posterior mean always stands between the prior mean and the data mean, this should be made conditional on a conjugate setting and a mean parameterisation. Defining MCMC as a method that produces a sequence of realisations that are not from the target makes a point, except of course that there are settings where the realisations are from the target, for instance after a renewal event. Tuning MCMC should remain a partial mystery to readers after reading Chapter 7 as the Goldilocks principle is quite vague. Similarly, the derivation of the hyperparameters in a novel setting (not covered by the book) should prove a challenge, even though the readers are encouraged to “go forth and do some Bayes things” (page 509).
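
In the conjugate Normal-Normal case, for instance, the “hot tip” becomes an identity, the posterior mean being a precision-weighted average of the prior and sample means—a quick sketch of mine, not the book's:

```r
## Posterior mean as a convex combination of prior mean and sample mean,
## for x_i ~ N(theta, sigma^2) and theta ~ N(mu0, tau0^2)
set.seed(7)
mu0 <- 0; tau0 <- 1; sigma <- 2
x <- rnorm(30, mean = 3, sd = sigma)
w <- (30/sigma^2) / (30/sigma^2 + 1/tau0^2)     # weight given to the data
c(prior = mu0, posterior = w*mean(x) + (1 - w)*mu0, data = mean(x))
```

Reparameterise (say, in θ² or 1/θ) or leave conjugacy, and the sandwiching property may fail.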

While Bayes factors are supported for some hypothesis testing (with no point null), model comparison follows more exploratory methods like X validation and expected log-predictive comparison.

The examples and exercises are diverse (if mostly US-centric), modern (including cultural references that completely escape me), and often reflect on the authors' societal concerns. In particular, their concern about a fair use of the inferred models is preeminent, even though a quantitative assessment of the degree of fairness would require a much more advanced perspective than the book allows… (In that respect, Exercise 18.2 and the following ones are about book banning (in the US). Given the progressive tone of the book, and the recent ban of math textbooks in the US, I wonder if some conservative boards would consider banning it!) Concerning the Himalaya summiting running example (Chapters 18 & 19), where the probability of summiting is conditional on the age of the climber and the use of additional oxygen, I am somewhat surprised that the altitude of the targeted peak is not included as a covariate. For instance, Ama Dablam (6848 m) is compared with Annapurna I (8091 m), which has the highest fatality-to-summit ratio (38%) of all. This should matter more than age: the Aosta guide Abele Blanc climbed Annapurna without oxygen at age 57! More to the point, the (practical) detailed examples do not bring unexpected conclusions, as for instance the fact that runners [thrice alas!] tend to slow down with age.

A geographical comment: Uluru (page 267) is not a city(!) but an impressive sandstone monolith in the heart of Australia, a 5-hour drive from Alice Springs. And historical mentions: Alan Turing (page 10) and the team at Bletchley Park indeed used Bayes factors (and sequential analysis) in cracking the Enigma, but this remained classified information for quite a while. Arianna Rosenbluth (page 10, but missing on page 165) was indeed a major contributor to Metropolis et al. (1953, not cited), but would not qualify as a Bayesian statistician, as the goal of their algorithm was a characterisation of the Boltzmann (or Gibbs) distribution, not statistical inference. And David Blackwell's (page 10) Basic Statistics is possibly the earliest instance of an introductory Bayesian and decision-theory textbook, but it never mentions Bayes or Bayesianism.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Book Review section in CHANCE.]

What are the chances of that?

Posted in Books, pictures, Statistics, University life on May 13, 2022 by xi'an

What are the chances that I review a book with this title, a few months after reviewing a book called What is luck?! This one is written by Andrew Elliott, whose Is that a big number? I reviewed a wee bit earlier… And that the cover of this book involves a particularly unlucky sequence of dice, as in my much earlier review of Krzysztof Burdzy's book? (About 10⁻⁶ times as likely as the likeliest draw!)

The (relative) specificity of this book is to try to convey the notions of chance and uncertainty to the general public, more by demonstrating through examples and simulations that our intuition is most often wrong, than by delving into psychological reasons as in Barbara Blatchley's book. The author advances five dualities that underlie our (dysfunctional) relation to chance: individual vs. collective, randomness vs. meaning, foresight vs. insight, uniformity vs. variability, and disruption vs. opportunity.

“News programmes clearly understand that the testimonies of individuals draw better audiences than the summaries of statisticians.” (p. xvii)

Some of the nice features of the book are (a) the description of a probabilistic problem at the beginning of each chapter, to be solved at the end, (b) the use of simulation experiments, represented by coloured pixels over a grey band crossing the page, including a section on pseudorandom generators [which is less confusing than the quote below may indicate!], (c) taking full advantage of the quincunx apparatus, and (d) very few apologies for getting into formulas. And even a relevant quote of Taleb's Black Swan about the ludic fallacy. On the other hand, the author spends quite a large part of the book on chance games, exhibiting a ludic tendency! And contemplates biased coins, while he should know better! The historical sections may prove too much for both informed and uninformed readers. (However, I learned that the UK Government had used a form of lottery to pay interest on premium bonds.) And the later parts are less numerical and quantified, even though the author brings in the micromort measurement [invented by Ronald Howard and] favoured by David Spiegelhalter. Who actually appears to have inspired several other sections, like the one on coincidences (which remains quite light in its investigation!). I finished the book rather quickly, browsing through what eventually felt more like a collection of anecdotes than a unified discourse. I did not find the attempt to link with the COVID pandemic, which definitely resets our clocks on risk, particularly alluring…

“People go to a lot of trouble to generate truly random numbers—sequences that are impossible to predict.” (p.66)

The apparition of the Normal distribution is somewhat overdone and almost mystical, even if the tone gets more reasonable by the end of the corresponding chapter.

“…combining random numbers from distributions that really have no business being added together (…) ends up with a statistic that actually fits the normal distribution quite well.” (p.83)
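
The quoted observation is just the central limit theorem at work—independence and finite variances being all the summands need to share—as a few lines of R (my illustration, not the book's) can confirm:

```r
## Summing draws from four unrelated distributions still looks normal
set.seed(83)
s <- replicate(1e4, sum(runif(5), rexp(5), rpois(5, 3), rbinom(5, 1, .2)))
qqnorm(s); qqline(s)    # a near-straight QQ-plot: close to normality
```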

The part about Bayes and Bayesian reasoning does not include any inference, with a rather duh! criticism of prior modelling.

“If you are tempted to apply a group statistic derived from a broad analysis to a more narrow purpose, you run the risk of making an unfair judgement.” (p.263)

The section about Xenakis’ musical creations as a Markov process was most interesting (and novel to me). I also enjoyed the shared cultural entries, esp. literary ones. Like citing the recent Chernobyl TV drama. Or Philip K. Dick’s Do Androids Dream of Electric Sheep? Or yet Monty Python’s Life of Brian. Overall, there is enough trivia and engagement to keep reading the book till its end!
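
To end on that Xenakis note, a toy Markov-chain “composition” takes only a handful of lines of R—with an invented transition matrix that owes nothing to Xenakis' actual scores:

```r
## A 32-note melody where each note only depends on the previous one
notes <- c("C", "D", "E", "G", "A")
P <- matrix(c(.1, .4, .3, .1, .1,
              .3, .1, .3, .2, .1,
              .2, .2, .1, .4, .1,
              .1, .3, .3, .1, .2,
              .4, .1, .2, .2, .1), 5, 5, byrow = TRUE,
            dimnames = list(notes, notes))    # each row sums to one
melody <- character(32); melody[1] <- "C"
for (t in 2:32)
  melody[t] <- sample(notes, 1, prob = P[melody[t - 1], ])
paste(melody, collapse = " ")
```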

learning base R [book review]

Posted in Books, Kids, Statistics, University life on February 26, 2022 by xi'an

This second edition of an introductory R book was sent to me by the author for a potential CHANCE book review. As there are many (many) books in the same spirit, the main question behind my reading it (in one go) was about the novelty it brings. The topics Learning Base R covers are

  • arithmetics with R
  • data structures
  • built-in and user-written R functions
  • R utilities
  • more data structures
  • comparison and coercion
  • lists and data frames
  • resident R datasets
  • R interface
  • probability calculations in R
  • R graphics
  • R programming
  • simulations
  • statistical inference in R
  • linear algebra
  • use of R packages

within as many short chapters. The style is rather standard, that is, short paragraphs with mostly raw reproductions of command lines and their outcome, sometimes a whole page of code examples (if with comments). All in all, I feel there are rather too few tables when compared with examples, at least for my own taste. The exercises are mostly short and, while they vary in depth, they show that the book is rather intended for students with some mathematical background (e.g., with a chapter on complex numbers and another on linear algebra that do not seem immediately relevant for most intended readers). Or even more than that, considering one (of several) exercises (19.30) on the Black-Scholes process that mentions Brownian motion. Possibly less appealing for would-be statisticians.

I also wonder at the pedagogical choice of not including and involving more clearly graphical interfaces like RStudio, as students are usually not big fans of “old-style” [their wording, not mine!] command-line languages. For instance, the chapter on packages would have benefited from this perspective. Nothing on Rmarkdown either. Apparently nothing on handling big data or more advanced database manipulation, the related realistic dangers of memory freeze and compulsory reboot, or the intricacies of managing different directories and earlier sessions; little on the urgency of avoiding loops (p.233) through vectorial programming; a paradoxical choice of introducing the if function after ifelse; and again not that much on statistics (with density only occurring in exercises).

The chapter on customising R graphics may possibly scare the intended reader when considering the all-in-one example of p.193! As we advance through the book, the more advanced examples are often fairly standard programming ones (found in other language manuals), like creating Fibonacci numbers, implementing Eratosthenes' sieve, or playing the Tower of Hanoi game… (At least they remind me of examples read in the language manuals I read as a student.) The simulation chapter could have been merged into the one (Chap. 19) on probability calculations, rather than superfluously redefining standard distributions. (Except when defining a random number as a uniformly random number (p.162).) This chapter also spends an unusual amount of space on linear congruential pseudo-random generators, while failing to point out the trivia that the randu dataset mentioned twice earlier is actually an outcome of the infamous RANDU Fortran generator. The following section in that chapter is written in such a way that it may give the wrong impression that one can find the analytic solution from repeated Monte Carlo experiments and hence deduce the error. Which is rarely the case, even in finite environments with rational expectations, as one usually does not know which unit fraction the expectation should be a multiple of. (Remember the Squid Game paradox!) And no mention is made of the prescription of always returning an error estimate along with the numerical approximation (a sketch follows below).

The statistics chapter is obviously more developed, with descriptive statistics, the ecdf, but no bootstrap; a t.test curiously applied to the Michelson measurements of the speed of light (how could it be zero?!); ANOVA; regression handled via lm and glm; and time series analysis by ARIMA models, which I hope will not be the sole exposure of readers to these concepts.
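
That prescription takes but a few lines, as in this sketch of mine (not from the book) for the toy integral ∫₀¹ exp(−u²) du:

```r
## A Monte Carlo approximation should always travel with its error estimate
mc_integrate <- function(n = 1e5) {
  h <- exp(-runif(n)^2)                 # integrand at uniform draws
  est <- mean(h); se <- sd(h)/sqrt(n)   # estimate and its standard error
  c(estimate = est, std.error = se,
    lower = est - 1.96*se, upper = est + 1.96*se)
}
mc_integrate()   # to compare with integrate(function(u) exp(-u^2), 0, 1)
```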

In conclusion, there is nothing critically wrong with this manual introducing R to newcomers, and I would not mind having my undergraduate students read it (rather than our shorter and home-made handout, polished over the years) before my first mathematical statistics lab. However, I do not find it massively innovative in its presentation or choice of concepts, even though the most advanced examples are not necessarily standard, and it may not appeal to all categories of students.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Book Review section in CHANCE.]
