## let the evidence speak [book review]

Posted in Books, Kids, Statistics with tags , , , , , , , , , , on December 17, 2018 by xi'an

This book by Alan Jessop, professor at the Durham University Business School,  aims at presenting Bayesian ideas and methods towards decision making “without formula because they are not necessary; the ability to add and multiply is all that is needed.” The trick is in using a Bayes grid, in other words a two by two table. (There are a few formulas that survived the slaughter, see e.g. on p. 91 the formula for the entropy. Contained in the chapter on information that I find definitely unclear.) When leaving the 2×2 world, things become more complicated and the construction of a prior belief as a probability density gets heroic without the availability of maths formulas. The first part of the paper is about Likelihood, albeit not the likelihood function, despite having the general rule that (p.73)

belief is proportional to base rate x likelihood

which is the book‘s version of Bayes’ (base?!) theorem. It then goes on to discuss the less structure nature of prior (or prior beliefs) against likelihood by describing Tony O’Hagan’s way of scaling experts’ beliefs in terms of a Beta distribution. And mentioning Jaynes’ maximum entropy prior without a single formula. What is hard to fathom from the text is how can one derive the likelihood outside surveys. (Using the illustration of 1963 Oswald’s murder by Ruby in the likelihood chapter does not particularly help!) A bit of nitpicking at this stage: the sentence

“The ancient Greeks, and before them the Chinese and the Aztecs…”

is historically incorrect since, while the Chinese empire dates back before the Greek dark ages, the Aztecs only rule Mexico from the 14th century (AD) until the Spaniard invasion. While most of the book sticks with unidimensional parameters, it also discusses more complex structures, for which it relies on Monte Carlo, although the description is rather cryptic (use your spreadsheet!, p.133). The book at this stage turns into a more story-telling mode, by considering for instance the Federalist papers analysis by Mosteller and Wallace. The reader can only follow the process of assessing a document authorship for a single word, as multidimensional cases (for either data or parameters) are out of reach. The same comment applies to the ecology, archeology, and psychology chapters that follow. The intermediary chapter on the “grossly misleading” [Court wording] of the statistical evidence in the Sally Clark prosecution is more accessible in that (again) it relies on a single number. Returning to the ban of Bayes rule in British courts:

In the light of the strong criticism by this court in the 1990s of using Bayes theorem before the jury in cases where there was no reliable statistical evidence, the practice of using a Bayesian approach and likelihood ratios to formulate opinions placed before a jury without that process being disclosed and debated in court is contrary to principles of open justice.

the discussion found in the book is quite moderate and inclusive, in that a Bayesian analysis helps in gathering evidence about a case, but may be misunderstood or misused at the [non-Bayesian] decision level.

In conclusion, Let the Evidence Speak is an interesting introduction to Bayesian thinking, through a simplifying device, the Bayes grid, which seems to come from management, with a large number of examples, if not necessarily all realistic and some side-stories. I doubt this exposure can produce expert practitioners, but it makes for an worthwhile awakening for someone “likely to have read this book because [one] had heard of Bayes but were uncertain what is was” (p.222). With commendable caution and warnings along the way.

## Conditional love [guest post]

Posted in Books, Kids, Statistics, University life with tags , , , , , , , , , , , , , , , , , , , , on August 4, 2015 by xi'an

[When Dan Simpson told me he was reading Terenin’s and Draper’s latest arXival in a nice Bath pub—and not a nice bath tub!—, I asked him for a blog entry and he agreed. Here is his piece, read at your own risk! If you remember to skip the part about Céline Dion, you should enjoy it very much!!!]

Probability has traditionally been described, as per Kolmogorov and his ardent follower Katy Perry, unconditionally. This is, of course, excellent for those of us who really like measure theory, as the maths is identical. Unfortunately mathematical convenience is not necessarily enough and a large part of the applied statistical community is working with Bayesian methods. These are unavoidably conditional and, as such, it is natural to ask if there is a fundamentally conditional basis for probability.

Bruno de Finetti—and later Richard Cox and Edwin Jaynes—considered conditional bases for Bayesian probability that are, unfortunately, incomplete. The critical problem is that they mainly consider finite state spaces and construct finitely additive systems of conditional probability. For a variety of reasons, neither of these restrictions hold much truck in the modern world of statistics.

In a recently arXiv’d paper, Alexander Terenin and David Draper devise a set of axioms that make the Cox-Jaynes system of conditional probability rigorous. Furthermore, they show that the complete set of Kolmogorov axioms (including countable additivity) can be derived as theorems from their axioms by conditioning on the entire sample space.

This is a deep and fundamental paper, which unfortunately means that I most probably do not grasp it’s complexities (especially as, for some reason, I keep reading it in pubs!). However I’m going to have a shot at having some thoughts on it, because I feel like it’s the sort of paper one should have thoughts on. Continue reading

## 17 equations that changed the World (#2)

Posted in Books, Statistics with tags , , , , , , , , , , , , , , , , , on October 16, 2012 by xi'an

(continuation of the book review)

If you placed your finger at that point, the two halves of the string would still be able to vibrate in the sin 2x pattern, but not in the sin x one. This explains the Pythagorean discovery that a string half as long produced a note one octave higher.” (p.143)

The following chapters are all about Physics: the wave equation, Fourier’s transform and the heat equation, Navier-Stokes’ equation(s), Maxwell’s equation(s)—as in  The universe in zero word—, the second law of thermodynamics, E=mc² (of course!), and Schrödinger’s equation. I won’t go so much into details for those chapters, even though they are remarkably written. For instance, the chapter on waves made me understand the notion of harmonics in a much more intuitive and lasting way than previous readings. (This chapter 8 also mentions the “English mathematician Harold Jeffreys“, while Jeffreys was primarily a geophysicist. And a Bayesian statistician with major impact on the field, his Theory of Probability arguably being the first modern Bayesian book. Interestingly, Jeffreys also was the first one to find approximations to the Schrödinger’s equation, however he is not mentioned in this later chapter.) Chapter 9 mentions the heat equation but is truly about Fourier’s transform which he uses as a tool and later became a universal technique. It also covers Lebesgue’s integration theory, wavelets, and JPEG compression. Chapter 10 on Navier-Stokes’ equation also mentions climate sciences, where it takes a (reasonable) stand. Chapter 11 on Maxwell’s equations is a short introduction to electromagnetism, with radio the obvious illustration. (Maybe not the best chapter in the book.) Continue reading