Re-reading An Essay towards solving a Problem in the Doctrine of Chances

“We ought to estimate the chance that the probability for the happening of an event perfectly unknown, should lie between any two named degrees of probability, antecedently to any experiment made about it.” Letter of R. Price to J. Canton, Nov. 10, 1763

On a lazy and sunny Sunday afternoon, I re-read Thomas Bayes’ 1763 Essay. (It is available in LaTeX, courtesy of Peter Lee.) The major part of the Essay is actually written by Richard Price, Bayes’ contribution being from page 376 to page 399. Most of the introduction by Price (in the form of a letter to John Canton) rephrases Bayes’ findings, but he stresses that Bayes set a “sure foundation for all our reasonings concerning past facts”. In the spirit of the time, he cannot refrain from relating the uncovering of “fixt laws according to which events happened” to the “existence of the Deity”. He also perceives Bayes’ rule as “solving the converse problem” from De Moivre’s Laws of Chance. Finally, he stresses that, although chance should relate to past events and probability to future events, the distinction should not impact conditional probability.

“Given the number of times in which an unknown event has happened and failed; Required the chance that the probability of its happening in a single trial lies somewhere between any two degrees of probability that can be named.” Th. Bayes

The Essay itself consists of (a) a “brief demonstration of the general laws of chance”, (b) the derivation of Bayes’ posterior distribution for the uniform-binomial problem, and (c) the computation of the posterior probability of an arbitrary interval. The first part is a rewording of De Moivre’s Laws of Chance, in particular recalling the definition of a conditional probability. The definition of probability is perhaps worth quoting

5. The probability of any event is the ratio between the value at which an expectation depending on the happening of the event ought to be computed and the value of the thing expected upon it’s happening.

because it actually defines a probability as a by-product of the expected number of occurrences within a binomial experiment. (There is therefore nothing frequentist in this definition!) The main part (and the huge novelty) in the Essay is the derivation of the Beta posterior. Surprisingly, the setup is introduced very abruptly (in that nowhere before were those balls mentioned!):

Postulate. 1. Suppose the square table or plane ABCD to be so made and levelled, that if either of the balls o or W be thrown upon it, there shall be the same probability that it rests upon any one equal part of the plane as another, and that it must necessarily rest somewhere upon it.

and then the derivation starts with a two-page proof that the cdf of the (uniform) prior is the uniform cdf. The next result is Prop. 8, which gives the joint probability that the binomial probability is between f and b and that the binomial experiment gives p successes (x=p):

the probability the point o should fall between f and b, any two points named in the line AB, and withall that the event M should happen p times and fail q in p+q trials, is the ratio of fghikmb, the part of the figure BghikmA intercepted between the perpendiculars fg, bm raised upon the line AB, to CA the square upon AB.

where the curve is $y=x^p(1-x)^q$. The next proposition is then Bayes’ rule, still expressed in terms of surface ratio as above,

The same things supposed, I guess that the probability of the event M lies somewhere between 0 and the ratio of Ab to AB, my chance to be in the right is the ratio of Abm to AiB.

but clearly set within the Beta(p+1,q+1) distribution [in modern terms]. Bayes then inserts a scholium where he tries to justify the use of the uniform prior. However, I do not see the validity of the reasoning, since he seems to argue in favour of a uniform marginal distribution for the binomial experiment:

I have no reason to think that, in a certain number of trials, it should rather happen any one possible number of times than another.
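For readers who want to check the two propositions and the scholium numerically, here is a minimal sketch in Python, using exact rational arithmetic from the standard library. The function names are mine, not Bayes’, and the term-by-term integration of $x^p(1-x)^q$ is just one convenient way to compute the areas. It only illustrates that a uniform prior implies that every number of successes is equally likely a priori; it says nothing about whether the converse argument holds.

```python
from fractions import Fraction
from math import comb

def area_under_curve(p, q, upper):
    """Exact integral of x^p (1-x)^q over [0, upper], by binomial expansion."""
    upper = Fraction(upper)
    return sum(
        Fraction((-1) ** k * comb(q, k), p + k + 1) * upper ** (p + k + 1)
        for k in range(q + 1)
    )

def joint_probability(p, q, f, b):
    """Prop. 8: P(f < theta < b, p successes and q failures), uniform prior."""
    return comb(p + q, p) * (area_under_curve(p, q, b) - area_under_curve(p, q, f))

def posterior_probability(p, q, f, b):
    """Prop. 9: area between f and b divided by the whole area."""
    return joint_probability(p, q, f, b) / joint_probability(p, q, 0, 1)

# Marginal probability of each possible count in n = 5 trials.
n = 5
marginals = [joint_probability(p, n - p, 0, 1) for p in range(n + 1)]
print(marginals)  # six copies of Fraction(1, 6)

# Posterior probability that theta < 1/2 after 3 successes and 2 failures.
print(posterior_probability(3, 2, 0, Fraction(1, 2)))  # 11/32
```

The second printed value is the Beta(4,3) cdf at 1/2, that is, the Beta(p+1,q+1) posterior with p=3 and q=2.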

The last part of the Essay per se is about deriving a closed-form formula for the Beta integral, a feat achieved in Rule I. $P(x\le \theta\le X|p,q) = (p+q+1){p+q\choose p}\left\{\frac{X^{p+1}}{p+1}-q\frac{X^{p+2}}{p+2}+\cdots-\frac{x^{p+1}}{p+1}+q\frac{x^{p+2}}{p+2}-\cdots\right\}$

in slightly more modern notation. The 18 remaining pages are written by Richard Price, who first reproduces Bayes’ approximations to the above integral with improvements of his own, then illustrates the performance of such approximations in specific cases, with the astounding fact that the probability covered by the approximation is centred at the MLE: $P(|\theta-p/(p+q)|\le z)$

and not at the Bayes posterior mean. This could be extrapolated as one of the earliest confidence sets, except of course that the probability is over the parameter space. I note that Price also derives [pp. 409-410] as a consequence of Bayes’ calculations what is now known as Laplace’s succession rule…! Besides the derivation of the posterior distribution itself, which must have been a considerable feat for the time, the attention to computational issues is highly commendable, as it would become a constant theme of Bayesian studies for centuries!!!
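The computations discussed in this last part can be reproduced exactly in a few lines. The following is a minimal sketch in Python, with hypothetical function names of my own choosing. It assumes the finite alternating series obtained by expanding $(1-x)^q$ and the normalising constant $(p+q+1){p+q\choose p}$, which is the reciprocal of the complete Beta integral. It evaluates the series exactly, and so does not reproduce the approximations of Bayes and Price themselves.

```python
from fractions import Fraction
from math import comb

def series(p, q, t):
    """Finite series t^(p+1)/(p+1) - q t^(p+2)/(p+2) + ... (q+1 terms)."""
    t = Fraction(t)
    return sum(
        Fraction((-1) ** k * comb(q, k), p + k + 1) * t ** (p + k + 1)
        for k in range(q + 1)
    )

def rule_one(p, q, lower, upper):
    """P(lower <= theta <= upper | p successes, q failures), uniform prior."""
    constant = (p + q + 1) * comb(p + q, p)
    return constant * (series(p, q, upper) - series(p, q, lower))

def mle_interval(p, q, z):
    """Posterior probability of the interval of half-width z around p/(p+q)."""
    centre = Fraction(p, p + q)
    z = Fraction(z)
    # Clip the interval to [0, 1], since theta is a probability.
    return rule_one(p, q, max(centre - z, 0), min(centre + z, 1))

def next_success(p, q):
    """Posterior mean of theta, i.e. the chance of one further success."""
    constant = (p + q + 1) * comb(p + q, p)
    return constant * series(p + 1, q, 1)

print(rule_one(3, 2, 0, Fraction(1, 2)))  # 11/32
print(next_success(3, 2))                 # 4/7, i.e. (p+1)/(p+q+2)
print(mle_interval(3, 2, Fraction(1, 10)))
```

The value returned by `next_success` agrees with the succession rule in its modern form, $(p+1)/(p+q+2)$, and differs from the MLE $p/(p+q)$ at which the intervals are centred.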

