Théorie analytique des probabilités
The Brazilian society for Bayesian Analysis (ISBrA, whose annual meeting is taking place at this very time!) asked me to write a review on Pierre Simon Laplace’s book, Théorie Analytique des Probabilités, a book that was initially published in 1812, exactly two centuries ago. I promptly accepted this request as (a) I had never looked at this book and so this provided me with a perfect opportunity to do so, (b) while in Vancouver, Julien Cornebise had bought for me a 1967 reproduction of the 1812 edition, (c) I was curious to see how much of the book had permeated modern probability and statistics or, conversely, how much of Laplace’s perspective was still understandable by modern day standards. (Note that the link on the book leads to a free version of the 1814, not 1812, edition of the book, as free as the kindle version on amazon.)
“Je m’attache surtout, à déterminer la probabilité des causes et des résultats indiqués par événemens considérés en grand nombre.” P.S. Laplace, Théorie Analytique des Probabilités, page 3
First, I must acknowledge I found the book rather difficult to read, and this for several reasons: (a) as is the case for books from older times, the ratio text-to-formulae is very high, with an inconvenient typography and page layout (at least by current standards), so speed-reading is impossible; (b) the themes offered in succession are often introduced abruptly and unconnected with the previous ones; (c) the mathematical notations are 18th-century, so sums are indicated by S, exponentials by c, and so on, which again slows down reading and understanding; (d) for all of the above reasons, I often missed the big picture and got mired in technical details until they made sense or I gave up; (e) I never quite understood whether Laplace was interested in analytic tools like generating functions only to provide precise numerical approximations or for their own sake. Hence a form of disappointment by the end of the book, most likely due to my insufficient investment in the project (on which I mostly spent an Amsterdam/Calgary flight and jet-lagged nights at BIRS…), even though I got excited by finding the bits and pieces about Bayesian estimation and testing.
“Sa théorie est une des choses les plus curieuses et les plus utiles que l’on ait trouvées sur les suites.” P.S. Laplace, Théorie Analytique des Probabilités, page 8
The Livre Premier of Théorie Analytique des Probabilités is about generating functions (Calcul des Fonctions génératrices). As such, it is not directly of interest, focusing on finite difference equations. (There is an interesting connection with de Moivre, incidentally, since he used generating functions to derive binomial formulas. He is acknowledged in Laplace’s preface by the above quote.)
“La théorie des probabilités consiste à réduire tous les événemens qui peuvent avoir lieu dans une circonstance donnée à un certain nombre de cas également possibles.” P.S. Laplace, Théorie Analytique des Probabilités, page 178
The Livre Second is about probability theory, first about urn-type problems, then about asymptotic approximations. The introduction to this second part reflects the famous determinism of Laplace, for whom randomness is simply l’expression de notre ignorance (the expression of our ignorance, page 177)… The initial pages contain the basics of probability, like the chain rule, the product rule, conditional probability and… Bayes’ rule, even though it is not called such. I did not find any mention of Thomas Bayes in the book. However, when looking at the on-line version, I realised to my utmost dismay that the 1814 edition had changed quite significantly, with a historical introduction to the theory of probability, incl. the mention of Bayes. (Thus, the changes from one edition to the next were not just restricted to the removal of the dedication to Napoléon-le-Grand [no longer appropriate after Waterloo and the restoration of the monarchy!] and to the change from Chancelier du Sénat [under Napoléon] to Pair du Royaume [under Louis XVIII], reflecting the quicksilver turncoat politics of Laplace!) An interesting syntactic point is the paragraph where Laplace introduces the notion of expectation (in the sense of Great Expectations), along with fears, and as in the Essai philosophique, distinguishes between mathematical expectation and moral expectation. (He later acknowledges Bernoulli’s priority, see below.)
“Nous traiterons d’abord les questions dans lesquelles les probabilités des événemens simples, sont données; nous considérerons ensuite celles dans lesquelles ces probabilités sont inconnues, et doivent être déterminées par les événemens observés.” P.S. Laplace, Théorie Analytique des Probabilités, page 188
The above quote is the introduction to Chapter II which essentially consists in a sequence of combinatorial problems, solved by polynomial decompositions and approximated by the finite difference formulae of the first Livre. (Despite this enticing quote, the chapter does not cover the statistical part.) While the accumulation of lottery and urn problems is not exactly fascinating, some entries highlight Laplace’s analytical skills. For instance, a convoluted urn problem leads to an equally convoluted integral (page 222)
where Laplace uses a Laplace approximation to replace (0) with
for n and rn large. The cdf is used in a convoluted (if labeled as “très-simple”, page 264!) derivation of an expectation of several variables. The chapter concludes with reflections on an optimal voting system that relates to Condorcet’s (although no mention is made of him in the book).
“On peut encore, par l’analyse des probabilités, vérifier l’existence ou l’influence de certaines causes dont on a cru remarquer l’action sur les êtres organisés.” P.S. Laplace, Théorie Analytique des Probabilités, page 358
Chapter III moves to asymptotic approximations and to the law of large numbers for frequencies, “cet important théorème” (page 275). The beginning of this chapter shows that the variation of the empirical frequency around the corresponding probability is of order 1/√n, with a normal approximation to the coverage of the confidence interval.
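As a modern illustration of this 1/√n rate (a simulation sketch, not Laplace's analytical derivation), the following Python snippet checks that √n times the root mean square deviation of the frequency stays near √(p(1-p)), and that the normal interval p ± 1.96 √(p(1-p)/n) covers the frequency about 95% of the time. The probability p = 0.3, the sample sizes, the seed and the function names are all arbitrary choices made for this example.

```python
import math
import random

def empirical_frequency(n, p, rng):
    """Frequency of successes in n independent trials of probability p."""
    return sum(rng.random() < p for _ in range(n)) / n

def rms_deviation(n, p, reps, rng):
    """Root mean square of (frequency - p) over reps replications."""
    squares = [(empirical_frequency(n, p, rng) - p) ** 2 for _ in range(reps)]
    return math.sqrt(sum(squares) / reps)

def coverage(n, p, reps, rng, z=1.96):
    """Share of replications where the frequency falls within the normal interval."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    hits = sum(
        abs(empirical_frequency(n, p, rng) - p) <= half_width for _ in range(reps)
    )
    return hits / reps

rng = random.Random(1812)  # arbitrary seed, for reproducibility only
p = 0.3                    # hypothetical success probability
for n in (100, 400):
    # sqrt(n) * deviation should be roughly constant if the rate is 1/sqrt(n)
    print(n, round(math.sqrt(n) * rms_deviation(n, p, 1000, rng), 3))
print("target", round(math.sqrt(p * (1 - p)), 3))
print("coverage", coverage(400, p, 2000, rng))
```

The coverage is only approximately 95% because the binomial distribution is discrete; the approximation improves as n grows.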
“On peut reconnaître l’effet très-petit d’une cause constante, par une longue suite d’observations dont les erreurs peuvent excéder cette effet lui-même.” P.S. Laplace, Théorie Analytique des Probabilités, page 352
Chapter IV extends the above law of large numbers to a sum of iid variables. Laplace then remarks that the most likely error is zero (which simply means that the mode of the standard normal distribution is indeed zero). This chapter also contains a derivation of the average as being the estimator that minimises the average error and as being the least squares estimator (page 321). (A “true” discovery: Laplace was programming in C as he started his sums at i=0 rather than at i=1, see page 313!) I think Laplace uses a Fourier transform to derive the distribution of a weighted sum (page 314). He then proceeds to generalise this optimality result to a bivariate quantity, obtaining again the least squares estimate and computing a bivariate Gaussian density on the way. And then… comes the major step, namely Laplace’s derivation of a posterior distribution (page 334):
(with my notations), thus using a flat prior on the location parameter! This fundamental step is compounded by the introduction of a (not yet) Bayes estimator minimising posterior absolute error loss and found to be the median of the posterior distribution. In the next pages, Laplace attempts to find the MAP (which is also the maximum likelihood estimator in this case), as an approximation to the posterior median (page 336). From there, he moves to identify the distribution for which the MAP is also the (arithmetic) average, ending up with the normal distribution (page 338). (This result was to be extended by J.M. Keynes to different types of estimators.) The chapter concludes with a defense of the arithmetic mean as a limiting Bayes estimator that does not depend on the law of the errors.
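To make the median result concrete, here is a small numerical sketch (a modern grid computation, not Laplace's argument). Under a flat prior on a location parameter μ, the posterior is proportional to the product of the error densities f(x_i − μ); the code evaluates it on a grid and checks that the decision minimising the posterior expected absolute error is the posterior median. The double-exponential error density, the five observations, the grid and the function names are all assumptions made for this illustration.

```python
import math

def posterior_on_grid(data, log_density, grid):
    """Flat-prior posterior of a location parameter, normalised over the grid."""
    log_post = [sum(log_density(x - mu) for x in data) for mu in grid]
    top = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def posterior_median(grid, probs):
    """Smallest grid point whose cumulative posterior mass reaches 1/2."""
    acc = 0.0
    for mu, prob in zip(grid, probs):
        acc += prob
        if acc >= 0.5:
            return mu
    return grid[-1]

def expected_abs_loss(decision, grid, probs):
    """Posterior expectation of |mu - decision|."""
    return sum(prob * abs(mu - decision) for mu, prob in zip(grid, probs))

def double_exponential_log_density(error):
    """Log of f(e) = exp(-|e|) / 2, an assumed error law for this example."""
    return -abs(error) - math.log(2.0)

data = [1.2, 0.4, 2.9, 1.7, 0.8]              # hypothetical observations
grid = [-2.0 + 0.01 * i for i in range(701)]  # mu from -2 to 5
probs = posterior_on_grid(data, double_exponential_log_density, grid)

median = posterior_median(grid, probs)
best = min(grid, key=lambda d: expected_abs_loss(d, grid, probs))
map_estimate = grid[max(range(len(grid)), key=probs.__getitem__)]
print("posterior median:", median)
print("minimiser of expected absolute loss:", best)
print("MAP (here the sample median):", map_estimate)
```

With this error law the MAP coincides with the sample median of the observations, which need not equal the posterior median; the two agree only approximately, in line with the MAP being used as an approximation.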
“Pour déterminer avec quelle probabilité cette cause est indiquée, concevons que cette cause n’existe point.” P.S. Laplace, Théorie Analytique des Probabilités, page 350
Chapter V starts with the computation of a p-value, nothing less! Laplace analyses the likelihood (vraisemblance) of a non-zero effect by looking at the cdf of the observation under the null (page 361). The following pages discuss Laplace’s analysis of the irregularities in celestial trajectories, like the perturbations between Saturn and Jupiter. Laplace argues in a philosophical if un-Popperian way about the importance of probabilistic analysis (read statistics) for uncovering scientific facts (page 358).
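In modern terms, this amounts to evaluating the normal cdf at the standardised observed effect, assuming the cause does not exist. The sketch below is a plain one-sided z-test written with today's conventions, not a transcription of Laplace's computation; the observations and the function names are invented for the example.

```python
import math

def normal_cdf(z):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sided_p_value(observations, null_mean=0.0):
    """Probability, under the null of no effect, of a mean at least this large.

    Uses the normal approximation to the distribution of the sample mean.
    """
    n = len(observations)
    mean = sum(observations) / n
    variance = sum((x - mean) ** 2 for x in observations) / (n - 1)
    z = (mean - null_mean) / math.sqrt(variance / n)
    return 1.0 - normal_cdf(z)

# Hypothetical measurements of a small effect buried in larger errors
observations = [0.8, -0.3, 1.1, 0.4, 0.9, -0.1, 0.7, 0.5]
print(one_sided_p_value(observations))
```

A small value means that a mean this far from zero would be unlikely if the cause were absent, which is the reading suggested by the quote above.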
“Laplace actually used the theory of probabilities as a method of discovery.” A. de Morgan, Dublin Review, 1837
In Chapter VI, “De la probabilité des causes et des événemens futurs, tirés des événemens observés”, Laplace expands his Bayesian (or Laplacian) perspective for drawing inference about unknown probabilities. He uses a uniform prior (with an interesting argument transferring the prior into the likelihood so as to always consider this case, page 364). He then derives a normal approximation to the posterior (first term of the Laplace approximation!, page 367). This chapter also contains the famous study on the proportion p of female births in Paris, using an approximation to the beta integral to show that the (posterior) probability that p is larger than 1/2 is negligible (“d’une petitesse excessive”, page 380). Laplace also computes the posterior probability that the probability of a male birth in London is larger than in Paris, which he finds equal to 1-1/328269 (using a continued fraction!). He then moves to the applications of these techniques to mortality tables and insurance, exhibiting there a thematic connection with Abraham de Moivre (and maybe Bayes!). The chapter concludes with a computation of the posterior (or predictive!) probability that 1-p will remain larger than 1/2 in the next century, obtaining a value of 0.782.
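The type of computation involved can be sketched as follows. With a uniform prior and x female births out of n, the posterior on the proportion of female births is a Beta(x+1, n-x+1) distribution, and the probability that this proportion exceeds 1/2 is a beta tail integral, which a normal approximation centred at x/n reproduces closely. The counts below are made up for illustration (they are not Laplace's Paris data), and the midpoint-rule integration and function names are choices made for this example.

```python
import math

def log_beta_density(p, a, b):
    """Log density of a Beta(a, b) distribution at p in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p)

def posterior_prob_above_half(successes, trials, steps=20_000):
    """P(p > 1/2 | data) under a uniform prior, by the midpoint rule on (1/2, 1)."""
    a, b = successes + 1, trials - successes + 1
    width = 0.5 / steps
    total = 0.0
    for i in range(steps):
        p = 0.5 + (i + 0.5) * width
        total += math.exp(log_beta_density(p, a, b)) * width
    return total

def normal_approx_prob_above_half(successes, trials):
    """Same probability from a normal approximation centred at the frequency."""
    p_hat = successes / trials
    sd = math.sqrt(p_hat * (1 - p_hat) / trials)
    z = (0.5 - p_hat) / sd
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))  # 1 - Phi(z)

girls, births = 480, 1000  # hypothetical counts, not Laplace's data
exact = posterior_prob_above_half(girls, births)
approx = normal_approx_prob_above_half(girls, births)
print(exact, approx)
```

With samples as large as those Laplace handled, the tail probability becomes so small that working on the log scale, or with an asymptotic expansion as he did, is the only practical route.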
“The theory of probabilities draws a remarkable distinction between observations which have been made, and those which are to be made.” A. de Morgan, Dublin Review, 1837
Chapter VII is a short chapter on biased coins and compounded experiments, not directly related to Bayesian perspectives. Chapter VIII is similarly short, reproducing earlier normal approximations on averages of life durations. It also contains an interesting study on the effect of removing smallpox (by vaccination) on the overall death rate. Chapter IX deals with expectations of simple functions for binomial experiments and with their normal approximation, again exhibiting the above link with de Moivre on life insurance.
Chapter X returns to the notion of moral expectation mentioned both earlier and in the Essai Philosophique. The core (to solving the Saint Petersburg paradox) is to use log(x) instead of x as a utility function, following Bernoulli’s derivation (mentioned on page 439).
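A short numerical check shows why the logarithm resolves the paradox. The sketch assumes the common formulation where the payoff is 2^k when the first head appears at toss k (probability 2^-k), and it ignores the player's initial wealth, which Bernoulli's derivation takes into account; both are simplifications made for this example, as are the function names.

```python
import math

def truncated_expected_gain(max_tosses):
    """Expected payoff when the game is stopped after max_tosses tosses."""
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_tosses + 1))

def truncated_expected_log_gain(max_tosses):
    """Expected log payoff (moral expectation) under the same truncation."""
    return sum((0.5 ** k) * math.log(2 ** k) for k in range(1, max_tosses + 1))

for max_tosses in (10, 20, 40):
    # The first column grows without bound, the second settles near 2 log 2
    print(max_tosses,
          truncated_expected_gain(max_tosses),
          truncated_expected_log_gain(max_tosses))

# Sure amount with the same log utility as the game
print(math.exp(truncated_expected_log_gain(60)))
```

Each additional toss adds exactly one unit to the mathematical expectation, which therefore diverges, while the moral expectation converges to 2 log 2, equivalent to a sure gain of 4.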
“In reviewing the general design of the work of Laplace, we desire to make the description of a book mark the present state of a science.” A. de Morgan, Dublin Review, 1837
In conclusion, the Théorie Analytique des Probabilités provides a fascinating historical perspective on Laplace’s genius in framing probability and statistics within mathematical analysis and in deriving numerical approximations to intractable integrals. As put by Augustus de Morgan in a praising if often critical and sometimes hilarious review of the book, “Théorie des Probabilités is the Mont Blanc of mathematical analysis”. (de Morgan considers that the French national school of mathematics as a whole neglects to credit predecessors. It is quite true that it is impossible to gather which results are original and which are not in Théorie Analytique des Probabilités. de Morgan also thinks that the first part [Livre premier] on generating functions is mostly useless for the second part. And that the introduction [in the 1814 edition] is the Essai Philosophique, which I do not believe [even though there are repetitions]. Interestingly, de Morgan also spends quite some time on the notion of moral expectation.)
As opposed to Bayes’ short essay, Laplace’s book leads to a global vision of the role and practice of probability theory, as it was then understood at the beginning of the 19th Century, and it can be argued that Théorie Analytique des Probabilités shaped the field (or fields) for close to a hundred years.
March 10, 2017 at 6:13 pm
[…] draw a connection with Harold Jeffreys’ distinction between testing and estimation, based upon Laplace’s succession rule. Unbearably slow succession law. Which is well-taken if somewhat specious since this is a testing […]
March 29, 2012 at 7:11 am
I made the comments into an arXiv document. And sent it to the ISBrA Bulletin.
March 26, 2012 at 9:46 pm
Laplace’s Essay is the popular version, available in English. It seems to me that much neo-classical economics and financial mathematics could have done with some of his insights, prior to the financial crash of 2008. If the state of the art has advanced since his time, the popular understanding seems not to have.
I blog on the Essay at http://djmarsay.wordpress.com/bibliography/rationality-and-uncertainty/probability/laplaces-essay-on-probabilities/ .
March 26, 2012 at 2:36 pm
Like you, I was impressed by the second page (364) of Chapter VI in the « Théorie Analytique » on how Laplace handled the issue of a non-uniform prior distribution by considering the prior as an additional set of data (z) independent of the observed one (x). It leads to easy and useful interpretations for conjugate priors and may be viewed to some extent as the basic idea behind partial (intrinsic and fractional) Bayes Factors.
March 26, 2012 at 3:00 pm
Thank you Jean-Louis. You explained the issue much more clearly than I did.