the first Bayesian

Posted in Statistics on February 20, 2018 by xi'an

In the first issue of Statistical Science for this year (2018), Stephen Stigler pursues the origins of Bayesianism as attributable to Richard Price, main author of Bayes’ Essay. (This incidentally relates to an earlier ‘Og piece on that notion!) Steve points out the considerable input of Price into this Essay, even though the mathematical advance is very likely to be entirely Bayes’. It may well be Price, however, who initiated Bayes’ reflections on the matter, towards producing a counter-argument to Hume’s “On Miracles”.

“Price’s caution in addressing the probabilities of hypotheses suggested by data is rare in early literature.”

A section of the paper is about Price’s approach to data-determined hypotheses and the fact that considering such hypotheses cannot easily fit within a Bayesian framework. As stated by Price, “it would be improbable as infinite to one”. Which is a nice way to address the infinite mass prior.

The Richard Price Society

Posted in Books, pictures, Statistics, Travel, University life on November 26, 2015 by xi'an

As an item of news coming to me via ISBA News, I learned of the Richard Price Society and of its endeavour to lobby the Welsh government to purchase Richard Price’s birthplace as a historical landmark. As discussed in a previous post, Price contributed so much to Bayes’ paper that one may wonder who made the major contribution. While I am not very much inclined to turn old buildings into museums, feel free to contact the Richard Price Society to support this action! Or to sign the petition there. Which I cannot resist reproducing in Welsh:

Datblygwch Fferm Tynton yn Ganolfan Ymwelwyr a Gwybodaeth

​Rydym yn galw ar Lywodraeth Cymru i gydnabod cyfraniad pwysig Dr Richard Price nid yn unig i’r Oes Oleuedig yn y ddeunawfed ganrif, ond hefyd i’r broses o greu’r byd modern yr ydym yn byw ynddo heddiw, a datblygu ei fan geni a chartref ei blentyndod yn ganolfan wybodaeth i ymwelwyr lle gall pobl o bob cenedl ac oed ddarganfod sut mae ei gyfraniadau sylweddol i ddiwinyddiaeth, mathemateg ac athroniaeth wedi dylanwadu ar y byd modern.

Clockers [book review]

Posted in Books, Travel on March 15, 2014 by xi'an

Throughout my recent trip to Canada, I read bits and pieces of Clockers by Richard Price and I finished reading it last Sunday. It is an impressive piece of literature and I am surprised I was not aware of its existence until amazon.com suggested it to me (as I was checking for recent books by another Richard, Richard Morgan!). Guessing from the summary that it could be of interest and from the comments that it was sort of a classic, I ordered it more or less on a whim (given a comfortable balance on my amazon.com account, thanks to ‘Og’s readers!). It took me a few pages to realise the plot was deeply set in the 1990s, not only because this was the height of the crack epidemic, but also because the characters therein (drug dealers and policemen) are all using beepers and street phone booths instead of cellphones.

“It’s like a math problem. Juan got whacked at point X, he drove away losing blood at the rate of a pint every ninety seconds. He was driving forty-five miles an hour and he bought the farm two miles inside the tunnel (…) So for ten points, [who] in what New Jersey town did Juan?” Clockers (p.272)

The plot of Clockers is vaguely a detective story, as an aging and depressed homicide officer, Rocco, hunts the murderer of a drug dealer, being convinced from the start that the self-declared murderer Victor did not do it. In parallel, and somewhat more closely, the book follows the miserable plight and thoughts and desires of Victor’s brother, Strike, who is head of a local crack dealing network, under the domination of the charismatic and berserk Rodney Little… But the resolution of the crime matters very little, much less than the exposure of the deadly economics of the drug traffic in inner cities (years before Freakonomics!), of the constant fight of single mothers to bring food and structure to their dysfunctional families, of the widespread recourse to moonlighting, and above all of the almost physical impossibility of escaping one’s environment (even for smart and decent kids like Victor and, paradoxically enough, the drug-dealing Strike), for lack of prospects and of exposure to anything or anywhere else, as well as because of social pressure, early pregnancies and the gang-related micro-partitioning of cities.

When I mentioned Clockers to Andrew, he told me that he also liked it very much but that the characters were not quite “real”. I somewhat agree in that, while the economics, the sociology and the practice of drug-dealing sound very accurately reproduced (for all I know!), the characters are more caricatural or picturesque than natural. The stomach disease of Strike sounds too much like an allegory of his schizophrenic split between running the drug trade and looking for a definitive exit, while the sacrifice of his brother makes little sense, except as a form of either suicide or escape from an environment he can no longer stand. What is most surprising is that Richard Price (just like Michael Crichton) is a practised screenwriter (who collaborated on Spike Lee’s 1995 Clockers). So he knows how to run an efficient story with convincing characters and plot(s). Hence my little theory of a picaresque novel… (Here is Jim Shepard’s enthusiastic review of Clockers. With the definitely accurate title of “Sympathy for the dealer”.)

Bayes 250th versus Bayes 2.5.0.

Posted in Books, Statistics, Travel, University life on July 20, 2013 by xi'an

More than a year ago Michael Sørensen (2013 EMS Chair) and Fabrizio Ruggeri (then ISBA President) kindly invited me to deliver the memorial lecture on Thomas Bayes at the 2013 European Meeting of Statisticians, which takes place in Budapest today and the following week. I gladly accepted, although with some worries at having to cover a much wider range of the field than my own research topic. And then set to work on the slides in the past week, borrowing from my most “historical” lectures on Jeffreys and Keynes, my reply to Spanos, as well as getting a little help from my nonparametric friends (yes, I do have nonparametric friends!). Here is the result, providing a partial (meaning both incomplete and biased) vision of the field.

Since my talk is on Thursday, and because the talk is sponsored by ISBA, hence representing its members, please feel free to comment and suggest changes or additions as I can still incorporate them into the slides… (Warning, I purposefully kept some slides out to preserve the most surprising entry for the talk on Thursday!)

abstract for “Bayes’ Theorem: then and now”

Posted in Books, Mountains, Statistics, Travel, University life on March 19, 2013 by xi'an

Here is my abstract for the invited talk I will give at EMS 2013 in Budapest this summer (the first two banners were sites of EMS 2013 conferences as well, which came above the European Meeting of Statisticians on a Google search for EMS 2013):

What is now called Bayes’ Theorem was published and maybe mostly written by Richard Price in 1763, 250 years ago. It was re-discovered independently (?) in 1773 by Pierre Laplace, who put it to good use for solving statistical problems, launching what was then called inverse probability and now goes under the name of Bayesian statistics. The talk will cover some historical developments of Bayesian statistics, focussing on the controversies and disputes that marked and still mark its evolution over those 250 years, up to now. It will in particular address some arguments about prior distributions made by John Maynard Keynes and Harold Jeffreys, as well as divergences about the nature of testing by Dennis Lindley, James Berger, and current science philosophers like Deborah Mayo and Aris Spanos, and misunderstandings on Bayesian computational issues, including those about approximate Bayesian computations (ABC).

I was kindly asked by the scientific committee of EMS 2013 to give a talk on Bayes’ theorem: then and now, which suited me very well for several reasons. First, I was quite interested in giving a historical overview, capitalising on earlier papers about Jeffreys’ and Keynes’ books, my current re-analysis of the Jeffreys-Lindley paradox, and exchanges around the nature of Bayesian inference. (As you may guess from the contents of the abstract, even borrowing from the article about Price in Significance!) Second, the quality of the programme definitely justifies attending the whole conference. And not only for meeting again with many friends. Lastly, I have never visited Hungary and this is a perfect opportunity for starting my summer break there!

Price’s theorem?

Posted in Statistics on March 16, 2013 by xi'an

A very interesting article by Martyn Hooper appears in the Feb. 2013 issue of Significance, which I just received. (It is available on-line for free.) It raises the question as to how much exactly Price contributed to the famous Essay… Given the percentage of the Essay that can be attributed to Price with certainty (Bayes’ part stops at page 14 out of 32 pages), given the lack of the original manuscript by Bayes, given the delay between the composition of this original manuscript (1755?), its delivery to Price (1761?) and its publication in 1763, and given the absence of any other document published by Bayes on the topic, I tend to concur with Martyn Hooper (and Sharon McGrayne) that Price contributed quite significantly to the 1763 paper. Of course, it would sound quite bizarre to start calling our approach to Statistics Pricean or Pricey (or even Priceless!) Statistics, but this may constitute one of the most striking examples of Stigler’s Law of Eponymy!

Re-reading An Essay towards solving a Problem in the Doctrine of Chances

Posted in Books, Statistics on May 2, 2011 by xi'an

“We ought to estimate the chance that the probability for the happening of an event perfectly unknown, should lie between any two named degrees of probability, antecedently to any experiment made about it.” Letter of R. Price to J. Canton, Nov. 10, 1763

On a lazy and sunny Sunday afternoon, I re-read Thomas Bayes’ 1763 Essay. (It is available in LaTeX, courtesy of Peter Lee.) The major part of the Essay is actually written by Richard Price, Bayes’ contribution being from page 376 to page 399. Most of the introduction by Price (in the form of a letter to John Canton) rephrases Bayes’ findings, but he stresses that Bayes set a “sure foundation for all our reasonings concerning past facts”. In the spirit of the time, he cannot refrain from relating the uncovering of “fixt laws according to which events happened” to the “existence of the Deity”. He also perceives Bayes’ rule as “solving the converse problem” from De Moivre’s Laws of Chances. Finally, he stresses that, although chance should relate to past events, while probability relates to future events, the distinction should not impact conditional probability.

“Given the number of times in which an unknown event has happened and failed; Required the chance that the probability of its happening in a single trial lies somewhere between any two degrees of probability that can be named.” Th. Bayes

The Essay itself consists of (a) a “brief demonstration of the general laws of chance”, (b) the derivation of Bayes’ posterior distribution for the uniform-binomial problem, and (c) the computation of the posterior probability of an arbitrary interval. The first part is a rewording of De Moivre’s Laws of Chance, in particular recalling the definition of a conditional probability. Maybe the definition of probability is worth quoting

5. The probability of any event is the ratio between the value at which an expectation depending on the happening of the event ought to be computed and the value of the thing expected upon it’s happening.

because it actually defines a probability as a by-product of the expected number of occurrences within a binomial experiment. (There is therefore nothing frequentist in this definition!) The main part (and the huge novelty) in the Essay is the derivation of the Beta posterior. Surprisingly, the setup is introduced very abruptly (in that nowhere before were those balls mentioned!):

Postulate. 1. Suppose the square table or plane ABCD to be so made and levelled, that if either of the balls o or W be thrown upon it, there shall be the same probability that it rests upon any one equal part of the plane as another, and that it must necessarily rest somewhere upon it.

and then the derivation starts with a two-page derivation that the prior (uniform) cdf is the uniform cdf. The next result is Prop. 8 [388], which gives the joint probability that the binomial probability is between f and b and that the binomial experiment yields p successes (and q failures):

the probability the point o should fall between f and b, any two points named in the line AB, and withall that the event M should happen p times and fail q in p+q trials, is the ratio of fghikmb, the part of the figure BghikmA intercepted between the perpendiculars fg, bm raised upon the line AB, to CA the square upon AB.

where the curve is $y=x^p(1-x)^q$.

The next proposition is then Bayes’ rule, still expressed in terms of surface ratio as above,

The same things supposed, I guess that the probability of the event M lies somewhere between 0 and the ratio of Ab to AB, my chance to be in the right is the ratio of Abm to AiB.

but clearly set within the Beta(p+1,q+1) distribution [in modern terms]. Bayes then inserts a scholium where he tries to justify the use of the uniform prior. However, I do not see the validity of the reasoning, since he seems to argue in favour of a uniform distribution on the marginal distribution of the binomial experiment:

I have no reason to think that, in a certain number of trials, it should rather happen any one possible number of times than another.
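
For reference, the marginal Bayes alludes to is easy to write down in modern notation (a sketch of mine, not a reproduction of the scholium). Under a uniform prior on θ, the probability of observing p successes and q failures in p+q trials is

$\int_0^1 {p+q\choose p}\theta^p(1-\theta)^q\,\text{d}\theta = {p+q\choose p}\frac{p!\,q!}{(p+q+1)!} = \frac{1}{p+q+1}$

which does not depend on p, so each of the p+q+1 possible numbers of successes is equally likely a priori. This is the property the sentence above describes; whether it suffices to justify the uniform prior on θ itself is a separate matter.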

The last part of the Essay per se is about deriving a closed form formula for the Beta integral, a feat achieved in Rule I. [399]

$P(x\le \theta\le X|p,q) = (p+q+1){p+q\choose p}\left\{\frac{X^{p+1}}{p+1}-q\frac{X^{p+2}}{p+2}+\cdots\right.$

$\left.\cdots-\frac{x^{p+1}}{p+1}+q\frac{x^{p+2}}{p+2}-\cdots\right\}$

in slightly more modern notation. The 18 remaining pages are written by Richard Price, who first reproduces Bayes’ approximations to the above integral with improvements of his own, then illustrates the performance of such approximations in specific cases, with the astounding fact that the probability covered by the approximation is centred at the MLE:

$P(|\theta-p/(p+q)|\le z)$

and not at the Bayes posterior mean. This could be extrapolated as one of the earliest confidence sets, except of course that the probability is over the parameter space. I note that Price also derives [409-410], as a consequence of Bayes’ calculations, what is now known as Laplace’s succession rule…! Besides the derivation of the posterior distribution itself, which must have been a considerable feat for the time, the attention to computational issues is highly commendable, as it would become a constant theme of Bayesian studies for centuries!!!
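
To make the computational side concrete, here is a minimal Python sketch of the finite series behind Rule I, of the MLE-centred interval probability, and of the succession rule. It assumes the uniform prior, hence a Beta(p+1,q+1) posterior with normalising constant (p+q+1) times the binomial coefficient. The function names are mine and purely illustrative; this is neither Bayes’ nor Price’s own arithmetic.

```python
from math import comb


def beta_interval(p, q, lo, hi):
    """P(lo <= theta <= hi | p successes, q failures) under a uniform prior.

    Expands (1 - t)**q with the binomial theorem and integrates t**p * (1 - t)**q
    term by term, which gives a finite alternating series of q + 1 terms.
    """
    def series(t):
        return sum(
            (-1) ** k * comb(q, k) * t ** (p + k + 1) / (p + k + 1)
            for k in range(q + 1)
        )

    # Normalising constant of the Beta(p+1, q+1) density.
    const = (p + q + 1) * comb(p + q, p)
    return const * (series(hi) - series(lo))


def mle_centred_interval(p, q, z):
    """Posterior probability that |theta - p/(p+q)| <= z, clipped to [0, 1]."""
    mle = p / (p + q)
    return beta_interval(p, q, max(0.0, mle - z), min(1.0, mle + z))


def succession_rule(p, q):
    """Posterior mean of theta, i.e. the probability of one more success."""
    return (p + 1) / (p + q + 2)
```

For instance, `mle_centred_interval(7, 3, 0.1)` returns the posterior probability that θ lies within 0.1 of 7/10. Being alternating, the series loses accuracy in floating point when q is large, which is presumably one reason approximations were needed in the first place.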