Archive for principles of uncertainty

principles of uncertainty (second edition)

Posted in Books, Statistics, Travel, University life on July 21, 2020 by xi'an

A new edition of Principles of Uncertainty is about to appear. I was asked by CRC Press to review the new book and here are some (raw) extracts from my review. (Some comments may not apply to the final and published version, mind.)

In Chapter 6, the proof of the Central Limit Theorem utilises the “smudge” technique, which is to add an independent noise to both the sequence of rvs and its limit. This is most effective and reminds me of quite a similar proof Jacques Neveu used in his probability notes at Polytechnique. Which went under the more formal denomination of convolution, with the same (commendable) purpose of avoiding Fourier transforms. If anything, I would have favoured a slightly more condensed presentation in less than 8 pages. Is Corollary 6.5.8 useful or even correct??? I do not think so because the non-centred average rescaled by √n diverges almost surely. For the same reason, I object to the very first sentence of Section 6.5 (p.246).
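The divergence claim can be spelled out in one line. As a sketch, assume i.i.d. variables $X_1, X_2, \ldots$ with mean $\mu \neq 0$ and finite variance $\sigma^2$, and write $\bar{X}_n$ for their average (this setting and notation are assumptions made here for illustration, since the corollary itself is not reproduced):

```latex
\sqrt{n}\,\bar{X}_n
  = \underbrace{\sqrt{n}\,(\bar{X}_n - \mu)}_{\text{converges in distribution to } N(0,\sigma^2)}
  + \underbrace{\sqrt{n}\,\mu}_{\to\,\pm\infty},
\qquad
\bigl|\sqrt{n}\,\bar{X}_n\bigr| = \sqrt{n}\,\bigl|\bar{X}_n\bigr| \longrightarrow \infty
\quad\text{a.s., since } \bar{X}_n \to \mu \neq 0 \text{ a.s. (strong law of large numbers).}
```

Only the centred average, rescaled by √n, can therefore have a non-degenerate limit in this setting.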

In Chapter 7, I found a nice mention of (Herman) Rubin’s insistence on not separating probability and utility as only the product matters. And another fascinating quote from Keynes, not from his early statistician’s years, but in 1937 as an established economist:

“The sense in which I am using the term uncertain is that in which the prospect of a European war is uncertain, or the price of copper and the rate of interest twenty years hence, or the obsolescence of a new invention, or the position of private wealth-owners in the social system in 1970. About these matters there is no scientific basis on which to form any calculable probability whatever. We simply do not know. Nevertheless, the necessity for action and for decision compels us as practical men to do our best to overlook this awkward fact and to behave exactly as we should if we had behind us a good Benthamite calculation of a series of prospective advantages and disadvantages, each multiplied by its appropriate probability, waiting to the summed.”

(is the last sentence correct? I would have expected, pardon my French!, “to be summed”). Further interesting trivia on the criticisms of utility theory, including de Finetti’s role and his own lack of connection with subjective probability principles.

In Chapter 8, a major remark (iii) found on p.293, about the fact that a conjugate family requires a dominating measure (although this is expressed differently since the book shies away from introducing measure theory), reminds me of a conversation I had with Jay when I visited Carnegie Mellon in 2013 (?). Which exposes the futility of seeing conjugate priors as default priors. It is somewhat surprising that a notion like admissibility appears as a side quantity when discussing Stein’s paradox in 8.2.1 [and then later in Section 9.1.3] while it seems to me to be central to Bayesian decision theory, much more than the epiphenomenon that Stein’s paradox represents in the big picture. But the book dismisses minimaxity even faster in Section 9.1.4:

As many who suffer from paranoia have discovered, one can always dream-up an even worse possibility to guard against. Thus, the minimax framework is unstable. (p.336)

Interesting introduction of the Wishart distribution to kindly handle random matrices and matrix Jacobians, with the original space being the p(p+1)/2 real space (implicitly endowed with the Lebesgue measure). Rather than a more structured matricial space. A font error makes Corollary 8.7.2 abort abruptly. The space of positive definite matrices is mentioned in Section 8.7.5 but still (implicitly) corresponds to the common p(p+1)/2 real Euclidean space. Another typo in Theorem 8.9.2 with a Frenchised version of Dirichlet, Dirichelet. Followed by a Dirchlet at the end of the proof (p.322). Again and again on p.324 and on following pages. I would object to the singular in the title of Section 8.10 as there are exponential families rather than a single one. With no mention made of the Pitman-Koopman lemma and its consequences, namely that the existence of conjugacy remains an epiphenomenon. Hence making the number of pages dedicated to gamma, Dirichlet and Wishart distributions somewhat excessive.

In Chapter 9, I noticed (p.334) a Scheffe that should be Scheffé (and again at least on p.444). (I love it that Jay also uses my favorite admissible (non-)estimator, namely the constant value estimator with value 3.) I wonder at the worth of a ten-line section like 9.3, when there are delicate issues in handling meta-analysis, even in a Bayesian mood (or mode). In the model uncertainty section, Jay discusses the (im)pertinence of both selecting one of the models and setting independent priors on their respective parameters, with which I disagree on both levels. Although this is followed by a more reasonable (!) perspective on utility. Nice to see a section on causation, although I would have welcomed an insert on the recent and somewhat outrageous stand of Pearl (and MacKenzie) on statisticians missing the point on causation and counterfactuals by miles. Nonparametric Bayes is a new section, inspired by Ghahramani (2005). But while it mentions Gaussian and Dirichlet [invariably misspelled!] processes, I fear it falls short of enticing the reader to truly grasp the meaning of a prior on functions. Besides mentioning it exists, I am unsure of the utility of this section. This is one of the rare instances where measure theory is discussed, only to state this is beyond the scope of the book (p.349).

Pragmatics of Uncertainty [book review]

Posted in Books, Statistics, University life on December 22, 2017 by xi'an

On my way to the O’Bayes 2017 conference in Austin, I [paradoxically!] went through Jay Kadane’s Pragmatics of Uncertainty, which had been published earlier this year by CRC Press. The book is to be seen as a practical illustration of the Principles of Uncertainty Jay wrote in 2011 (and I reviewed for CHANCE). The avowed purpose is to allow the reader to check through Jay’s applied work whether or not he had “made good” on setting out clearly the motivations for his subjective Bayesian modelling. (While I presume the use of the same P of U in both books is mostly a coincidence, I started wondering what a third P of U volume could be called. Perils of Uncertainty? Peddlers of Uncertainty? The game is afoot!)

The structure of the book is a collection of fifteen case studies undertaken by Jay over the past 30 years, covering paleontology, survey sampling, legal expertise, physics, climate, and even medieval Norwegian history. Each chapter starts with a short introduction that often explains how he came by the problem (most often as an interesting PhD student consulting project at CMU), what the difficulties in the analysis were, and what became of his co-authors. As noted by the author, the main bulk of each chapter is the reprint (in a unified style) of the paper and most of these papers are actually and freely available on-line. The chapter always concludes with an epilogue (or post-mortem) that re-considers (very briefly) what had been done and what could have been done and whether or not the Bayesian perspective was useful for the problem (unsurprisingly so for the majority of the chapters!). There are also reading suggestions in the other P of U and a few exercises.

“The purpose of the book is philosophical, to address, with specific examples, the question of whether Bayesian statistics is ready for prime time. Can it be used in a variety of applied settings to address real applied problems?”

The book thus comes as a logical complement to the Principles, to demonstrate how Jay himself did apply his Bayesian principles to specific cases and how one can set the construction of a prior, of a loss function or of a statistical model in identifiable parts that can then be criticised or reanalysed. I find browsing through this series of fourteen different problems fascinating and exhilarating, while I admire the dedication of Jay to every case he presents in the book. I also feel that this comes as a perfect complement to the earlier P of U, in that it makes referring to a complete application of a given principle most straightforward, the problem being entirely described, analysed, and in most cases solved within a given chapter. A few chapters have discussions, being published in the Valencia meeting proceedings or another journal with discussions.

While all papers have been reset in the book style, I wish the graphs had been edited as well as they do not always look pretty. Although this would have implied a massive effort, it would have also been great had each chapter and problem been re-analysed or at least discussed by another fellow (?!) Bayesian in order to illustrate the impact of individual modelling sensibilities. This may however be a future project for a graduate class. Assuming all datasets are available, which is unclear from the text.

“We think however that Bayes factors are overemphasized. In the very special case in which there are only two possible “states of the world”, Bayes factors are sufficient. However in the typical case in which there are many possible states of the world, Bayes factors are sufficient only when the decision-maker’s loss has only two values.” (p. 278)

The above is in Jay’s reply to a comment from John Skilling regretting the absence of marginal likelihoods in the chapter. Reply to which I completely subscribe.
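The two-state case in the quote can be written out explicitly. With states $\theta_0$ and $\theta_1$, prior weights $\pi(\theta_0)$ and $\pi(\theta_1)$, and sampling densities $f(x \mid \theta_i)$ (notation introduced here for illustration, not taken from the book), Bayes’ theorem gives:

```latex
\frac{\pi(\theta_1 \mid x)}{\pi(\theta_0 \mid x)}
  = \underbrace{\frac{f(x \mid \theta_1)}{f(x \mid \theta_0)}}_{B_{10}(x)}
    \times \frac{\pi(\theta_1)}{\pi(\theta_0)}
```

so the data enter the posterior, and hence any posterior expected loss, only through the Bayes factor $B_{10}(x)$. With more than two states, a single such ratio no longer determines the whole posterior.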

[Usual warning: this review should find its way into CHANCE book reviews at some point, with a fairly similar content.]

Bayesian program synthesis

Posted in Books, pictures, Statistics, University life on April 7, 2017 by xi'an

Last week, I, along with Jean-Michel Marin, got an email from a journalist working for Science & Vie, a French science magazine that published a few years ago a special issue on Bayes’ theorem. (With the insane title of “the formula that deciphers the World!”) The reason for this call was the preparation of a paper on Gamalon, a new AI company that relies on (Bayesian) probabilistic programming to devise predictive tools. We spent an hour skyping with him about Bayesian inference, probabilistic programming and machine-learning, at the general level since we had not heard previously of this company or of its central tool.

“the Gamalon BPS system learns from only a few examples, not millions. It can learn using a tablet processor, not hundreds of servers. It learns right away while we play with it, not over weeks or months. And it learns from just one person, not from thousands.”

Gamalon claims to do much better than deep learning at those tasks. Not that I have reasons to doubt that claim, quite the opposite, an obvious reason being that incorporating rules and probabilistic models in the predictor is going to help if these rules and models are even moderately realistic, another major one being that handling uncertainty and learning by Bayesian tools is usually a good idea (!), and yet another significant one being that David Blei is a member of their advisory committee. But it is hard to get a feeling for such claims when the only element in the open is the use of probabilistic programming, which is an advanced and efficient manner of conducting model building and updating and handling (posterior) distributions as objects, but which does not enjoy higher predictive abilities by default. Unless I live with a restricted definition of what probabilistic programming stands for! In any case, the video provided by Gamalon and the presentation given by its CEO do not help in my understanding of the principles behind this massive gain in efficiency. Which makes sense given that the company would not want to give up their edge on the competition.

Incidentally, the video in this presentation comparing the predictive abilities of the four major astronomical explanations of the solar system is great. If not particularly connected with the difference between deep learning and Bayesian probabilistic programming.

Amazon associates links

Posted in Books, pictures on December 3, 2011 by xi'an

Following a now established tradition, I give here my yearly warning that the links to Amazon.com and Amazon.fr on this blog are actually liable to earn me a monetary gain [of 4% to 7%] if a purchase is made in the 24 hours following the entry on Amazon through this link, thanks to the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to amazon.com/fr. As with last year, some of the items purchased through the links and contributing to my bookoholic addiction (and indirectly to the above picture) are rather unrelated to the purpose of the ‘Og, but then, anything can happen within 24 hours! Apart from a purchase I cannot decently mention here (!), here are the weirdest ones:

plus of course many more purchases of books I actually reviewed over the past months… Like six copies of Principles of Uncertainty. And a dozen of The Theory That Would Not Die.