Archive for Jay Kadane

principles of uncertainty (second edition)

Posted in Books, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , , , on July 21, 2020 by xi'an

A new edition of Principles of Uncertainty is about to appear. I was asked by CRC Press to review the new book and here are some (raw) extracts from my review. (Some comments may not apply to the final and published version, mind.)

In Chapter 6, the proof of the Central Limit Theorem utilises the “smudge” technique, which is to add an independent noise to both the sequence of rvs and its limit. This is most effective and reminds me of quite a similar proof Jacques Neveu used in its probability notes in Polytechnique. Which went under the more formal denomination of convolution, with the same (commendable) purpose of avoiding Fourier transforms. If anything, I would have favoured a slightly more condensed presentation in less than 8 pages. Is Corollary 6.5.8 useful or even correct??? I do not think so because the non-centred average rescaled by √n diverges almost surely. For the same reason, I object to the very first sentence of Section 6.5 (p.246)

In Chapter 7, I found a nice mention of (Hermann) Rubin’s insistence on not separating probability and utility as only the product matters. And another fascinating quote from Keynes, not from his early statistician’s years, but in 1937 as an established economist

“The sense in which I am using the term uncertain is that in which the prospect of a European war is uncertain, or the price of copper and the rate of interest twenty years hence, or the obsolescence of a new invention, or the position of private wealth-owners in the social system in 1970. About these matters there is no scientific basis on which to form any calculable probability whatever. We simply do not know. Nevertheless, the necessity for action and for decision compels us as practical men to do our best to overlook this awkward fact and to behave exactly as we should if we had behind us a good Benthamite calculation of a series of prospective advantages and disadvantages, each multiplied by its appropriate probability, waiting to the summed.”

(is the last sentence correct? I would have expected, pardon my French!, “to be summed”). Further interesting trivia on the criticisms of utility theory, including de Finetti’s role and his own lack of connection with subjective probability principles.

In Chapter 8, a major remark (iii) is found p.293 about the fact that a conjugate family requires a dominating measure (although this is expressed differently since the book shies away from introducing measure theory, ) reminds me of a conversation I had with Jay when I visited Carnegie Mellon in 2013 (?). Which exposes the futility of seeing conjugate priors as default priors. It is somewhat surprising that a notion like admissibility appears as a side quantity when discussing Stein’s paradox in 8.2.1 [and then later in Section 9.1.3] while it seems to me to be central to Bayesian decision theory, much more than the epiphenomenon that Stein’s paradox represents in the big picture. But the book dismisses minimaxity even faster in Section 9.1.4:

As many who suffer from paranoia have discovered, one can always dream-up an even worse possibility to guard against. Thus, the minimax framework is unstable. (p.336)

Interesting introduction of the Wishart distribution to kindly handle random matrices and matrix Jacobians, with the original space being the p(p+1)/2 real space (implicitly endowed with the Lebesgue measure). Rather than a more structured matricial space. A font error makes Corollary 8.7.2 abort abruptly. The space of positive definite matrices is mentioned in Section8.7.5 but still (implicitly) corresponds to the common p(p+1)/2 real Euclidean space. Another typo in Theorem 8.9.2 with a Frenchised version of Dirichlet, Dirichelet. Followed by a Dirchlet at the end of the proof (p.322). Again and again on p.324 and on following pages. I would object to the singular in the title of Section 8.10 as there are exponential families rather than a single one. With no mention made of Pitman-Koopman lemma and its consequences, namely that the existence of conjugacy remains an epiphenomenon. Hence making the amount of pages dedicated to gamma, Dirichlet and Wishart distributions somewhat excessive.

In Chapter 9, I noticed (p.334) a Scheffe that should be Scheffé (and again at least on p.444). (I love it that Jay also uses my favorite admissible (non-)estimator, namely the constant value estimator with value 3.) I wonder at the worth of a ten line section like 9.3, when there are delicate issues in handling meta-analysis, even in a Bayesian mood (or mode). In the model uncertainty section, Jay discuss the (im)pertinence of both selecting one of the models and setting independent priors on their respective parameters, with which I disagree on both levels. Although this is followed by a more reasonable (!) perspective on utility. Nice to see a section on causation, although I would have welcomed an insert on the recent and somewhat outrageous stand of Pearl (and MacKenzie) on statisticians missing the point on causation and counterfactuals by miles. Nonparametric Bayes is a new section, inspired from Ghahramani (2005). But while it mentions Gaussian and Dirichlet [invariably misspelled!] processes, I fear it comes short from enticing the reader to truly grasp the meaning of a prior on functions. Besides mentioning it exists, I am unsure of the utility of this section. This is one of the rare instances where measure theory is discussed, only to state this is beyond the scope of the book (p.349).

dominating measure

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , on March 21, 2019 by xi'an

Yet another question on X validated reminded me of a discussion I had once  with Jay Kadane when visiting Carnegie Mellon in Pittsburgh. Namely the fundamentally ill-posed nature of conjugate priors. Indeed, when considering the definition of a conjugate family as being a parameterised family Þ of distributions over the parameter space Θ stable under transform to the posterior distribution, this property is completely dependent (if there is such a notion as completely dependent!) on the dominating measure adopted on the parameter space Θ. Adopted is the word as there is no default, reference, natural, &tc. measure that promotes one specific measure on Θ as being the dominating measure. This is a well-known difficulty that also sticks out in most “objective Bayes” problems, as well as with maximum entropy priors. This means for instance that, while the Gamma distributions constitute a conjugate family for a Poisson likelihood, so do the truncated Gamma distributions. And so do the distributions which density (against a Lebesgue measure over an arbitrary subset of (0,∞)) is the product of a Gamma density by an arbitrary function of θ. I readily acknowledge that the standard conjugate priors as introduced in every Bayesian textbook are standard because they facilitate (to a certain extent) posterior computations. But, just like there exist an infinity of MaxEnt priors associated with an infinity of dominating measures, there exist an infinity of conjugate families, once more associated with an infinity of dominating measures. And the fundamental reason is that the sampling model (which induces the shape of the conjugate family) does not provide a measure on the parameter space Θ.

Practicals of Uncertainty [book review]

Posted in Books, Statistics, University life with tags , , , , , , , on December 22, 2017 by xi'an

On my way to the O’Bayes 2017 conference in Austin, I [paradoxically!] went through Jay Kadane’s Pragmatics of Uncertainty, which had been published earlier this year by CRC Press. The book is to be seen as a practical illustration of the Principles of Uncertainty Jay wrote in 2011 (and I reviewed for CHANCE). The avowed purpose is to allow the reader to check through Jay’s applied work whether or not he had “made good” on setting out clearly the motivations for his subjective Bayesian modelling. (While I presume the use of the same P of U in both books is mostly a coincidence, I started wondering how a third P of U volume could be called. Perils of Uncertainty? Peddlers of Uncertainty? The game is afoot!)

The structure of the book is a collection of fifteen case studies undertaken by Jay over the past 30 years, covering paleontology, survey sampling, legal expertises, physics, climate, and even medieval Norwegian history. Each chapter starts with a short introduction that often explains how he came by the problem (most often as an interesting PhD student consulting project at CMU), what were the difficulties in the analysis, and what became of his co-authors. As noted by the author, the main bulk of each chapter is the reprint (in a unified style) of the paper and most of these papers are actually and freely available on-line. The chapter always concludes with an epilogue (or post-mortem) that re-considers (very briefly) what had been done and what could have been done and whether or not the Bayesian perspective was useful for the problem (unsurprisingly so for the majority of the chapters!). There are also reading suggestions in the other P of U and a few exercises.

“The purpose of the book is philosophical, to address, with specific examples, the question of whether Bayesian statistics is ready for prime time. Can it be used in a variety of applied settings to address real applied problems?”

The book thus comes as a logical complement of the Principles, to demonstrate how Jay himself did apply his Bayesian principles to specific cases and how one can set the construction of a prior, of a loss function or of a statistical model in identifiable parts that can then be criticised or reanalysed. I find browsing through this series of fourteen different problems fascinating and exhilarating, while I admire the dedication of Jay to every case he presents in the book. I also feel that this comes as a perfect complement to the earlier P of U, in that it makes refering to a complete application of a given principle most straightforward, the problem being entirely described, analysed, and in most cases solved within a given chapter. A few chapters have discussions, being published in the Valencia meeting proceedings or another journal with discussions.

While all papers have been reset in the book style, I wish the graphs had been edited as well as they do not always look pretty. Although this would have implied a massive effort, it would have also been great had each chapter and problem been re-analysed or at least discussed by another fellow (?!) Bayesian in order to illustrate the impact of individual modelling sensibilities. This may however be a future project for a graduate class. Assuming all datasets are available, which is unclear from the text.

“We think however that Bayes factors are overemphasized. In the very special case in which there are only two possible “states of the world”, Bayes factors are sufficient. However in the typical case in which there are many possible states of the world, Bayes factors are sufficient only when the decision-maker’s loss has only two values.” (p. 278)

The above is in Jay’s reply to a comment from John Skilling regretting the absence of marginal likelihoods in the chapter. Reply to which I completely subscribe.

[Usual warning: this review should find its way into CHANCE book reviews at some point, with a fairly similar content.]