## look, look, confidence! [book review]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , , , , on April 23, 2018 by xi'an

As it happens, I recently bought [with Amazon Associate earnings] a (used) copy of Confidence, Likelihood, Probability (Statistical Inference with Confidence Distributions), by Tore Schweder and Nils Hjort, to try to understand this confusing notion of confidence distributions. (And hence did not get the book from CUP or anyone else towards purposely writing a review. Or a ½-review like the one below.)

“Fisher squared the circle and obtained a posterior without a prior.” (p.419)

Now that I have gone through a few chapters, I am no less confused about the point of this notion. Which seems to rely on the availability of confidence intervals. Exact or asymptotic ones. The authors plainly recognise (p.61) that a confidence distribution is neither a posterior distribution nor a fiducial distribution, hence cutting off any possible Bayesian usage of the approach. Which seems right in that there is no coherence behind the construct, meaning for instance there is no joint distribution corresponding to the resulting marginals. Or even a specific dominating measure in the parameter space. (Always go looking for the dominating measure!) As usual with frequentist procedures, there is always a feeling of arbitrariness in the resolution, as for instance in the Neyman-Scott problem (p.112) where the profile likelihood and the deviance do not work, but considering directly the distribution of the (inconsistent) MLE of the variance “saves the day”, which sounds a bit like starting from the solution. Another statistical freak, the Fieller-Creasy problem (p.116) remains a freak in this context as it does not seem to allow for a confidence distribution. I also notice an ambivalence in the discourse of the authors of this book, namely that while they claim confidence distributions are both outside a probabilisation of the parameter and inside, “producing distributions for parameters of interest given the data (…) with fewer philosophical and interpretational obstacles” (p.428).

“Bias is particularly difficult to discuss for Bayesian methods, and seems not to be a worry for most Bayesian statisticians.” (p.10)

The discussions as to whether or not confidence distributions form a synthesis of Bayesianism and frequentism always fall short from being convincing, the choice of (or the dependence on) a prior distribution appearing to the authors as a failure of the former approach. Or unnecessarily complicated when there are nuisance parameters. Apparently missing on the (high) degree of subjectivity involved in creating the confidence procedures. Chapter 1 contains a section on “Why not go Bayesian?” that starts from Chris Sims‘ Nobel Lecture on the appeal of Bayesian methods and goes [softly] rampaging through each item. One point (3) is recurrent in many criticisms of B and I always wonder whether or not it is tongue-in-cheek-y… Namely the fact that parameters of a model are rarely if ever stochastic. This is a misrepresentation of the use of prior and posterior distributions [which are in fact] as summaries of information cum uncertainty. About a true fixed parameter. Refusing as does the book to endow posteriors with an epistemic meaning (except for “Bayesian of the Lindley breed” (p.419) is thus most curious. (The debate is repeating in the final(e) chapter as “why the world need not be Bayesian after all”.)

“To obtain frequentist unbiasedness, the Bayesian will have to choose her prior with unbiasedness in mind. Is she then a Bayesian?” (p.430)

A general puzzling feature of the book is that notions are not always immediately defined, but rather discussed and illustrated first. As for instance for the central notion of fiducial probability (Section 1.7, then Chapter 6), maybe because Fisher himself did not have a general principle to advance. The construction of a confidence distribution most often keeps a measure of mystery (and arbitrariness), outside the rather stylised setting of exponential families and sufficient (conditionally so) statistics. (Incidentally, our 2012 ABC survey is [kindly] quoted in relation with approximate sufficiency (p.180), while it does not sound particularly related to this part of the book. Now, is there an ABC version of confidence distributions? Or an ABC derivation?) This is not to imply that the book is uninteresting!, as I found reading it quite entertaining, with many humorous and tongue-in-cheek remarks, like “From Fraser (1961a) and until Fraser (2011), and hopefully even further” (p.92), and great datasets. (Including one entitled Pornoscope, which is about drosophilia mating.) And also datasets with lesser greatness, like the 3000 mink whales that were killed for Example 8.5, where the authors if not the whales “are saved by a large and informative dataset”… (Whaling is a recurrent [national?] theme throughout the book, along with sport statistics usually involving Norway!)

Miscellanea: The interest of the authors in the topic is credited to bowhead whales, more precisely to Adrian Raftery’s geometric merging (or melding) of two priors and to the resulting Borel paradox (xiii). Proposal that I remember Adrian presenting in Luminy, presumably in 1994. Or maybe in Aussois the year after. The book also repeats Don Fraser’s notion that the likelihood is a sufficient statistic, a point that still bothers me. (On the side, I realised while reading Confidence, &tc., that ABC cannot comply with the likelihood principle.) To end up on a French nitpicking note (!), Quenouille is typ(o)ed Quenoille in the main text, the references and the index. (Blame the .bib file!)

## covariant priors, Jeffreys and paradoxes

Posted in Books, Statistics, University life with tags , , , , , , , , , , , on February 9, 2016 by xi'an

“If no information is available, π(α|M) must not deliver information about α.”

In a recent arXival apparently submitted to Bayesian Analysis, Giovanni Mana and Carlo Palmisano discuss of the choice of priors in metrology. Which reminded me of this meeting I attended at the Bureau des Poids et Mesures in Sèvres where similar debates took place, albeit being led by ferocious anti-Bayesians! Their reference prior appears to be the Jeffreys prior, because of its reparameterisation invariance.

“The relevance of the Jeffreys rule in metrology and in expressing uncertainties in measurements resides in the metric invariance.”

This, along with a second order approximation to the Kullback-Leibler divergence, is indeed one reason for advocating the use of a Jeffreys prior. I at first found it surprising that the (usually improper) prior is used in a marginal likelihood, as it cannot be normalised. A source of much debate [and of our alternative proposal].

“To make a meaningful posterior distribution and uncertainty assessment, the prior density must be covariant; that is, the prior distributions of different parameterizations must be obtained by transformations of variables. Furthermore, it is necessary that the prior densities are proper.”

The above quote is quite interesting both in that the notion of covariant is used rather than invariant or equivariant. And in that properness is indicated as a requirement. (Even more surprising is the noun associated with covariant, since it clashes with the usual notion of covariance!) They conclude that the marginal associated with an improper prior is null because the normalising constant of the prior is infinite.

“…the posterior probability of a selected model must not be null; therefore, improper priors are not allowed.”