Archive for The Search for Certainty

informative hypotheses (book review)

Posted in Books, R, Statistics on September 19, 2013 by xi'an

The title of this book Informative Hypotheses somehow put me off from the start: the author, Herbert Hoijtink, seems to distinguish between informative and uninformative (deformative? disinformative?) hypotheses. Namely, something like

H0: μ1 = μ2 = μ3 = μ4

is “very informative” and unrealistic, and the alternative Ha is completely uninformative, while the “alternative null”

H1: μ1 < μ2 = μ3 < μ4

is informative. (Hence the < signs on the cover. One of my book review idiosyncrasies is to find hidden meaning behind the cover design…) The idea is thus to have the researcher give some input in the construction of the null hypothesis (as if hypothesis tests usually were not about questions that mattered…).

In fact, this distinction put me off so much that I only ended up reading chapters 1 (an introduction), 3 (an introduction [to the Bayesian processing of such hypotheses]) and 10 (on Bayesian foundations of testing informative hypotheses). Hence a very biased review of Informative Hypotheses that follows….

Given an existing (but out of print?) reference like Robertson, Wright and Dykstra (1988), which I particularly enjoyed when working on isotonic regression in the mid 90's, I do not see much of an added value in the present book. The important references are mostly centred on works by the author and his co-authors or students (often Unpublished or In Press), which gives me the impression the book was hurriedly gathered from those papers.

“The Bayes factor (…) is default, objective, based on an appropriate quantification of complexity.” (p.197)

The first chapter of Informative Hypotheses is a motivation for the study of those informative hypotheses, with a focus on ANOVA models. There is not much in the chapter that explains what is so special about those ordering (null) hypotheses and why a whole book is required to cover their processing. A noteworthy specificity of the approach, nonetheless, is that point null hypotheses seem to be replaced with “about equality constraints” (p.9), |μ2 − μ3| < d, where d is specified by the researcher as significant. This chapter also gives illustrations of ordered (or informative) hypotheses in the settings of analysis of covariance (ANCOVA) and regression models, but does not indicate (yet) how to run the tests. The concluding section is about the epistemological focus of the book, quoting Popper, Sober and Carnap, although I do not see much support in those quotes.

“Objective means that Bayes factors based on this prior distribution are essentially independent of this prior distribution.” (p.53)

Chapter 3 starts the introduction to Bayesian statistics with the strange idea of calling the likelihood the “density of the data”. It is indeed the probability density of the model evaluated at the data but… it conveys a confusing meaning since it is not a density when plotted against the parameters (as in Figure 1, p.44, where, incidentally, the exact probability model is not specified). The prior distribution is defined as a normal × inverse chi-square distribution on the vector of the means (in the ANOVA model) and the common variance. Because the variance is classified as a nuisance parameter, the author can get away with putting an improper prior on it (p.46). The normal prior is chosen to be “neutral”, i.e. to give the same prior weight to the null and the alternative hypotheses. This seems logical at some initial level, but constructing such a prior for convoluted hypotheses may simply be impossible… Because the null hypothesis has a positive mass (maybe .5) under the “unconstrained prior” (p.48), the author can also get away with projecting this prior onto the constrained space of the null hypothesis, even when setting the prior variance to ∞ (p.50). The Bayes factor is then the ratio of the posterior and prior normalising constants over the constrained parameter space. The book still mentions the Lindley-Bartlett paradox (p.60) in the case of the about equality hypotheses. The appendix to this chapter mentions the issue of improper priors and the need to accommodate infinite mass with training samples, providing a minimum training sample solution using mixtures that sounds fairly ad hoc to me.
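To fix ideas about this ratio of normalising constants, here is a quick sketch of mine (not the author's code: the summary statistics, the vague prior scale, and the normal approximation to the unconstrained posterior are all made up for illustration). For an ordered hypothesis, the Bayes factor against the unconstrained alternative reduces to the posterior probability of the constraint divided by its prior probability, both estimable by brute-force simulation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative summaries of a four-group ANOVA (made-up numbers,
# not the book's data): group means and their standard errors.
ybar = np.array([0.0, 0.3, 0.5, 0.9])
se = np.full(4, 0.2)

prior_sd = 10.0  # scale of the vague "unconstrained" normal prior

def ordered(mu):
    """Informative hypothesis H1: mu1 < mu2 < mu3 < mu4."""
    return np.all(np.diff(mu, axis=-1) > 0, axis=-1)

n = 200_000
# c: prior probability of the constraint under the unconstrained prior
# (by exchangeability this is 1/4! = 1/24 here).
c = ordered(rng.normal(0.0, prior_sd, size=(n, 4))).mean()

# f: posterior probability of the constraint under the unconstrained
# posterior, approximated by independent normals for the sketch.
f = ordered(rng.normal(ybar, se, size=(n, 4))).mean()

bf = f / c  # Bayes factor of H1 against the unconstrained alternative
print(round(c, 3), round(f, 3), round(bf, 1))
```

Note that c is known exactly here (1/24), which is the sense in which the ordered hypothesis gets “rewarded” for its lower complexity relative to the unconstrained alternative.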

“Bayes factors for the evaluation of informative hypotheses have a simple form.” (p. 193)

Chapter 10 is the final chapter of Informative Hypotheses, on “Foundations of Bayesian evaluation of informative hypotheses”, and I was expecting a more in-depth analysis of those special hypotheses, but it is mostly a repetition of what is found in Chapter 3, the wider generality never being exploited to a useful depth. There is also this gem quoted above that, because Bayes factors are the ratio of two (normalising) constants, f_m/c_m, they have a “simple form”. The reference to Carlin and Chib (1995) for computing other cases then sounds pretty obscure. (Another tiny gem is that I spotted the R software contingency spelled with three different spellings.) The book mentions the Savage-Dickey representation of the Bayes factor, but I could not spot the connection from the few lines (p.193) dedicated to this ratio. More generally, I do not find the generality of this chapter particularly convincing, most of it replicating the notions found in Chapter 3, like the use of posterior priors. The numerical approximation of Bayes factors is proposed via simulation from the unconstrained prior and posterior (p.207), then via a stepwise decomposition of the Bayes factor (p.208) and a Gibbs sampler that relies on inverse cdf sampling.
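To see what that last Gibbs-with-inverse-cdf machinery amounts to, here is my own toy rendering (the independent-normal “unconstrained posterior” and all numbers are assumptions of mine, not the book's algorithm): each full conditional under an order constraint is a normal truncated to the interval left by the neighbouring means, simulated by inverting the normal cdf.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(2)
nd = NormalDist()  # standard normal cdf and inverse cdf

# Illustrative "unconstrained posterior": independent normals for four
# group means (made-up numbers, not the book's example).
m = np.array([0.0, 0.3, 0.5, 0.9])
s = np.full(4, 0.2)

def trunc_normal(mean, sd, lo, hi):
    """Inverse-cdf draw from N(mean, sd^2) truncated to (lo, hi)."""
    a = nd.cdf((lo - mean) / sd)
    b = nd.cdf((hi - mean) / sd)
    u = rng.uniform(a, b)
    u = min(max(u, 1e-12), 1 - 1e-12)  # guard the inverse cdf
    return mean + sd * nd.inv_cdf(u)

def gibbs_ordered(n_iter=5000):
    """Gibbs sampler for the posterior restricted to mu1<mu2<mu3<mu4."""
    mu = np.sort(rng.normal(m, s))  # start inside the constrained region
    draws = np.empty((n_iter, 4))
    for t in range(n_iter):
        for i in range(4):
            lo = mu[i - 1] if i > 0 else -np.inf
            hi = mu[i + 1] if i < 3 else np.inf
            mu[i] = trunc_normal(m[i], s[i], lo, hi)
        draws[t] = mu
    return draws

draws = gibbs_ordered()
```

Since each update stays within the interval set by its neighbours, every iteration of the chain satisfies the ordering, which is the whole point of sampling the constrained posterior directly.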

Overall, I feel that this book came out too early, without a proper basis and dissemination of the ideas of the author: to wit, a large number of references are connected to the author, some In Press, others Unpublished (which leads to a rather abstract “see Hoijtink (Unpublished) for a related theorem” (p.195)). From my incomplete reading, I did not gather a sense of novel perspective but rather of a topic that seemed too narrow for a whole book.

from Jakob Bernoulli to Hong Kong

Posted in Books, Statistics, Travel, University life on August 24, 2013 by xi'an

Here are my slides (or at least the current version thereof) for my talk in Hong Kong at the 2013 (59th ISI) World Statistics Congress. (I stopped embedding my slideshare links in the posts as they freeze my browser. I wonder if anyone else experiences the same behaviour.)

This talk will feature in the History I: Jacob Bernoulli's “Ars Conjectandi” and the emergence of probability invited paper session organised by Adam Jakubowski. While my own research connection with Bernoulli is at most tenuous, besides using the Law of Large Numbers and Bernoulli rv's…, I [of course!] borrowed from earlier slides on our vanilla Rao-Blackwellisation paper (if only because of the Bernoulli factory connection!) and asked Mark Girolami for his Warwick slides on the Russian roulette (another Bernoulli factory connection!), before recycling my Budapest slides on ABC. The other talks in the session are by Edith Dudley Sylla on Ars Conjectandi and by Krzys Burdzy on his book The Search for Certainty, a book that I critically reviewed in Bayesian Analysis. This will be the first time I meet Krzys in person and I am looking forward to the opportunity!

JSM [4]

Posted in Books, pictures, Running, Statistics, Travel, University life on August 3, 2011 by xi'an

A new day at JSM 2011, admittedly not as tense as Monday, but still full. After a long run in the early hours when I took this picture, I started the day with the Controversies in the philosophy of Bayesian statistics session with Jim Berger and Andrew Gelman, Rob Kass and Cosma Shalizi being unable to make it. From my point of view it was a fun session, even though I wish I had been more incisive! But I agreed with most of what Jim said, so… It is too bad we could not cover his last point about the Bayesian procedures that were not Bayesianly justified (like posterior predictives), as I was quite interested in the potential discussion on this matter (incl. the position of the room on ABC!). Anyway, I am quite thankful to Andrew for setting up this session. As Jim said, we should have those more often, especially when the attendance was large enough to fill a double room at 8:30am.

Incidentally, I managed to have a glaring typo in my slides, pointed out by Susie Bayarri: Bayes theorem was written as

\pi(\theta) \propto \pi(\theta) f(x|\theta)

Aie, aie, aie! Short of better scapegoats, I will blame the AF plane for this… (This was a good way to start a controversy, however no one rose to the bait!) A more serious question reminded me of the debate surrounding The Search for Certainty: it was whether frequentist and subjective Bayes approaches had more justifications than the objective Bayes approach, in the light of von Mises‘ and personalistic (read, de Finetti) interpretations of probability.

While there were many possible alternatives for the next session, I went to attend Sylvia Richardson's Medallion Lecture. This made sense on many levels, the primary one being that Sylvia and I worked and are working on rather close topics, from mixtures of distributions, to variable selection, to ABC. So I was looking forward to the global picture she would provide on those topics. I particularly enjoyed the way she linked mixtures with more general modelling structures, through extensions in the distribution of the latent variables. (This is also why I am attending Chris Holmes' Medallion Lecture tomorrow, with the exciting title of Loss, Actions, Decisions: Bayesian Analysis in High-Throughput Genomics.)

In the afternoon, I only attended one talk, by David Nott, Efficient MCMC Schemes for Computationally Expensive Posterior Distributions, which involved hybrid Monte Carlo on complex likelihoods. This was quite interesting, as hybrid Monte Carlo is indeed a solution to diminish the number of likelihood evaluations, since it moves along iso-density slices… After this, we went to work on ABC model choice with Jean-Michel Marin and Natesh Pillai, before joining the fun at the Section on Bayesian Statistical Science mixer, where the Savage, Mitchell, and student awards were presented. This was the opportunity to see friends, meet new Bayesians, and congratulate the winners, including Julien Cornebise and Robin Ryder of course.

von Mises lecture in Berlin

Posted in Statistics, Travel, University life on June 3, 2011 by xi'an

In about a month I will give a talk in Berlin on ABC. This is actually a special lecture held annually in honour of Richard von Mises, who was professor in Berlin till 1933, when he had to flee Germany. Previous speakers include James A. Sethian, Albert Shiryaev, Uwe Küchler, Enrique Zuazua, and Philip Protter, who gave the first von Mises lecture in 2007. I am thus quite honoured to be invited to deliver this lecture as a statistician, even though I fear my lecture and my research are fairly disjoint from Richard von Mises' contributions to the field… (The closest I came to his work was when reviewing Krzysztof Burdzy's The Search for Certainty and its criticism of von Mises' [and de Finetti's] approaches to the definition of probability, only to discover von Mises had not made a lasting impact on the field of statistics in this very specific respect… However, Professor Shiryaev's talk relates to von Mises's infinite random sequences in connection with both the formalisation of probability and algorithmic theory.)


On randomness

Posted in Books, pictures, Statistics, University life on February 6, 2011 by xi'an

A while ago, I posted on how strangely people seem to be attracted by re- and re-explaining Bayes' theorem, when I see it as a tautological consequence of the definition of conditional probability (and hence of limited interest per se, although with immense consequences for conducting inference). Through the “spam” book mentioned earlier this week, I noticed that the same (or even worse) fatal attraction holds for randomness! (Although I had already posted on the “truly random” generators…) Having access only to one chapter, I read with a sense of growing puzzlement through Tommaso Toffoli's chapter and came up with the following comments, which are nothing but Saturday afternoon idle thoughts!

Measure theory, and much of the axiomatic apparatus that goes into what is often called the “foundations” of probability, is just about developing more refined accounting techniques for when the outcome space becomes so large (viz., uncountably infinite) that simple minded techniques lead to paradoxes: “If a line consists of points, and a point has no length, how come a line has length?”
