## approximation of improper by vague priors

Posted in Statistics, University life on November 18, 2013 by xi'an

“…many authors prefer to replace these improper priors by vague priors, i.e. probability measures that aim to represent very few knowledge on the parameter.”

Christèle Bioche and Pierre Druilhet arXived a few days ago a paper with this title. They aim at shedding new light on the convergence of vague priors to their limit. Their notion of convergence is a pointwise convergence in the quotient space of Radon measures, the quotient being defined by the removal of the “normalising” constant. The first results contained in the paper do not show particularly enticing properties of the improper limit of proper measures, as the limit cannot be given any (useful) probabilistic interpretation. (A feature already noticeable when reading Jeffreys.) The first result that truly caught my interest in connection with my current research is the fact that the Haar measures appear as a (weak) limit of conjugate priors (Section 2.5). And that the Jeffreys prior is the limit of the parametrisation-free conjugate priors of Druilhet and Pommeret (2012, Bayesian Analysis, a paper I will discuss soon!). The result about the convergence of posterior means is rather anticlimactic as the basic assumption is the uniform integrability of the sequence of the prior densities. An interesting counterexample (somewhat familiar to invariance fans): the sequence of Poisson distributions with mean n has no weak limit. And the Haldane prior does appear as a limit of Beta distributions (less surprising). On (0,1) if not on [0,1].
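As a small numerical illustration of the Haldane case, here is a minimal sketch assuming a binomial observation and the symmetric Beta(ε, ε) family as the sequence of vague priors. The function name and the data (3 successes out of 10) are made up for the illustration and do not come from the paper. By conjugacy, the posterior mean is (ε + x)/(2ε + n), which drifts towards the formal Haldane answer x/n as ε goes to zero.

```python
def beta_posterior_mean(eps, successes, trials):
    """Posterior mean of a binomial proportion under a Beta(eps, eps) prior."""
    # Conjugacy: a Beta(eps, eps) prior with x successes in n trials
    # gives a Beta(eps + x, eps + n - x) posterior.
    return (eps + successes) / (2.0 * eps + trials)


successes, trials = 3, 10  # hypothetical data, for illustration only
for eps in (1.0, 0.1, 0.01, 0.001):
    mean = beta_posterior_mean(eps, successes, trials)
    print(f"eps={eps:<6} posterior mean={mean:.5f}")

# Formal answer under the improper Haldane prior 1/(p(1-p)):
# the posterior is Beta(x, n - x), with mean x / n (this requires 0 < x < n).
haldane_mean = successes / trials
```

The restriction 0 < x < n in the last comment is where the improper limit shows its teeth: with all successes or all failures the formal Haldane posterior is itself improper.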

The paper contains a section on the Jeffreys-Lindley paradox, which is only considered from the second perspective, the one I favour. There is however a mention made of the noninformative answer, which is the (meaningless) one associated with the Lebesgue measure of normalising constant one. This Lebesgue measure also appears as a weak limit in the paper, even though the limit of the posterior probabilities is 1. Except when the likelihood has bounded variations outside compacts. Then the limit of the probabilities is the prior probability of the null… Interesting, truly, but not compelling enough to change my perspective on the topic. (And thanks to the authors for their thanks!)
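The drift of the posterior probability towards 1 can be seen in the textbook normal-mean version of the paradox, sketched here under assumptions that are not taken from the paper: a single observation x ~ N(θ, 1), the point null θ = 0, a N(0, τ²) prior under the alternative, and equal prior weights. The Bayes factor in favour of the null is then the ratio of a N(0, 1) density to a N(0, 1 + τ²) density at x.

```python
import math


def posterior_prob_null(x, tau, prior_null=0.5):
    """P(H0 | x) for H0: theta = 0 against H1: theta ~ N(0, tau^2), with x ~ N(theta, 1)."""
    var_alt = 1.0 + tau ** 2  # marginal variance of x under H1
    # log Bayes factor: log N(x; 0, 1) - log N(x; 0, 1 + tau^2)
    log_b01 = 0.5 * math.log(var_alt) - 0.5 * x ** 2 * (1.0 - 1.0 / var_alt)
    odds = prior_null / (1.0 - prior_null) * math.exp(log_b01)
    return odds / (1.0 + odds)


x_obs = 1.96  # borderline "significant" at the 5% level
for tau in (1.0, 10.0, 1e3, 1e6):
    print(f"tau={tau:<10g} P(H0 | x)={posterior_prob_null(x_obs, tau):.4f}")
```

Whatever the fixed value of x, the factor (1 + τ²)^{1/2} eventually dominates, so the vaguer the prior under the alternative, the stronger the support for the null.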

## optimal direction Gibbs

Posted in Statistics, University life on May 29, 2012 by xi'an

An interesting paper appeared on arXiv today. Entitled On optimal direction Gibbs sampling, by Andrés Christen, Colin Fox, Diego Andrés Pérez-Ruiz and Mario Santana-Cibrian, it defines optimality as picking the direction that brings the maximum independence between two successive realisations in the Gibbs sampler. More precisely, it aims at choosing the direction e that minimises the mutual information criterion between two successive realisations X and Y of the chain,

$\int\int f_{Y,X}(y,x)\log\dfrac{f_{Y,X}(y,x)}{f_Y(y)f_X(x)}\,\text{d}x\,\text{d}y$
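To get a feel for the criterion in a case where it is unambiguous, here is a rough sketch for a pair (X, Y) of standard normal variables with correlation ρ, both densities being taken with respect to Lebesgue measure on the plane. This toy setting and the function name are assumptions made for the illustration, not the paper's Gibbs setting. The double integral is approximated by a midpoint rule and compared with the known closed form −½ log(1 − ρ²).

```python
import math


def mutual_information_grid(rho, half_width=8.0, n=400):
    """Midpoint-rule approximation of the mutual information of a
    standard bivariate normal pair (X, Y) with correlation rho."""
    h = 2.0 * half_width / n
    one_m_r2 = 1.0 - rho ** 2
    log_norm = -math.log(2.0 * math.pi) - 0.5 * math.log(one_m_r2)
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            # log of the joint density f_{Y,X}(y, x)
            log_joint = log_norm - (x * x - 2.0 * rho * x * y + y * y) / (2.0 * one_m_r2)
            # log of the product of the N(0, 1) marginals f_Y(y) f_X(x)
            log_prod = -math.log(2.0 * math.pi) - 0.5 * (x * x + y * y)
            total += math.exp(log_joint) * (log_joint - log_prod)
    return total * h * h


rho = 0.6
numeric = mutual_information_grid(rho)
closed_form = -0.5 * math.log(1.0 - rho ** 2)
print(numeric, closed_form)
```

The criterion vanishes at ρ = 0 and blows up as |ρ| approaches 1, which is the sense in which minimising it seeks "maximum independence".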

I have a bit of an issue with this choice of criterion because it clashes with measure theory. Indeed, in one Gibbs step associated with e the transition kernel is defined in terms of the Lebesgue measure over the line induced by e. Hence the joint density of the pair of successive realisations is defined in terms of the product of the Lebesgue measure on the overall space and of the Lebesgue measure over the line induced by e… While the product in the denominator is defined against the product of the Lebesgue measure on the overall space and itself. The two densities are therefore not comparable since they are not defined against equivalent measures… The difference between numerator and denominator is actually clearly expressed in the normal example (page 3) when the chain operates over an n-dimensional space, but where the conditional distribution of the next realisation is one-dimensional, and thus does not relate to the multivariate normal target in the denominator. I therefore do not agree with the derivation of the mutual information henceforth produced as (3).

The above difficulty is indirectly perceived by the authors, who note “we cannot simply choose the best direction: the resulting Gibbs sampler would not be irreducible” (page 5), an objection I had from an earlier page… They instead pick directions at random over the unit sphere and (for the normal case) suggest using a density over those directions such that

$h^*(\mathbf{e})\propto(\mathbf{e}^\prime A\mathbf{e})^{1/2}$
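For concreteness, directions with such a density can be simulated by rejection from the uniform distribution on the sphere. The sketch below is an illustration rather than the authors' scheme: it assumes A is symmetric positive definite (so that e′Ae ≤ trace(A) on the unit sphere, which gives a valid if crude envelope), and the matrix used is made up.

```python
import math
import random


def quad_form(a, e):
    """Return e' A e for a square matrix a (list of rows) and a vector e."""
    return sum(e[i] * a[i][j] * e[j] for i in range(len(e)) for j in range(len(e)))


def sample_direction(a, rng):
    """Draw e on the unit sphere with density proportional to sqrt(e' A e).

    Assumes a is symmetric positive definite, so e' A e <= trace(a) on the sphere.
    """
    d = len(a)
    bound = sum(a[i][i] for i in range(d))  # trace(a) >= largest eigenvalue
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in z))
        e = [v / norm for v in z]  # normalised Gaussian vector: uniform on the sphere
        # accept with probability sqrt(e' A e / bound), which lies in (0, 1]
        if rng.random() < math.sqrt(quad_form(a, e) / bound):
            return e


rng = random.Random(0)
A = [[100.0, 0.0], [0.0, 1.0]]  # hypothetical matrix, for illustration only
draws = [sample_direction(A, rng) for _ in range(2000)]
mean_sq = [sum(e[k] ** 2 for e in draws) / len(draws) for k in range(2)]
print(mean_sq)
```

With this A the accepted directions lean towards the first axis, yet every direction keeps a positive density, which is what preserves irreducibility.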

which cannot truly be called “optimal”.

More globally, searching for “optimal” directions (or more generally transforms) is quite a worthwhile idea, esp. when linked with adaptive strategies…

## principles of uncertainty

Posted in Books, R, Statistics, University life on October 14, 2011 by xi'an

“Bayes Theorem is a simple consequence of the axioms of probability, and is therefore accepted by all as valid. However, some who challenge the use of personal probability reject certain applications of Bayes Theorem.” J. Kadane, p.44

Principles of uncertainty by Joseph (“Jay”) Kadane (Carnegie Mellon University, Pittsburgh) is a profound and mesmerising book on the foundations and principles of subjectivist or behaviouristic Bayesian analysis. Jay Kadane wrote Principles of uncertainty over a period of several years and, more or less in his own words, it represents the legacy he wants to leave for the future. The book starts with a large section on Jay’s definition of a probability model, with rigorous mathematical derivations all the way to Lebesgue measure (or more exactly the McShane-Stieltjes measure). This section contains many side derivations that pertain to mathematical analysis, in order to explain the subtleties of countably infinite and uncountable sets, and the distinction between finitely additive and countably additive (probability) measures. Unsurprisingly, the role of utility is emphasized in this book, which keeps stressing the personalistic entry to Bayesian statistics. Principles of uncertainty also contains a formal development on the validity of Markov chain Monte Carlo methods that is superb and missing in most equivalent textbooks. Overall, the book is a pleasure to read. And highly recommended for teaching as it can be used at many different levels.