Archive for Alfréd Rényi

statistics with improper posteriors [or not]

Posted in Statistics with tags , , , , , , on March 6, 2019 by xi'an

Last December, Gunnar Taraldsen, Jarle Tufto, and Bo H. Lindqvist arXived a paper on using priors that lead to improper posteriors and [trying to] getting away with it! The central concept in their approach is Rényi’s generalisation of Kolmogorov’s version to define conditional probability distributions from infinite mass measures by conditioning on finite mass measurable sets. A position adopted by Dennis Lindley in his 1964 book .And already discussed in a few ‘Og’s posts. While the theory thus developed indeed allows for the manipulation of improper posteriors, I have difficulties with the inferential aspects of the construct, since one cannot condition on an arbitrary finite measurable set without prior information. Things get a wee bit more outwardly when considering “data” with infinite mass, in Section 4.2, since they cannot be properly normalised (although I find the example of the degenerate multivariate Gaussian distribution puzzling as it is not a matter of improperness, since the degenerate Gaussian has a well-defined density against the right dominating measure).  The paper also discusses marginalisation paradoxes, by acknowledging that marginalisation is no longer feasible with improper quantities. And the Jeffreys-Lindley paradox, with a resolution that uses the sum of the Dirac mass at the null, δ⁰, and of the Lebesgue measure on the real line, λ, as the dominating measure. This indeed solves the issue of the arbitrary constant in the Bayes factor, since it is “the same” on the null hypothesis and elsewhere, but I do not buy the argument, as I see no reason to favour δ⁰+λ over 3.141516 δ⁰+λ or δ⁰+1.61718 λ… (This section 4.5 also illustrates that the choice of the sequence of conditioning sets has an impact on the limiting measure, in the Rényi sense.) In conclusion, after reading the paper, I remain uncertain as to how to exploit this generalisation from an inferential (Bayesian?) viewpoint, since improper posteriors do not clearly lead to well-defined inferential procedures…

foundations of probability

Posted in Books, Statistics with tags , , , , on December 1, 2017 by xi'an

Following my reading of a note by Gunnar Taraldsen and co-authors on improper priors, I checked the 1970 book of Rényi from the Library at Warwick. (First time I visited this library, where I got very efficient help in finding and borrowing this book!)

“…estimates of probability of an event made by different persons may be different and each such estimate is to a certain extent subjective.” (p.33)

The main argument from Rényi used by the above mentioned note (and an earlier paper in The American Statistician) is that “every probability is in reality a conditional probability” (p.34). Which may be a pleonasm as everything depends on the settings in which it is applied. And as such not particularly new since conditioning is also present in e.g. Jeffreys’ book. In this approach, the definition of the conditional probability is traditional, if restricted to condition on a subset of elements from the σ algebra. The interesting part in the book is rather that a measure on this subset can be derived from the conditionals. And extended to the whole σ algebra. And is unique up to a multiplicative constant. Interesting because this indeed produces a rigorous way of handling improper priors.

“Let the random point (ξ,η) be uniformly distributed over the whole (x,y) plane.” (p.83)

Rényi also defines random variables ξ on conditional probability spaces, with conditional densities. With constraints on ξ for those to exist. I have more difficulties to ingest this notion as I do not see the meaning of the above quote or of the quantity

P(a<ξ<b|c<ξ<d)

when P(a<ξ<b) is not defined. As for instance I see no way of generating such a ξ in this case. (Of course, it is always possible to bring in a new definition of random variables that only agrees with regular ones for finite measure.)

a new paradigm for improper priors

Posted in Books, pictures, Statistics, Travel with tags , , , , , , , , on November 6, 2017 by xi'an

Gunnar Taraldsen and co-authors have arXived a short note on using improper priors from a new perspective. Generalising an earlier 2016 paper in JSPI on the same topic. Which both relate to a concept introduced by Rényi (who himself attributes the idea to Kolmogorov). Namely that random variables measures are to be associated with arbitrary measures [not necessarily σ-finite measures, the later defining σ-finite random variables], rather than those with total mass one. Which allows for an alternate notion of conditional probability in the case of σ-finite random variables, with the perk that this conditional probability distribution is itself of mass 1 (a.e.).  Which we know happens when moving from prior to proper posterior.

I remain puzzled by the 2016 paper though as I do not follow the meaning of a random variable associated with an infinite mass probability measure. If the point is limited to construct posterior probability distributions associated with improper priors, there is little value in doing so. The argument in the 2016 paper is however that one can then define a conditional distribution in marginalisation paradoxes à la Stone, Dawid and Zidek (1973) where the marginal does not exist. Solving with this formalism the said marginalisation paradoxes as conditional distributions are only defined for σ-finite random variables. Which gives a fairly different conclusion from either Stone, Dawid and Zidek (1973) [with whom I agree, namely that there is no paradox because there is no “joint” distribution] or Jaynes (1973) [with whom I less agree!, in that the use of an invariant measure to make the discrepancy go away is not a particularly strong argument in favour of this measure]. The 2016 paper also draws an interesting connection with the study by Jim Hobert and George Casella (in Jim’s thesis) of [null recurrent or transient] Gibbs samplers with no joint [proper] distribution. Which in some situations can produce proper subchains, a phenomenon later exhibited by Alan Gelfand and Sujit Sahu (and Xiao-Li Meng as well if I correctly remember!). But I see no advantage in following this formalism, as it does not impact whether the chain is transient or null recurrent, or anything connected with its implementation. Plus a link to the approximation of improper priors by sequences of proper ones by Bioche and Druihlet I discussed a while ago.