## the maths of Jeffreys-Lindley paradox

**C**ristiano Villa and Stephen Walker arXived last Friday a paper entitled On the mathematics of the Jeffreys-Lindley paradox. Following the philosophical papers of last year, by Aris Spanos, Jan Sprenger, Guillaume Rochefort-Maranda, and myself, this provides a more statistical view on the paradox. Or “paradox”… Even though I strongly disagree with the conclusion, namely that a finite (prior) variance σ² should be used in the Gaussian prior, and that one should fall back on classical Type I and Type II errors. So, in that sense, the authors avoid the Jeffreys-Lindley paradox altogether!

The argument against considering a limiting value for the posterior probability is that it converges to 0, 1, or an intermediate value. In the first two cases it is useless. The intermediate case is achieved when the prior probabilities of the null and alternative hypotheses depend on the variance σ². While I do not want to argue in favour of my 1993 solution, since it is ill-defined in measure theoretic terms, I do not buy the coherence argument that, since this prior probability converges to zero when σ² goes to infinity, the posterior probability should also go to zero. In the limit, probabilistic reasoning fails since the prior under the alternative is a measure, not a probability distribution… We should thus abstain from over-interpreting improper priors. (A sin sometimes committed by Jeffreys himself in his book!)
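For readers who want to see the limiting behaviour numerically, here is a minimal sketch of the textbook version of the paradox, not of the Villa and Walker construction. It assumes a single observation x ~ N(θ, 1), a point null θ = 0, a N(0, σ²) prior under the alternative, and a prior weight `rho` on the null that is held fixed (it does not depend on σ²). The function name and the value x = 1.96 are illustrative choices only.

```python
import math


def posterior_prob_null(x, sigma, rho=0.5):
    """P(H0 | x) for x ~ N(theta, 1), H0: theta = 0, H1: theta ~ N(0, sigma^2).

    rho is the prior probability of H0, assumed fixed (independent of sigma).
    """
    s2 = sigma ** 2
    # Bayes factor of H0 against H1 is N(x; 0, 1) / N(x; 0, 1 + s2),
    # computed on the log scale: 0.5*log(1 + s2) - x^2 * s2 / (2 * (1 + s2)).
    log_b01 = 0.5 * math.log1p(s2) - 0.5 * x * x * s2 / (1.0 + s2)
    b01 = math.exp(log_b01)
    return 1.0 / (1.0 + (1.0 - rho) / (rho * b01))


if __name__ == "__main__":
    x = 1.96  # an observation "significant at the 5% level"
    for sigma in (1.0, 10.0, 100.0, 1e4, 1e6):
        p = posterior_prob_null(x, sigma)
        print(f"sigma = {sigma:>9g}   P(H0 | x) = {p:.4f}")
```

With the prior weight fixed, the posterior probability of the null increases towards 1 as σ grows, whatever the value of x, which is the first of the two "useless" limits mentioned above.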

March 26, 2015 at 4:17 pm

I just scanned up to example 1, but isn’t this just an ordinary Bayesian “paradox” (i.e. a thing that’s obviously user error rather than something surprising)?

If it’s what I think it is, then that example is overly contrived: all you need is a likelihood f(theta) = O(exp(theta^{2+epsilon})) and a normal prior on theta. It has nothing to do with continuous models, repeated sampling or measure zero events. It’s just checking your tails. (The Bayesian equivalent of checking that the “maximum” of your likelihood isn’t a minimum.)

Or am I missing something?
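To spell out the tail check in symbols, here is a minimal sketch. It assumes the likelihood really does grow at least like exp(|θ|^{2+ε}) for large |θ| (a lower bound on growth, which is stronger than the O(·) statement above) and a N(0, τ²) prior; the constants c > 0 and τ are introduced here for illustration only.

```latex
L(\theta) \ge c\,\exp\!\left(|\theta|^{2+\epsilon}\right) \text{ for large } |\theta|,
\qquad
\pi(\theta) \propto \exp\!\left(-\frac{\theta^{2}}{2\tau^{2}}\right)
\;\Longrightarrow\;
L(\theta)\,\pi(\theta) \gtrsim
\exp\!\left(|\theta|^{2+\epsilon} - \frac{\theta^{2}}{2\tau^{2}}\right)
\xrightarrow[\;|\theta|\to\infty\;]{} \infty ,
\qquad\text{hence}\qquad
\int L(\theta)\,\pi(\theta)\,\mathrm{d}\theta = \infty .
```

The Gaussian tail decays like exp(−θ²/2τ²), which cannot compensate for any growth rate with exponent above 2, so the posterior cannot be normalised.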

March 26, 2015 at 12:34 am

Proper priors prevent p____-poor performance.

March 26, 2015 at 8:32 am

There are also examples where proper priors produce “poor” performance, namely, improper posteriors (e.g. for point observations).

March 26, 2015 at 11:28 am

Proper priors cannot produce improper posteriors!

March 26, 2015 at 12:26 pm

They actually can produce improper posteriors if you use continuous models with samples that contain repeated observations (I know the argument “that’s a zero-probability event”, but any sample has probability zero under a continuous model and we still use them). If I recall correctly, you wrote a comment on a paper that presents such examples:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.377.5786&rep=rep1&type=pdf

The idea behind this issue is that continuous models are approximations, but, as with any approximations, there are conditions behind them.

March 26, 2015 at 12:44 pm

Ah! This is an old debate I have had with Mark Steel about one of his Valencia meeting papers. The measure zero argument does not apply in this case: if you observe x=3.141396… say, we do have P(X=3.1413962…)=0 but a priori you cannot exclude this specific value as a realisation of X. While a priori I can exclude that X will take the value x=0 or the value x=π… Hence, you can state from the beginning/a priori, i.e., before looking at the data, that X_{1}=X_{2} is impossible. If repeated values occur in your sample, your probability model should account for this possibility by having point masses along the diagonal. Otherwise, your model is inadequate.

March 26, 2015 at 12:53 pm

I agree with your points but, as discussed in the paper, the reason for the impropriety of the posterior is that the likelihood function is unbounded (even with proper priors). There are many examples, without involving the presence of repeated observations, where the likelihood is unbounded and one has to be careful in these cases as well, I think. William Meeker recently published an interesting paper in The American Statistician with examples of continuous models with unbounded likelihoods.

March 26, 2015 at 1:21 pm

ThanX, Javier, I will take a look at this paper. However, unbounded likelihoods cannot cause impropriety of the posterior if the prior is proper. To be continued…

March 26, 2015 at 1:24 pm

Not always, I agree, only sometimes :). Thanks for the discussion.

March 26, 2015 at 4:53 pm

There is more to it than just saying that the likelihood is unbounded. It is true that if the likelihood is in L^\infty, then any absolutely continuous proper prior will lead to a proper posterior. Then things get a bit more subtle.

If the likelihood is in L^1 (i.e. not essentially bounded but integrable), then any bounded absolutely continuous prior is ok.

In between these, you can have unbounded priors and unbounded likelihoods as long as their unboundedness is complementary (i.e. L^p likelihood and L^q prior, where 1/p+1/q=1).

If you want to have atomic priors, then you need your likelihood to be very very nice (i.e. have, for a d-dimensional state, more than d L^1 derivatives).

The tl;dr version: the more weird your likelihood, the nicer the prior needs to be. (This only relates to boundedness in very limited ways)
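The L^p/L^q pairing above is Hölder's inequality applied to the normalising constant. As a sketch, writing L for the (non-negative) likelihood and π for the prior density, both notations being introduced here:

```latex
\int L(\theta)\,\pi(\theta)\,\mathrm{d}\theta
  \;\le\; \|L\|_{p}\,\|\pi\|_{q} \;<\; \infty ,
\qquad \frac{1}{p} + \frac{1}{q} = 1, \quad 1 \le p,\, q \le \infty .
```

Taking p = ∞, q = 1 recovers the bounded-likelihood case with any proper prior density, and p = 1, q = ∞ recovers the integrable-likelihood case with a bounded prior density. The inequality only guarantees that the normalising constant is finite; it must also be positive for the posterior to be well defined.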