## another view on Jeffreys-Lindley paradox

**I** found another paper on the Jeffreys-Lindley paradox, entitled “A Misleading Intuition and the Bayesian Blind Spot: Revisiting the Jeffreys-Lindley’s Paradox” and written by Guillaume Rochefort-Maranda, from Université Laval, Québec.

This paper starts by assuming an *unbiased* estimator of the parameter of interest θ, which is under test for the null θ=θ_{0}. (Which makes me wonder at the reason for imposing unbiasedness.) Another highly innovative (or puzzling) aspect is that the Lindley-Jeffreys paradox presented therein is described *without* *any* *Bayesian input*. The paper stands “within a frequentist (classical) framework”: it actually starts with a confidence-interval-on-θ-vs.-test argument to argue that, with a fixed coverage interval that excludes the null value θ_{0}, the estimate of θ may converge to θ_{0} without ever accepting the null θ=θ_{0}. That is, without the confidence interval *ever* containing θ_{0}. (Although this is an event whose probability converges to zero.) Bayesian aspects come later in the paper, even though the application to a point null *versus* a point alternative test is of little interest since a Bayes factor is then a likelihood ratio.
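To make the last remark concrete, here is a minimal sketch of why a Bayes factor between two point hypotheses is nothing but a likelihood ratio, whatever the prior weights. The Normal model with unit variance, the values θ_{0}=0 and θ_{1}=0.5, the sample size and the function names are all my own illustrative assumptions, not taken from the paper.

```python
import math
import random


def normal_loglik(xs, theta, sigma=1.0):
    """Log-likelihood of an i.i.d. N(theta, sigma^2) sample."""
    const = -0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(const - (x - theta) ** 2 / (2 * sigma ** 2) for x in xs)


def likelihood_ratio(xs, theta0, theta1):
    """f(x | theta0) / f(x | theta1)."""
    return math.exp(normal_loglik(xs, theta0) - normal_loglik(xs, theta1))


def bayes_factor(xs, theta0, theta1, prior0):
    """Posterior odds divided by prior odds for H0: theta0 vs H1: theta1."""
    w0 = prior0 * math.exp(normal_loglik(xs, theta0))
    w1 = (1 - prior0) * math.exp(normal_loglik(xs, theta1))
    post0, post1 = w0 / (w0 + w1), w1 / (w0 + w1)
    return (post0 / post1) / (prior0 / (1 - prior0))


rng = random.Random(0)  # fixed seed, purely for reproducibility
sample = [rng.gauss(0.0, 1.0) for _ in range(20)]  # hypothetical data under theta0 = 0

lr = likelihood_ratio(sample, 0.0, 0.5)
for prior0 in (0.1, 0.5, 0.9):
    print(prior0, bayes_factor(sample, 0.0, 0.5, prior0), lr)
```

The prior weight cancels out of the ratio, so the three printed Bayes factors agree with the likelihood ratio up to rounding.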

As I explained several times, including in my *Philosophy of Science* paper, I see the Lindley-Jeffreys paradox as being primarily a Bayesiano-Bayesian issue. So just the opposite of the perspective taken by the paper. That frequentist solutions differ does not strike me as paradoxical. Now, the construction of a sequence of samples such that *all* partial samples in the sequence exclude the null θ=θ_{0} is not a likely event, so I do not see this as a paradox even or especially when putting on my frequentist glasses: if the null θ=θ_{0} is true, this cannot happen in a consistent manner, even though a *single* occurrence of a p-value less than .05 is highly likely within such a sequence.
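A rough simulation can illustrate the contrast between "at least one" and "all" partial samples rejecting the null. This is only a sketch under my own assumptions: i.i.d. N(θ_{0},1) observations with θ_{0}=0, a two-sided z-test at level .05 recomputed after each new observation, 500 observations per sequence and 500 replications. None of these settings comes from the paper.

```python
import math
import random

Z_CRIT = 1.959963984540054  # two-sided 5% critical value of the standard normal


def rejection_path(n_max, rng, theta0=0.0):
    """Simulate data under the null and record, for n = 1..n_max,
    whether the z-test based on the first n observations rejects."""
    total = 0.0
    rejections = []
    for n in range(1, n_max + 1):
        total += rng.gauss(theta0, 1.0)
        z = (total / n - theta0) * math.sqrt(n)
        rejections.append(abs(z) > Z_CRIT)
    return rejections


def simulate(n_rep=500, n_max=500, seed=1):
    """Fraction of sequences with at least one rejection,
    and fraction in which every partial sample rejects."""
    rng = random.Random(seed)
    any_count = 0
    all_count = 0
    for _ in range(n_rep):
        path = rejection_path(n_max, rng)
        any_count += any(path)
        all_count += all(path)
    return any_count / n_rep, all_count / n_rep


any_frac, all_frac = simulate()
print("at least one rejection:", any_frac)
print("rejection at every n:  ", all_frac)
```

Under these settings a sizeable share of the null sequences shows some p-value below .05 along the way, while sequences rejecting at every single sample size are essentially never observed.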

Unsurprisingly, the paper relates to the three most recent papers published by *Philosophy of Science*, discussing first and foremost Spanos’ view. When the current author introduces Mayo and Spanos’ severity, i.e. the probability of exceeding the observed test statistic under the alternative, he does not define this test statistic d(X), which makes the whole notion incomprehensible to a reader not already familiar with it. (And even for one familiar with it…)

“Hence, the solution I propose (…) avoids one of [Freeman’s] major disadvantages. I suggest that we should decrease the size of tests to the extent where it makes practically no difference to the power of the test in order to improve the likelihood ratio of a significant result.” (p.11)

One interesting, if again unsurprising, point in the paper is that one reason for the paradox lies in *keeping the significance level constant* as the sample size increases, while it is possible to decrease the significance level *and* to increase the power simultaneously. However, the solution proposed above does not sound rigorous, hence I fail to understand how low the significance level has to be for the method to work (or to stop). I cannot fathom a corresponding algorithmic derivation of the author’s proposal.
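That level and power can move in opposite directions as the sample size grows is easy to check numerically. The sketch below is *not* the author’s proposal (which I cannot turn into an algorithm): it uses a one-sided z-test of θ_{0}=0 against θ_{1}=0.5 with unit variance and the arbitrary rule α_{n}=.05/n, all of which are assumptions of mine chosen for illustration only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def level_and_power(n, theta1=0.5, sigma=1.0, alpha0=0.05):
    """Level and power of the one-sided z-test of theta = 0 against theta = theta1,
    using the hypothetical shrinking level alpha_n = alpha0 / n."""
    alpha_n = alpha0 / n
    z_crit = STD_NORMAL.inv_cdf(1 - alpha_n)
    # Under theta1 the z-statistic is N(sqrt(n) * theta1 / sigma, 1).
    power = 1 - STD_NORMAL.cdf(z_crit - math.sqrt(n) * theta1 / sigma)
    return alpha_n, power


results = [(n,) + level_and_power(n) for n in (10, 50, 100, 400)]
for n, alpha_n, power in results:
    print(f"n={n:4d}  level={alpha_n:.6f}  power={power:.4f}")
```

On this grid of sample sizes the level keeps shrinking while the power keeps growing, which is all the sketch is meant to show; it says nothing about how fast the level *should* decrease.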

“I argue against the intuitive idea that a significant result given by a very powerful test is less convincing than a significant result given by a less powerful test.”

The criticism of the “blind spot” of the Bayesian approach is supported by an example where the data are generated from a distribution other than either of the two tested distributions. It seems reasonable that the Bayesian approach fails to provide a proper answer in this case, even though it illustrates the difficulty with the long-term impact of the prior(s) in the Bayes factor and (in my opinion) the need to move away from this solution within the Bayesian paradigm.

February 22, 2015 at 7:37 pm

Glad to see that someone is discussing my paper. Your criticisms prompt several questions. For example, I wonder why you claim that my discussion of the “blind spot” relies on an example where the observations are not generated by one of the two tested hypotheses? I explicitly construct my computations on the assumption that there are no alternatives to the two hypotheses under consideration.