since no non-constant function of θ allows for a best unbiased estimator.

Looking in particular at the location parameter of a Cauchy distribution, I realised that the Pitman best equivariant estimator is unbiased as well [as it is for all location problems] and hence dominates the (equivariant) maximum likelihood estimator, which is also unbiased in this symmetric case. However, as detailed in a nice paper of Gabriela Freue on this problem, I further discovered that there is no uniformly minimal variance estimator and no uniformly minimal variance unbiased estimator! (And that the Pitman estimator enjoys a closed form expression, as opposed to the maximum likelihood estimator.) This sounds a bit paradoxical but simply means that there exist different unbiased estimators whose variance functions are not ordered and hence not comparable, either between themselves or with the variance of the Pitman estimator.
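To make the comparison concrete, here is a minimal sketch of both estimators for a Cauchy location sample. It does not use the closed form mentioned above: the Pitman estimator is approximated as the ratio of integrals ∫θ L(θ)dθ / ∫L(θ)dθ by a trapezoid rule, and the maximum likelihood estimator by a crude grid search. The function names, grid width and step are illustrative assumptions, not taken from Freue's paper.

```python
import math

def cauchy_log_lik(theta, xs):
    """Log-likelihood of a standard-scale Cauchy location model, up to a constant."""
    return -sum(math.log1p((x - theta) ** 2) for x in xs)

def pitman_and_mle(xs, half_width=50.0, step=0.01):
    """Approximate the Pitman estimator and the MLE on a grid centred at a sample median.

    Assumption: the grid is wide and fine enough for the integrals to be accurate,
    which is reasonable for a handful of observations but is not checked here.
    """
    centre = sorted(xs)[len(xs) // 2]
    m = int(round(2 * half_width / step))
    grid = [centre - half_width + i * step for i in range(m + 1)]
    loglik = [cauchy_log_lik(t, xs) for t in grid]
    top = max(loglik)
    # Likelihood values rescaled by their maximum to avoid underflow.
    weights = [math.exp(l - top) for l in loglik]
    # Trapezoid rule: endpoints count for half.
    weights[0] *= 0.5
    weights[-1] *= 0.5
    pitman = sum(t * w for t, w in zip(grid, weights)) / sum(weights)
    mle = grid[loglik.index(top)]  # grid argmax, accurate to one step
    return pitman, mle

sample = [-1.3, 0.2, 0.9, 4.1, 0.5]
print(pitman_and_mle(sample))
```

On an asymmetric sample like this one the two estimates generally differ, while both shift by c when every observation is shifted by c, which is the equivariance property at stake.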


“We will now resolve Lindley’s paradox in both of the above examples.”

The “resolution” of the paradox consists in stating the well-known consistency of the Bayes factor, i.e., that as the sample size grows to infinity it goes to infinity (almost surely) under the null hypothesis and to zero under the alternative (almost surely again, both statements being for fixed parameters). Hence the discrepancy between a small p-value and a Bayes factor favouring the null occurs “with vanishingly small” probability. (The authors distinguish between Bartlett’s paradox, associated with a prior variance going to infinity [or a prior becoming improper], and Lindley-Jeffreys’ paradox, associated with a sample size going to infinity.)
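Both paradoxes and the consistency argument can be seen in the textbook Normal mean problem, which is an assumption of this sketch and not necessarily one of the paper's examples: x₁,…,xₙ ~ N(μ,1), testing H₀: μ=0 against H₁: μ ~ N(0,τ²). With z = √n x̄, the Bayes factor in favour of the null is B₀₁ = √(1+nτ²) exp{−½ z² nτ²/(1+nτ²)}.

```python
import math

def log_bf01(z, n, tau2=1.0):
    """Log Bayes factor of H0: mu = 0 against H1: mu ~ N(0, tau2).

    Toy model (an assumption of this sketch): x_i ~ N(mu, 1), z = sqrt(n) * xbar.
    """
    shrink = n * tau2 / (1.0 + n * tau2)
    return 0.5 * math.log1p(n * tau2) - 0.5 * z * z * shrink

# Lindley-Jeffreys: z stays at the 5% boundary while n grows.
for n in (10, 1000, 100000, 10**7):
    print("n =", n, "log B01 =", round(log_bf01(1.96, n), 3))

# Bartlett: n is fixed while the prior variance grows.
for tau2 in (1.0, 1e2, 1e4, 1e8):
    print("tau2 =", tau2, "log B01 =", round(log_bf01(1.96, 10, tau2), 3))

# Consistency under a fixed alternative mu = 0.5, where z grows like sqrt(n) * mu.
for n in (10, 1000, 100000):
    print("n =", n, "log B01 =", round(log_bf01(math.sqrt(n) * 0.5, n), 3))
```

The first two loops hold z fixed, which is exactly the event whose probability vanishes under a fixed alternative; the third loop lets z grow with n and shows the Bayes factor collapsing to zero.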

“We construct cake priors using the following ingredients”

The “cake” priors are defined as pseudo-normal distributions, pseudo in the sense that they look like multivariate Normal densities, except for the covariance matrix that also depends on the parameter, as e.g. in the Fisher information matrix. This reminds me of a recent paper of Ronald Gallant in the Journal of Financial Econometrics that I discussed, which shares the same feature, except for a scale factor inversely log-proportional to the dimension of the model. Now, what I find most surprising, besides the lack of parameterisation invariance, is that these priors are not normalised. They do not integrate to one. As to whether or not they integrate at all, the paper keeps silent. This is also a criticism I addressed to Gallant’s paper, getting no satisfactory answer. This is a fundamental shortcoming of the proposed cake priors…

“Hence, the relative rates that g⁰ and g¹ diverge must be considered”

The authors further argue (p.12) that by pushing the scale factors to infinity one produces the answer the Jeffreys prior would have produced. This is not correct since the way the scale factors diverge, relative to one another, drives the numerical value of the limit! Using inverse log-proportionality in the dimension(s) of the model(s) is a correct solution, from a mathematical perspective. But only from a mathematical perspective.
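The dependence of the limit on the relative rates can be checked in a toy comparison that is purely hypothetical and not taken from the paper: model M₀ takes xᵢ ~ N(θ,1) with θ ~ N(0,g₀), model M₁ takes xᵢ ~ N(θ,4) with θ ~ N(0,g₁), and both marginal likelihoods are available in closed form. Each log-marginal behaves like a constant minus ½ log g for large g, so the log Bayes factor converges to a limit that is shifted by ½ log(g₁/g₀).

```python
import math

def log_marginal(xs, sigma2, g):
    """Exact log marginal likelihood for x_i ~ N(theta, sigma2), theta ~ N(0, g)."""
    n = len(xs)
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    v = g + sigma2 / n  # marginal variance of xbar
    return (-0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)
            + 0.5 * math.log(2 * math.pi * sigma2 / n)
            - 0.5 * math.log(2 * math.pi * v) - xbar ** 2 / (2 * v))

def log_bf01(xs, g0, g1):
    """Log Bayes factor of M0 (sigma2 = 1) against M1 (sigma2 = 4); toy models only."""
    return log_marginal(xs, 1.0, g0) - log_marginal(xs, 4.0, g1)

data = [0.3, -1.2, 0.8, 1.9, -0.4, 0.1]  # made-up observations for illustration
for g in (1e2, 1e4, 1e8):
    same_rate = log_bf01(data, g, g)
    faster_g1 = log_bf01(data, g, 100 * g)
    print("g =", g, "g1 = g0:", round(same_rate, 4), "g1 = 100 g0:", round(faster_g1, 4))
```

Both columns stabilise as g grows, but at values that differ by ½ log 100, so "the" limit is only defined once the ratio g₁/g₀ has been chosen.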

“…comparing the LRT and Bayesian tests…”

Since the log-Bayes factor is the log-likelihood ratio modulo the ν log(n) BIC correction, it is not very surprising that both approaches reach close answers when the scale goes to infinity and the sample size n as well. In the end, there seems to be no reason for going that path other than making likelihood ratio and Bayes factor asymptotically coincide, which does not sound like a useful goal to me. (Neither does recovering BIC in the linear model.)
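As a quick numerical check of this relation, again in the toy Normal mean test (xᵢ ~ N(μ,1), H₀: μ=0 against μ ~ N(0,τ²), an assumption of this sketch), twice the log Bayes factor in favour of the alternative can be compared with the likelihood ratio statistic z² minus the ν log(n) penalty, with ν=1. The choice τ²=1 is what makes the two agree without an additional constant; other values of τ² add an O(1) offset.

```python
import math

def two_log_bf10(z, n, tau2=1.0):
    """Twice the log Bayes factor of H1: mu ~ N(0, tau2) against H0: mu = 0."""
    shrink = n * tau2 / (1.0 + n * tau2)
    return z * z * shrink - math.log1p(n * tau2)

def lrt_minus_penalty(z, n, nu=1):
    """Likelihood ratio statistic z^2 minus the BIC-type penalty nu * log(n)."""
    return z * z - nu * math.log(n)

for n in (10, 1000, 10**6):
    exact = two_log_bf10(2.5, n)
    approx = lrt_minus_penalty(2.5, n)
    print("n =", n, "2 log B10 =", round(exact, 4), "LRT - log n =", round(approx, 4))
```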

“No papers in the model selection literature, to our knowledge, chose different constants for each model under consideration.”

In conclusion, the paper sets up a principled or universal way to choose “cake” priors fighting Lindley-Jeffreys’ paradox, but the choices made therein remain arbitrary. They allow for a particular limit to be found when the scale parameter(s) go to infinity, but the limit depends on the connection created between the models, which should not share parameters if one is to be chosen. (The discussion of using improper priors and arbitrary constants is aborted, resorting to custom arguments such as the above.) The paper thus unfortunately does not resolve Lindley-Jeffreys’ paradox and the vexing issue of improper priors unfit for testing.

For instance, the log-likelihood of an exponential sample at the true value of the parameter τ is not asymptotically Gaussian but almost surely infinite in the limit, while twice the log of the (Wilks) likelihood ratio at the true value of τ is truly (if asymptotically) a χ² variable with one degree of freedom. That it is not a Gaussian is deemed a “paradox” by the author, explained by a cancellation of first order terms… Same thing again for the common Gaussian mean problem!
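A small simulation illustrates the contrast, assuming τ is the rate of the exponential distribution (the parameterisation is not specified above) and with arbitrary choices of τ, n, number of replications and seed. The log-likelihood at the true value is n log τ − τ Σxᵢ, which drifts linearly with n, while the Wilks statistic 2n{τx̄ − 1 − log(τx̄)} stays on the χ²₁ scale.

```python
import math
import random

def simulate(tau=2.0, n=200, reps=4000, seed=1):
    """Simulate exponential samples of rate tau (assumed parameterisation).

    Returns the log-likelihoods at the true tau and the Wilks statistics.
    """
    rng = random.Random(seed)
    logliks, wilks = [], []
    for _ in range(reps):
        total = sum(rng.expovariate(tau) for _ in range(n))
        logliks.append(n * math.log(tau) - tau * total)
        u = tau * total / n  # tau * xbar, equal to 1 at the MLE
        wilks.append(2.0 * n * (u - 1.0 - math.log(u)))
    return logliks, wilks

logliks, wilks = simulate()
mean_loglik = sum(logliks) / len(logliks)
mean_wilks = sum(wilks) / len(wilks)
# 3.841 is the 95% quantile of a chi-square with one degree of freedom.
exceed = sum(w > 3.841 for w in wilks) / len(wilks)
print("mean log-likelihood at true tau:", round(mean_loglik, 2))
print("mean Wilks statistic:", round(mean_wilks, 3))
print("share above 3.841:", round(exceed, 3))
```

With these settings the average log-likelihood sits near n(log τ − 1), far from zero, whereas the Wilks statistic has a mean close to one and exceeds the 95% χ²₁ quantile about 5% of the time.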
