## posterior likelihood ratio is back

Posted in Statistics, University life on June 10, 2014 by xi'an

“The PLR turns out to be a natural Bayesian measure of evidence of the studied hypotheses.”

Isabelle Smith and André Ferrari just arXived a paper on the posterior distribution of the likelihood ratio. This is in line with Murray Aitkin’s notion of considering the likelihood ratio

$f(x|\theta_0) / f(x|\theta)$

as a prior quantity, when contemplating the null hypothesis that θ is equal to θ0. (Also advanced by Alan Birnbaum and Arthur Dempster.) A concept we criticised (rather strongly) in our Statistics and Risk Modelling paper with Andrew Gelman and Judith Rousseau.  The arguments found in the current paper in defence of the posterior likelihood ratio are quite similar to Aitkin’s:

• defined for (some) improper priors;
• invariant under observation or parameter transforms;
• more informative than the posterior mean of the likelihood ratio, not-so-incidentally equal to the Bayes factor;
• avoiding using the posterior mean for an asymmetric posterior distribution;
• achieving some degree of reconciliation between Bayesian and frequentist perspectives, e.g. by being equal to some p-values;
• easily computed by MCMC means (if need be).
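
The last of those points is easy to illustrate. Below is a minimal R sketch (my own toy construction, not an example from the paper, with arbitrary numerical values) for a single observation x ~ N(θ,1) under a proper N(0,τ²) prior, where the conjugate posterior allows direct simulation of the likelihood ratio and a check that its posterior mean coincides with the Bayes factor:

```r
# toy illustration (mine, not the paper's): posterior distribution of the
# likelihood ratio f(x|theta0)/f(x|theta) for x ~ N(theta,1), theta ~ N(0,tau2)
set.seed(1)
x <- 1.7; theta0 <- 0; tau2 <- .5
# conjugate posterior: theta|x ~ N(tau2*x/(1+tau2), tau2/(1+tau2))
theta <- rnorm(10^6, tau2*x/(1+tau2), sqrt(tau2/(1+tau2)))
lr <- dnorm(x, theta0)/dnorm(x, theta)  # posterior sample of the likelihood ratio
plr <- mean(lr > 1)                     # PLR: P( LR > 1 | x )
# posterior mean of the LR equals the Bayes factor f(x|theta0)/m(x)
bf <- dnorm(x, 0, 1)/dnorm(x, 0, sqrt(1 + tau2))
c(plr, mean(lr), bf)
```

With these arbitrary values the PLR comes out around 0.16, while the Monte Carlo mean of the ratio matches the closed-form Bayes factor to within simulation error.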

One generalisation found in the paper handles the case of composite versus composite hypotheses, of the form

$\int\int \mathbb{I}\left( p(x|\theta_1) < p(x|\theta_2)\right)\,\pi_1(\theta_1|x)\,\pi_2(\theta_2|x)\,\text{d}\theta_1\,\text{d}\theta_2$

which brings back an earlier criticism I raised (in Edinburgh, at ICMS, where, as one of those coincidences, I read this paper!), namely that using the product of the marginals rather than the joint posterior is no more a standard Bayesian practice than using the data in a prior quantity, and that it leads to multiple uses of the data. Hence, having already delivered my perspective on this approach in the past, I do not feel the urge to “raise the flag” once again about a paper that is otherwise well-documented and mathematically rich.
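
To make the criticism concrete, here is a small simulation (again my own sketch, not the paper's) of the composite-versus-composite PLR for x ~ N(θ,1), testing H1: θ ≤ 0 against H2: θ > 0, where θ1 and θ2 are drawn independently from the two restricted posteriors, i.e. from the product of the marginals:

```r
# hypothetical sketch: composite vs composite PLR as the probability that
# p(x|theta1) < p(x|theta2), with independent draws from the two posteriors
set.seed(42)
x <- 0.8                           # single observation, x ~ N(theta, 1)
post <- rnorm(10^5, mean = x)      # flat-prior posterior draws, theta|x ~ N(x,1)
th1 <- sample(post[post <= 0], 10^4, replace = TRUE)  # posterior under H1: theta <= 0
th2 <- sample(post[post > 0], 10^4, replace = TRUE)   # posterior under H2: theta > 0
# note that x enters both the posteriors and the compared likelihoods,
# hence the multiple uses of the data
plr <- mean(dnorm(x, th1) < dnorm(x, th2))
plr  # well above 1/2: H2 is much better supported by x = 0.8
```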

## Bayesian variable selection redux

Posted in Statistics, University life on July 11, 2011 by xi'an

After a rather long interlude, and just in time for the six-month deadline!, we (Gilles Celeux, Mohammed El Anbari, Jean-Michel Marin and myself) have resubmitted (and re-arXived) our comparative study of Bayesian and non-Bayesian variable selection procedures to Bayesian Analysis. Why it took us so long is a combination of good and bad reasons: besides being far apart, between Morocco, Paris and Montpellier, and running too many projects at once with Jean-Michel (including the Bayesian Core revision, which has not moved much since last summer!), we came to realise that my earlier strong stance that invariance on the intercept did not matter was wrong, and that the (kind) reviewers were correct about the asymptotic impact of the scale of the intercept on the variable selection. So we first had to reconvene and think about it, before running another large round of simulations. We hope the picture is now clearer.

## Thesis defense in València

Posted in Statistics, Travel, University life, Wines on February 25, 2011 by xi'an

On Monday, I took part in the jury of the PhD thesis of Anabel Forte Deltel, in the department of statistics of the Universitat de València. The topic of the thesis was variable selection in Gaussian linear models using an objective Bayes approach. Completely on my own research agenda! I had already discussed with Anabel in Zürich, where she gave a poster and handed me a copy of her thesis, so I could concentrate on the fundamentals of her approach during the defense. Her approach extends Liang et al.'s (2008, JASA) hyper-g prior into a complete analysis of the conditions set by Jeffreys in his book for constructing such priors. She is therefore able to motivate a precise value for most hyperparameters (although some choices were mainly based on computational reasons, opposing the 2F1 and Appell's F1 hypergeometric functions). She also defends the use of an improper prior by an invariance argument that leads to the standard Jeffreys prior on location-scale. (This is where I prefer the approach in Bayesian Core, which does not discriminate between a subset of the covariates including the intercept and the other covariates, even though it is not invariant under location-scale transforms.) After the defense, Jim Berger pointed out to me that the modelling allowed for the subset to be empty, which would then cancel my above objection! In conclusion, this thesis could well set a reference prior (if not in José Bernardo's sense of the term!) for Bayesian linear model analysis in the coming years.

## Back from Philly

Posted in R, Statistics, Travel, University life on December 21, 2010 by xi'an

## Versions of Benford’s Law

Posted in Books, Statistics on May 20, 2010 by xi'an

A new arXived note by Berger and Hill discusses how [my favourite probability introduction] Feller's Introduction to Probability Theory (volume 2) gets Benford's Law “wrong”. While my interest in Benford's Law is rather superficial, I find the paper of interest as it shows a confusion between different folk theorems! My interpretation of Benford's Law is that the first significant digit of a random variable (in a base 10 representation) is distributed as

$f(i) \propto \log_{10}(1+\frac{1}{i})$

and not that $\log_{10}(X) \,(\text{mod}\,1)$ is uniform, which is the presentation given in the arXived note… The former is also the interpretation of William Feller (page 63, Introduction to Probability Theory), contrary to what the arXived note seems to imply on page 2, although Feller indeed mentioned, as an informal/heuristic argument in favour of Benford's Law, that when the spread of the random variable X is large, $\log_{10}(X)$ is approximately uniformly distributed. (I would not call this a “fundamental flaw”.) The arXived note is then right in pointing out the lack of foundation for Feller's heuristic, even if it muddles the issue by defining several non-equivalent versions of Benford's Law. It is also funny that this arXived note picks at the scale-invariant characterisation of Benford's Law when Terry Tao's entry represents it as a special case of Haar measure!
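
For the record, the two versions connect in a few lines of R (my own check, not taken from the note): the first-digit probabilities sum to one by telescoping, and they are recovered exactly when $\log_{10}(X) \,(\text{mod}\,1)$ is uniform:

```r
# Benford probabilities for first digits 1..9
f <- log10(1 + 1/(1:9))
sum(f)                     # telescopes to log10(10) = 1
# if log10(X) mod 1 is uniform, the first digit of X is trunc(10^U), U ~ U(0,1)
set.seed(1)
u <- runif(10^6)
emp <- tabulate(trunc(10^u), 9)/10^6
max(abs(emp - f))          # small: the uniform version implies the digit version
```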

## More on Benford’s Law

Posted in Statistics on July 10, 2009 by xi'an

In connection with an earlier post on Benford's Law, i.e. the fact that the probability that the first digit of a random variable $X$ is $k$, $1\le k\le 9$, is approximately $\log_{10}\{(k+1)/k\}$ (you can easily check that the sum of those probabilities is 1), I want to signal a recent entry on Terry Tao's impressive blog. Terry points out that Benford's Law is the Haar measure in that setting, but he also highlights a very peculiar absorbing property, which is that, if $X$ follows Benford's Law, then $XY$ also follows Benford's Law for any random variable $Y$ that is independent of $X$… Now, the funny thing is that, if you take a normal sample $x_1,\ldots,x_n$ and check whether or not Benford's Law applies to this sample, it does not. But if you take a second normal sample $y_1,\ldots,y_n$ and consider the product sample $x_1\times y_1,\ldots,x_n\times y_n$, then Benford's Law applies almost exactly. If you repeat the process one more time, it is difficult to spot the difference. Here is the [rudimentary; there must be a more elegant way to get the first significant digit!] R code to check this:

```r
# simulate a million absolute normal variates and compare the empirical
# frequencies of their first significant digits with Benford's Law
x=abs(rnorm(10^6))
d=trunc(x/10^floor(log10(x)))  # first significant digit, in 1..9
plot(hist(d,breaks=(0:9)+.5,plot=FALSE)$density,log10((2:10)/(1:9)),
  xlab="Frequency",ylab="Benford's Law",pch=19,col="steelblue")
abline(a=0,b=1,col="tomato",lwd=2)
x=abs(rnorm(10^6)*x)           # multiply by an independent normal sample
d=trunc(x/10^floor(log10(x)))
points(hist(d,breaks=(0:9)+.5,plot=FALSE)$density,log10((2:10)/(1:9)),
  pch=19,col="steelblue2")
x=abs(rnorm(10^6)*x)           # and once more
d=trunc(x/10^floor(log10(x)))
points(hist(d,breaks=(0:9)+.5,plot=FALSE)$density,log10((2:10)/(1:9)),
  pch=19,col="steelblue3")
```

Even better, if you change rnorm to another generator like rcauchy or rexp at any of the three stages, the same pattern occurs.
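
As for the bracketed aside about extracting the first significant digit, one arguably tidier option (my own suggestion, not from the original post) wraps the exponent bookkeeping in a single function:

```r
# first significant digit of (non-zero) x, via its decimal exponent
first_digit <- function(x) {
  x <- abs(x)
  trunc(x / 10^floor(log10(x)))
}
first_digit(c(0.00321, 7, 519, 0.25))  # 3 7 5 2
```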