## Reference prior for logistic regression

**G**elman et al. just published a paper in the *Annals of Applied Statistics* on the selection of a prior on the parameters of a logistic regression. The idea is to scale the prior in terms of the impact of a “typical” change in a covariate onto the probability function, which is reasonable as long as there is enough independence between those covariates. The covariates are first rescaled to all have the same expected range, which amounts, to my mind, to a kind of empirical Bayes estimation of the scales in an unnormalised problem. The parameters are then associated with independent Cauchy (or t) priors, whose scale *s* is chosen as 2.5 in order to make the ±5 logistic range the extremal value. The perspective is well-motivated within the paper, and supported in addition by the availability of an R package called **bayesglm**.
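To make the construction concrete, here is a minimal Python sketch of the two ingredients described above: rescaling the covariates to a common range, then putting independent Cauchy priors with scale 2.5 on the coefficients. It is not the bayesglm code. The function names are hypothetical, and the specific conventions (non-binary inputs scaled to standard deviation 0.5, binary inputs only centred, a wider Cauchy scale of 10 on the intercept) are assumptions about the rescaling rather than details stated in this post.

```python
import math
from statistics import mean, pstdev


def rescale(column):
    """Centre a covariate and put it on a common scale.

    Assumption: non-binary inputs get standard deviation 0.5, while
    binary (or constant) inputs are only centred.
    """
    m = mean(column)
    if len(set(column)) <= 2:
        return [x - m for x in column]
    s = pstdev(column)
    return [0.5 * (x - m) / s for x in column]


def log_cauchy(beta, scale=2.5):
    """Log density of a Cauchy distribution centred at 0."""
    return -math.log(math.pi * scale) - math.log1p((beta / scale) ** 2)


def log_posterior(beta0, betas, columns, y, scale=2.5, intercept_scale=10.0):
    """Unnormalised log posterior: logistic likelihood plus independent Cauchy priors.

    `columns` is a list of (already rescaled) covariate columns and `y`
    a list of 0/1 responses. The intercept scale of 10 is an assumption.
    """
    lp = log_cauchy(beta0, intercept_scale)
    lp += sum(log_cauchy(b, scale) for b in betas)
    for i, yi in enumerate(y):
        eta = beta0 + sum(b * col[i] for b, col in zip(betas, columns))
        # log p(y_i | eta) = y_i * eta - log(1 + exp(eta)), computed stably
        lp += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return lp


# The same heights in inches and in metres give the same rescaled covariate,
# so a fixed prior scale of 2.5 means the same thing in either unit.
heights_in = [60.0, 64.0, 66.0, 70.0, 72.0, 75.0]
heights_m = [h * 0.0254 for h in heights_in]
z_in, z_m = rescale(heights_in), rescale(heights_m)
```

The rescaling is what gives the fixed scale of 2.5 a meaning: without it, the same prior would be tight or diffuse depending on the unit in which a covariate happens to be recorded.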

**T**his being said, I would have liked to see a comparison of **bayesglm** with the generalised g-prior perspective we develop in *Bayesian Core* rather than with the flat prior, which is not the correct Jeffreys’ prior and which anyway does not always lead to a proper prior. In fact, the independent prior seems too rudimentary in the case of many (inevitably correlated) covariates, with the scale of 2.5 being then too large even when brought back to a reasonable change in the covariate. On the other hand, starting with a g-like-prior on the parameters and using a non-informative prior on the factor *g* allows for both a natural data-based scaling and an accounting of the dependence between the covariates. This non-informative prior on *g* then amounts to a generalised t prior on the parameter, once *g* is integrated. Anyone interested in the comparison can use the functions provided here on the webpage of *Bayesian Core*. (The paper already includes a comparison with Jeffreys’ prior implemented as **brglm** and the BBR algorithm of Genkin et al. (2007).) In the revision of *Bayesian Core*, we will most likely draw this comparison.
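The claim that integrating out *g* produces a t-type prior can be checked on a simplified case. The sketch below replaces the non-informative prior on *g* with a proper inverse-gamma prior, which is an assumption made only to keep the integral finite. The symbols $k$ (number of coefficients), $\nu$ (degrees of freedom) and $X$ (design matrix) are likewise introduced here for illustration.

```latex
% Assumed hierarchy (illustrative stand-in for the non-informative prior on g):
%   beta | g, X ~ N_k(0, g (X^T X)^{-1}),   g ~ InvGamma(nu/2, nu/2)
\begin{align*}
\pi(\beta \mid X)
  &= \int_0^\infty \mathcal{N}_k\!\left(\beta \,\middle|\, 0,\; g\,(X^{\mathsf T}X)^{-1}\right)\pi(g)\,\mathrm{d}g \\
  &\propto |X^{\mathsf T}X|^{1/2}\int_0^\infty g^{-(\nu+k)/2-1}
     \exp\!\left\{-\frac{\nu+\beta^{\mathsf T}X^{\mathsf T}X\beta}{2g}\right\}\mathrm{d}g \\
  &\propto |X^{\mathsf T}X|^{1/2}
     \left(1+\frac{\beta^{\mathsf T}X^{\mathsf T}X\beta}{\nu}\right)^{-(\nu+k)/2}.
\end{align*}
```

This is a multivariate t density with $\nu$ degrees of freedom and scale matrix $(X^{\mathsf T}X)^{-1}$. Unlike the product of independent Cauchy priors, it depends on $\beta$ only through $\beta^{\mathsf T}X^{\mathsf T}X\beta$, so both the scales of the covariates and their correlations enter the prior through the data.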

January 25, 2009 at 1:05 pm

[…] In connection with the discussion about reference priors for logistic regression posted two weeks ago, Aleks Jakulin pointed out the possibility to embed the slides for Bayesian Core that […]

January 24, 2009 at 11:05 am

In our paper we took care of scaling in a very radical way: all continuous variables were discretized and only took values of 0 or 1.

I agree that the problem of scaling as well as the problem of inter-predictor correlations are important, and I’m looking forward to seeing how this is handled in Bayesian Core. A PDF of the relevant chapter sent via email would be helpful, as I’ll forget about the problem by the time I actually put my hands on the book.

While the models are all fine, the challenge is to *implement* the model in a robust and efficient fashion so that it would survive the brutal testing on the corpus. I’ll try to put the code and data out there so that others can make their code sufficiently robust.

January 23, 2009 at 9:32 pm

Interesting idea. I agree that it makes sense to use a hierarchical model for the coefficients so that they are scaled relative to each other.

Regarding the pre-scaling that we do: I think something of this sort is necessary in order to be able to incorporate prior information. For example, if you are regressing earnings on height, it makes a difference if height is in inches, feet, meters, kilometers, etc. (Although any scale is ok if you take logs first.) I agree that the pre-scaling can be thought of as an approximation to a more formal hierarchical model of the scaling. Aleks and I discussed this when working on the bayesglm project, but it wasn’t clear how to easily implement such scaling. It’s possible that the t-family prior can be interpreted as some sort of mixture with a normal prior on the scaling.