## Thesis defense in València

Posted in Statistics, Travel, University life, Wines with tags , , , , , , on February 25, 2011 by xi'an

On Monday, I took part in the jury of the PhD thesis of Anabel Forte Deltel, in the department of statistics of the Universitat de València. The topic of the thesis was variable selection in Gaussian linear models using an objective Bayes approach. Completely on my own research agenda! I had already discussed with Anabel in Zürich, where she gave a poster and gave me a copy of her thesis, so could concentrate on the fundamentals of her approach during the defense. Her approach extends Liang et al. (2008, JASA) hyper-g prior in a complete analysis of the conditions set by Jeffreys in his book for constructing such priors. She is therefore able to motivate a precise value for most hyperparameters (although some choices were mainly based on computational reasons opposing 2F1 with Appell’s F1 hypergeometric functions). She also defends the use of an improper prior by an invariance argument that leads to the standard Jeffreys’ prior on location-scale. (This is where I prefer the approach in Bayesian Core that does not discriminate between a subset of the covariates including the intercept and the other covariates. Even though it is not invariant by location-scale transforms.) After the defence, Jim Berger pointed out to me that the modelling allowed for the subset to be empty, which would then cancel my above objection! In conclusion, this thesis could well set a reference prior (if not in José Bernardo’s sense of the term!) for Bayesian linear model analysis in the coming years.

## Bayes vs. SAS

Posted in Books, R, Statistics with tags , , , , , , , , , , , , , , , , , , on May 7, 2010 by xi'an

Glancing perchance at the back of my Amstat News, I was intrigued by the SAS advertisement

Bayesian Methods

• Specify Bayesian analysis for ANOVA, logistic regression, Poisson regression, accelerated failure time models and Cox regression through the GENMOD, LIFEREG and PHREG procedures.
• Analyze a wider variety of models with the MCMC procedure, a general purpose Bayesian analysis procedure.

and so decided to take a look at those items on the SAS website. (Some entries date back to 2006 so I am not claiming novelty in this post, just my reading through the manual!)

Even though I have not looked at a SAS program since the time in 1984 I was learning principal component and discriminant analysis by programming SAS procedures on punched cards, it seems the MCMC part is rather manageable (if you can manage SAS at all!), looking very much like a second BUGS to my bystander eyes, even to the point of including ARS algorithms! The models are defined in a BUGS manner, with priors on the side (and this includes improper priors, despite a confusing first example that mixes very large variances with vague priors for the linear model!). The basic scheme is a random walk proposal with adaptive scale or covariance matrix. (The adaptivity on the covariance matrix is slightly confusing in that the way it is described it does not seem to implement the requirements of Roberts and Rosenthal for sure convergence.) Gibbs sampling is not directly covered, although some examples are in essence using Gibbs samplers. Convergence is assessed via ca. 1995 methods à la Cowles and Carlin, including the rather unreliable Raftery and Lewis indicator, but so does Introducing Monte Carlo Methods with R, which takes advantage of the R coda package. I have not tested (!) any of the features in the MCMC procedure but judging from a quick skim through the 283 page manual everything looks reasonable enough. I wonder if anyone has ever tested a SAS program against its BUGS counterpart for efficiency comparison.

The Bayesian aspects are rather traditional as well, except for the testing issue. Indeed, from what I have read, SAS does not engage into testing and remains within estimation bounds, offering only HPD regions for variable selection without producing a genuine Bayesian model choice tool. I understand the issues with handling improper priors versus computing Bayes factors, as well as some delicate computational requirements, but this is a truly important chunk missing from the package. (Of course, the package contains a DIC (Deviance information criterion) capability, which may be seen as a substitute, but I have reservations about the relevance of DIC outside generalised linear models. Same difficulty with the posterior predictive.) As usual with SAS, the documentation is huge (I still remember the shelves after shelves of documentation volumes in my 1984 card-punching room!) and full of options and examples. Nothing to complain about. Except maybe the list of disadvantages in using Bayesian analysis:

• It does not tell you how to select a prior. There is no correct way to choose a prior. Bayesian inferences require skills to translate prior beliefs into a mathematically formulated prior. If you do not proceed with caution, you can generate misleading results.
• It can produce posterior distributions that are heavily influenced by the priors. From a practical point of view, it might sometimes be difficult to convince subject matter experts who do not agree with the validity of the chosen prior.
• It often comes with a high computational cost, especially in models with a large number of parameters.

which does not say much… Since the MCMC procedure allows for any degree of hierarchical modelling, it is always possible to check the impact of a given prior by letting its parameters go random. I found that most practitioners are happy with the formalisation of their prior beliefs into mathematical densities, rather than adamant about a specific prior. As for computation, this is not a major issue.

$\pi(\theta,\zeta) = \pi_1(\theta) \pi_2(\zeta)$
does not imply that $\pi_2$ is the marginal prior on $\zeta$ when $\pi_1$ is improper. The interpretation of $\pi_2$ as such is what leads to the “paradox” but there is no mathematical difficulty in the issue. Starting with the joint improper prior $\pi(\theta,\zeta)$ leads to an undefined posterior if we only consider the part of the observations that depends on $\zeta$ because $\theta$ does not integrate out. Defining improper priors as limits of proper priors—as Jaynes and Timothy Wallstrom do—can also be attempted from a mathematical point of view, but (a) I do not think a global resolution is possible this way in that all Bayesian procedures for the improper prior cannot be constructed as limits from the corresponding Bayesian procedures for the proper prior sequence, think eg about testing, and (b) this is trying to give a probabilistic meaning to the improper priors and thus gets back to the over-interpretation danger mentioned above. Hence a very nice poster discussion!