Archive for prior projection

projective covariate selection

Posted in Mountains, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , , on October 28, 2014 by xi'an

While I was in Warwick, Dan Simpson [newly arrived from Norway on a postdoc position] mentioned to me he had attended a talk by Aki Vehtari in Norway where my early work with Jérôme Dupuis on projective priors was used. He gave me the link to this paper by Peltola, Havulinna, Salomaa and Vehtari that indeed refers to the idea that a prior on a given Euclidean space defines priors by projections on all subspaces, despite the zero measure of all those subspaces. (This notion first appeared in a joint paper with my friend Costas Goutis, who alas died in a diving accident a few months later.) The projection further allowed for a simple expression of the Kullback-Leibler deviance between the corresponding models and for a Pythagorean theorem on the additivity of the deviances between embedded models. The weakest spot of this approach of ours was, in my opinion and unsurprisingly, about deciding when a submodel was too far from the full model. The lack of explanatory power introduced therein had no absolute scale and later discussions led me to think that the bound should depend on the sample size to ensure consistency. (The recent paper by Nott and Leng that was expanding on this projection has now appeared in CSDA.)

“Specifically, the models with subsets of covariates are found by maximizing the similarity of their predictions to this reference as proposed by Dupuis and Robert [12]. Notably, this approach does not require specifying priors for the submodels and one can instead focus on building a good reference model. Dupuis and Robert (2003) suggest choosing the size of the covariate subset based on an acceptable loss of explanatory power compared to the reference model. We examine using cross-validation based estimates of predictive performance as an alternative.” T. Peltola et al.

The paper also connects with the Bayesian Lasso literature, concluding on the horseshoe prior being more informative than the Laplace prior. It applies the selection approach to identify biomarkers with predictive performances in a study of diabetic patients. The authors rank model according to their (log) predictive density at the observed data, using cross-validation to avoid exploiting the data twice. On the MCMC front, the paper implements the NUTS version of HMC with STAN.

Computing evidence

Posted in Books, R, Statistics with tags , , , , , , , , , , on November 29, 2010 by xi'an

The book Random effects and latent variable model selection, edited by David Dunson in 2008 as a Springer Lecture Note. contains several chapters dealing with evidence approximation in mixed effect models. (Incidentally, I would be interested in the story behind the  Lecture Note as I found no explanation in the backcover or in the preface. Some chapters but not all refer to a SAMSI workshop on model uncertainty…) The final chapter written by Joyee Ghosh and David Dunson (similar to a corresponding paper in JCGS) contains in particular the interesting identity that the Bayes factor opposing model h to model h-1 can be unbiasedly approximated by (the average of the terms)

\dfrac{f(x|\theta_{i,h},\mathfrak{M}=h-1)}{f(x|\theta_{i,h},\mathfrak{M}=h)}

when

  • \mathfrak{M} is the model index,
  • the \theta_{i,h}‘s are simulated from the posterior under model h,
  • the model \mathfrak{M}=h-1 only considers the h-1 first components of \theta_{i,h},
  • the prior under model h-1 is the projection of the prior under model h. (Note that this marginalisation is not the projection used in Bayesian Core.)

Continue reading

Follow

Get every new post delivered to your Inbox.

Join 700 other followers