comparison of Bayesian predictive methods for model selection
“Dupuis and Robert (2003) proposed choosing the simplest model with enough explanatory power, for example 90%, but did not discuss the effect of this threshold for the predictive performance of the selected models. We note that, in general, the relative explanatory power is an unreliable indicator of the predictive performance of the submodel,”
Juho Piironen and Aki Vehtari arXived a survey on Bayesian model selection methods that is a sequel to the extensive survey of Vehtari and Ojanen (2012). Because most of the methods described in this survey stem from Kullback-Leibler proximity calculations, it includes some description of our posterior projection method with Costas Goutis and Jérôme Dupuis. We indeed did not consider prediction in our papers and even failed to include consistency result, as I was pointed out by my discussant in a model choice meeting in Cagliari, in … 1999! Still, I remain fond of the notion of defining a prior on the embedding model and of deducing priors on the parameters of the submodels by Kullback-Leibler projections. It obviously relies on the notion that the embedding model is “true” and that the submodels are only approximations. In the simulation experiments included in this survey, the projection method “performs best in terms of the predictive ability” (p.15) and “is much less vulnerable to the selection induced bias” (p.16).
Reading the other parts of the survey, I also came to the perspective that model averaging makes much more sense than model choice in predictive terms. Sounds obvious stated that way but it took me a while to come to this conclusion. Now, with our mixture representation, model averaging also comes as a natural consequence of the modelling, a point presumably not stressed enough in the current version of the paper. On the other hand, the MAP model now strikes me as artificial and linked to a very rudimentary loss function. A loss that does not account for the final purpose(s) of the model. And does not connect to the “all models are wrong” theorem.