all models are wrong

“Using ABC to evaluate competing models has various hazards and comes with recommended precautions (Robert et al. 2011), and unsurprisingly, many if not most researchers have a healthy scepticism as these tools continue to mature.”

Michael Hickerson just published an open-access letter with the above title in Molecular Ecology. (As in several earlier papers, incl. the (in)famous ones by Templeton, Hickerson confuses running an ABC algorithm with conducting Bayesian model comparison, but this is not the main point of this post.)

“Rather than using ABC with weighted model averaging to obtain the three corresponding posterior model probabilities while allowing for the handful of model parameters (θ, τ, γ, Μ) to be estimated under each model conditioned on each model’s posterior probability, these three models are sliced up into 143 ‘submodels’ according to various parameter ranges.”

The letter is in fact a supporting argument for the earlier paper of Pelletier and Carstens (2014, Molecular Ecology), which conducted the above splitting experiment. I could not read this paper, so I cannot judge the relevance of splitting the parameter range this way. From what I understand, it amounts to using mutually exclusive priors with different supports.

“Specifically, they demonstrate that as greater numbers of the 143 sub-models are evaluated, the inference from their ABC model choice procedure becomes increasingly.”

An interestingly cut sentence. Increasingly unreliable? mediocre? weak?

“…with greater numbers of models being compared, the most probable models are assigned diminishing levels of posterior probability. This is an expected result…”

True, if the number of models under consideration increases, under a uniform prior over model indices, the posterior probability of a given model mechanically decreases. But the pairwise Bayes factors should not be impacted by the number of models under comparison and the letter by Hickerson states that Pelletier and Carstens found the opposite:

“…pairwise Bayes factor[s] will always be more conservative except in cases when the posterior probabilities are equal for all models that are less probable than the most probable model.”

This means that the “Bayes factor” in this study is computed as the ratio of a marginal likelihood to a compound (or super-marginal) likelihood, averaged over all models and hence incorporating the prior probabilities of the model indices as well. I had never encountered such a proposal before. Contrary to the letter’s claim:

“…using the Bayes factor, incorporating all models is perhaps more consistent with the Bayesian approach of incorporating all uncertainty associated with the ABC model choice procedure.”

Besides the needless inclusion of ABC, this is a somewhat confusing sentence, as Bayes factors are not, stricto sensu, Bayesian procedures, since they remove the prior probabilities from the picture.
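To make the distinction concrete, here is a minimal sketch in Python. The marginal likelihoods are made-up numbers, and `compound_bf` is only my reading of the letter's quantity (the best model against the prior-weighted mixture of all other models), not something taken from Pelletier and Carstens. It shows that the posterior probability of the top model shrinks as models are added under a uniform prior, that the pairwise Bayes factor does not move, and that the compound version depends on both the model set and the prior weights.

```python
def posterior_probs(marginals, priors):
    """Posterior model probabilities from marginal likelihoods and prior weights."""
    weighted = [p * m for p, m in zip(priors, marginals)]
    total = sum(weighted)
    return [w / total for w in weighted]


def pairwise_bf(marginals, i, j):
    """Standard Bayes factor of model i against model j: no priors involved."""
    return marginals[i] / marginals[j]


def compound_bf(marginals, priors, i):
    """Hypothetical 'compound' Bayes factor (an assumption about the letter's
    quantity): model i against the prior-weighted mixture of the other models."""
    rest = [(p, m) for k, (p, m) in enumerate(zip(priors, marginals)) if k != i]
    rest_prior = sum(p for p, _ in rest)
    rest_marginal = sum(p * m for p, m in rest) / rest_prior
    return marginals[i] / rest_marginal


# Made-up marginal likelihoods, sorted from most to least supported model.
marginals = [4.0, 2.0, 1.0, 1.0, 0.5, 0.5]

for k in range(2, len(marginals) + 1):
    subset = marginals[:k]
    uniform = [1.0 / k] * k  # uniform prior over the k model indices
    print(
        k,
        round(posterior_probs(subset, uniform)[0], 3),  # shrinks as k grows
        pairwise_bf(subset, 0, 1),                      # unchanged by k
        round(compound_bf(subset, uniform, 0), 3),      # changes with k
    )
```

In this toy setting the pairwise Bayes factor against the runner-up never exceeds the compound one, and the two agree when all the other models have the same marginal likelihood, which matches the exception quoted from the letter. Changing `priors` away from uniform changes `compound_bf` but leaves `pairwise_bf` untouched.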

“Although the outcome of model comparison with ABC or other similar likelihood-based methods will always be dependent on the composition of the model set, and parameter estimates will only be as good as the models that are used, model-based inference provides a number of benefits.”

All models are wrong, but the very fact that they are models allows for producing pseudo-data from those models and for checking whether the pseudo-data is similar enough to the observed data, in the components that matter the most for the experimenter. Hence a loss function of sorts…
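A toy version of this idea, as a sketch only: the Normal model, the uniform prior on its mean, the choice of summaries, the weights and the tolerance below are all assumptions of mine, picked for illustration. The weighted distance between summaries plays the role of the loss function, with the weights encoding which components matter most.

```python
import random
import statistics


def simulate(mu, n, rng):
    """Pseudo-data from the assumed model: n draws from Normal(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def summary(data):
    """Summaries the (hypothetical) experimenter cares about: mean and spread."""
    return (statistics.mean(data), statistics.pstdev(data))


def distance(s_obs, s_sim, weights=(1.0, 1.0)):
    """Weighted L1 distance between summaries: a loss function of sorts."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, s_obs, s_sim))


def abc_rejection(observed, n_draws, tolerance, rng, weights=(1.0, 1.0)):
    """Keep the prior draws whose pseudo-data lands close enough to the data."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # assumed prior on the mean
        pseudo = simulate(mu, len(observed), rng)
        if distance(s_obs, summary(pseudo), weights) <= tolerance:
            accepted.append(mu)
    return accepted


rng = random.Random(0)
observed = simulate(1.5, 50, rng)  # stand-in for real data
accepted = abc_rejection(observed, n_draws=5000, tolerance=0.3, rng=rng)
print(len(accepted), statistics.mean(accepted))
```

If no parameter value produces pseudo-data within the tolerance, that is itself informative about the model's adequacy on the chosen summaries.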

