Lack of confidence [revised]
Following the comments on our earlier submission to PNAS, we have written (and re-arXived) a revised version in which we try to spell out more clearly the distinction between ABC point (and confidence) estimation and ABC model choice, namely that the problem lies at another level for Bayesian model choice (using posterior probabilities). When doing point estimation with insufficient summary statistics, the information content is poorer but, unless one uses very degraded summary statistics, inference still converges. We completely agree with the reviewers that the resulting posterior distribution differs from the true posterior in this case but, at least, gathering more observations brings more information about the parameter (and convergence when the number of observations goes to infinity). For model choice, this is not guaranteed if we use summary statistics that are not inter-model sufficient, as shown by the Poisson and normal examples. Furthermore, except for very specific cases such as Gibbs random fields, it is almost always impossible to derive inter-model sufficient statistics beyond the raw sample. This is why we consider that there is a fundamental difference between point estimation and model choice.
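To make the model choice issue concrete, here is a minimal Python sketch of ABC model choice based on a statistic that is sufficient within each model but not across models. Everything specific in it is an assumption made for illustration rather than taken from the paper: the Poisson versus geometric pairing, the Exp(1) prior on the Poisson mean, the uniform prior on the geometric probability, equal prior model weights, the sample size n = 5, and all function names.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def geometric(p, rng):
    """Draw one geometric variate on {0, 1, ...} with success probability p."""
    if p >= 1.0:
        return 0
    u = 1.0 - rng.random()  # u lies in (0, 1], so log(u) is finite
    return int(math.log(u) / math.log1p(-p))


def abc_prob_poisson(s_obs, n, n_sims, rng):
    """ABC estimate of P(Poisson | S = s_obs), accepting exact matches of the sum."""
    accepted = [0, 0]
    for _ in range(n_sims):
        if rng.random() < 0.5:  # equal prior weights on both models (assumption)
            lam = rng.expovariate(1.0)  # assumed prior: lambda ~ Exp(1)
            s, model = sum(poisson(lam, rng) for _ in range(n)), 0
        else:
            p = 1.0 - rng.random()  # assumed prior: p ~ U(0, 1)
            s, model = sum(geometric(p, rng) for _ in range(n)), 1
        if s == s_obs:
            accepted[model] += 1
    total = sum(accepted)
    return accepted[0] / total if total else float("nan")


def summary_prob_poisson(s, n):
    """Exact P(Poisson | S = s): the quantity the ABC estimate above targets."""
    log_m1 = s * math.log(n) - (s + 1) * math.log(n + 1)
    log_m2 = math.log(n) - math.log(n + s) - math.log(n + s + 1)
    return 1.0 / (1.0 + math.exp(log_m2 - log_m1))


def true_prob_poisson(x):
    """Exact P(Poisson | x) computed from the full sample x."""
    n, s = len(x), sum(x)
    log_m1 = (math.lgamma(s + 1) - (s + 1) * math.log(n + 1)
              - sum(math.lgamma(xi + 1) for xi in x))
    log_m2 = math.lgamma(n + 1) + math.lgamma(s + 1) - math.lgamma(n + s + 2)
    return 1.0 / (1.0 + math.exp(log_m2 - log_m1))


if __name__ == "__main__":
    rng = random.Random(1)
    print("ABC estimate    :", abc_prob_poisson(4, 5, 20000, rng))
    print("summary-based   :", summary_prob_poisson(4, 5))
    print("true, 1,1,1,1,0 :", true_prob_poisson([1, 1, 1, 1, 0]))
    print("true, 0,0,0,0,4 :", true_prob_poisson([0, 0, 0, 0, 4]))
```

Under these assumptions, the two samples (1,1,1,1,0) and (0,0,0,0,4) share the sum S = 4, so any ABC run based on the sum targets the same posterior probability of the Poisson model, about 0.59, whereas the posterior probabilities computed from the full samples are about 0.80 and 0.14 respectively. Increasing the number of simulations only brings the ABC output closer to the summary-based value, not to the true one.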
Following the request from a referee, we also ran a more extensive simulation experiment comparing two scenarios with 3 populations, 100 diploid individuals per population, and 50 loci/markers. However, the results are somewhat less conclusive: since we use 50 loci, the data is much more informative about the model, and therefore both the importance sampling and the ABC approximations give a posterior probability that is close to one, so both support the true model. Because both approximations are very close to one, it is difficult to assess the worth of the ABC approximation per se, i.e. in numerical terms. (The fact that the statistical conclusion is the same for both approaches is of course satisfying from an inferential perspective, but it is an altogether separate issue from our argument about the possible lack of convergence of the ABC Bayes factor approximation to the true Bayes factor.) Furthermore, this experiment may be beyond what is manageable or reasonable, in the sense that the importance sampling approximation cannot be taken for granted, nor can it be checked empirically. Indeed, with 50 markers and 100 individuals, the product likelihood suffers from an enormous variability that 100,000 particles and 100 trees per locus have trouble addressing (despite a huge computing cost of more than 12 days on a powerful cluster).
!Package natbib Error: Bibliography not compatible with author-year citations. Press <return> to continue in numerical citation style. See the natbib package documentation for explanation.
but it vanishes with the options
which is an easy fix.