Formally, the MLE or the Bayes [posterior expectation] estimator would be the “best” summary statistic, were it available. This is actually one of the core ideas of the paper: Paul Fearnhead and Dennis Prangle use an ABC proxy to the genuine Bayes estimator as a new summary statistic for a second ABC round.

Ahh, yes. You’re right. I made a mistake in my thinking. Thanks for your answer. I thought I could contribute to the discussion by noting that data cloning might be used as an alternative approach to obtain summary statistics. You brought me !

The SAME algorithm (also known as prior feedback, data cloning, MCMC maximum likelihood, multiple imputation Metropolis EM, etc.) can be used to represent the MLE as the limit of a sequence of Bayes estimators against replicas of the data. Now, computing those Bayes estimators without the likelihood function is a bit like getting oneself in a fine pickle, isn’t it?! I simply do not see how you can implement the first step of the suggestion: computing a Bayes estimator for a k-replicate of the original data. If this is feasible, the whole Bayesian analysis of the model is feasible and the ABC shortcut is then superfluous. Please tell me which part I am missing.
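
To make the limit concrete, here is a minimal sketch in a toy conjugate model where the likelihood is available in closed form, which is exactly the situation where ABC would not be needed. The model (normal observations with known variance, normal prior on the mean), the function name `cloned_posterior_mean`, and all numerical values are assumptions chosen for illustration; they do not come from the paper or the discussion.

```python
import numpy as np

def cloned_posterior_mean(y, k, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Posterior mean of theta when the data set y is replicated k times.

    Assumed toy model: y_i ~ N(theta, noise_var), theta ~ N(prior_mean, prior_var).
    Replicating the data k times multiplies the data precision by k.
    """
    y = np.asarray(y, dtype=float)
    prior_prec = 1.0 / prior_var
    data_prec = k * y.size / noise_var
    return (prior_prec * prior_mean + data_prec * y.mean()) / (prior_prec + data_prec)

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=20)
mle = y.mean()  # MLE of theta in this model is the sample mean

# The Bayes estimator drifts from the prior-influenced value towards the MLE
# as the number of clones k grows.
for k in (1, 10, 100, 1000):
    print(k, cloned_posterior_mean(y, k), mle)
```

Every step above uses the explicit likelihood through the conjugate update, which illustrates the objection: without the likelihood, the Bayes estimator for the k-replicated data is no easier to compute than the original one.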

You are right that some of our results rely on quite strong assumptions. I’ll try to address your points by explaining the motivation of our approach. The starting point is our Lemma 1, which shows that (asymptotically) the Monte Carlo error increases with the number of summary statistics. So we investigate using one summary statistic for each (non-nuisance) parameter, as using any fewer intuitively leads to identifiability problems. The choice of parameter predictors as the summary statistics is now more a practical choice than a theory-driven one: we can construct approximate predictors by our semi-automatic method. Our Theorem 2 shows that this is justifiable in a particular framework, namely point prediction with quadratic loss. Other loss functions lead to different optimal summary statistics which are not so easy to approximate. Noisy ABC is a method to augment the point predictions with meaningful credible intervals. Our Theorem 1 guarantees that these have some sensible coverage properties with respect to the *original* model and data.

Finally, the criticism that we assume the parameters of interest (and the parameterisation) in advance is fair. We think this covers many situations of interest and that a more exploratory setting with a large number of unknown parameters is a harder problem due to the large number of summary statistics required.
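
The idea of using an approximate parameter predictor as the single summary statistic can be sketched as follows. This is a simplified illustration and not the authors' implementation: the toy simulator, the uniform prior, the choice of features, the quantile-based tolerance, and all function names are assumptions made for the example, and the pilot ABC run used in the paper to restrict the training region is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs = 30  # size of each simulated data set (assumed)

def simulate(theta, rng):
    """Hypothetical toy simulator: n_obs draws from N(theta, 1) per parameter value."""
    theta = np.atleast_1d(theta)
    return theta[:, None] + rng.standard_normal((theta.size, n_obs))

def sample_prior(size, rng):
    """Assumed prior: theta ~ Uniform(-5, 5)."""
    return rng.uniform(-5.0, 5.0, size=size)

def features(y):
    """Candidate features f(y): intercept, sorted observations and their squares."""
    ys = np.sort(y, axis=1)
    return np.hstack([np.ones((y.shape[0], 1)), ys, ys**2])

# Step 1: simulate (theta, y) pairs and regress theta on f(y) by least squares.
theta_train = sample_prior(5000, rng)
y_train = simulate(theta_train, rng)
beta, *_ = np.linalg.lstsq(features(y_train), theta_train, rcond=None)

def summary(y):
    """One summary statistic per parameter: the fitted predictor of theta."""
    return features(y) @ beta

# Step 2: plain rejection ABC using the fitted predictor as the summary.
y_obs = simulate(1.5, rng)            # stand-in for the observed data
s_obs = summary(y_obs)[0]
theta_prop = sample_prior(20000, rng)
s_prop = summary(simulate(theta_prop, rng))
dist = np.abs(s_prop - s_obs)
eps = np.quantile(dist, 0.01)         # keep roughly the closest 1% of proposals
accepted = theta_prop[dist <= eps]

print("ABC posterior mean:", accepted.mean())
print("sample mean of observed data:", y_obs.mean())
```

In this toy model the sample mean is sufficient, so the fitted predictor should end up close to it; the interest of the construction lies in models where no such low-dimensional statistic is known.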

**Corey:** I do not think the paper brings a solution in terms of the selection of statistics, because one of the results in the paper is that the optimal summary statistics are the estimators of the parameters of interest. This is somewhat tautological in my opinion, because you first have to determine which parameters (or functions of them) you are interested in.