Archive for parametric bootstrap

adjustment of bias and coverage for confidence intervals

Posted in Statistics on October 18, 2012 by xi'an

Menéndez, Fan, Garthwaite, and Sisson—whom I heard in Adelaide on that subject—posted yesterday a paper on arXiv about correcting the frequentist coverage of default intervals toward their nominal level. Given such an interval [L(x),U(x)], the correction for proper frequentist coverage is done by parametric bootstrap, i.e. by simulating n replicas of the original sample from the plug-in density f(.|θ*) and deriving the empirical cdfs of L(y)-θ* and U(y)-θ*. Under the assumption of consistency of the estimate θ*, this ensures convergence (in the original sample size) of the corrected bounds.
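To fix ideas, here is a minimal sketch of the calibration idea in a toy Gaussian location model with known unit variance; the model, the default Wald interval, and the function name `calibrated_interval` are my own illustrative choices, not the authors' setting:

```python
import numpy as np

rng = np.random.default_rng(1)

def calibrated_interval(x, alpha=0.05, B=2000):
    """Parametric-bootstrap coverage calibration, sketched for a
    Gaussian location model with known unit variance (illustrative)."""
    n = len(x)
    theta_star = x.mean()                        # plug-in estimate
    # simulate B replicas of the original sample from f(.|theta*)
    est = rng.normal(theta_star, 1.0, size=(B, n)).mean(axis=1)
    # default Wald interval at cut-off z: theta* +/- z/sqrt(n);
    # pick the z whose bootstrap coverage of theta* reaches 1 - alpha
    zs = np.linspace(1.0, 4.0, 300)
    cover = [np.mean(np.abs(est - theta_star) * np.sqrt(n) <= z) for z in zs]
    z_adj = zs[np.searchsorted(cover, 1 - alpha)]
    return theta_star - z_adj / np.sqrt(n), theta_star + z_adj / np.sqrt(n)
```

In this toy case the recalibrated cut-off should land near the exact 1.96, but the same loop applies to any default interval whose bounds can be recomputed on simulated replicas.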

Since ABC is based on the idea that pseudo data can be simulated from f(.|θ) for any value of θ, the concept “naturally” applies to ABC outcomes, as illustrated in the paper by a g-and-k noise MA(1) model. (As noted by the authors, there is always some uncertainty about the consistency of the ABC estimator.) However, there are a few caveats:

  • ABC usually aims at approximating the posterior distribution (given the summary statistics), of which the credible intervals are an inherent constituent. Hence, attempts at recovering a frequentist coverage seem at odds with the original purpose of the method. Obviously, if ABC is instead seen as an inference method per se, like indirect inference, this objection does not hold.
  • Then, once the (umbilical) link with Bayesian inference is partly severed, there is no particular reason to stick to credible sets for [L(x),U(x)]. A more standard parametric bootstrap approach, based on the bootstrap distribution of θ*, should work as well. This means that a comparison with other frequentist methods like indirect inference could be relevant.
  • Lastly, and as also noted by the authors, the method may prove extremely expensive. If the bounds L(x) and U(x) are obtained empirically from an ABC sample, a new ABC computation must be associated with each one of the n replicas of the original sample. It would be interesting to compare the actual coverages of this ABC-corrected method with a more direct parametric bootstrap approach.
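To illustrate the cost caveat, here is a toy rejection-ABC sampler in a Gaussian location model; the prior, the summary statistic (the sample mean), the tolerance, and the name `abc_interval` are all hypothetical choices made for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

def abc_interval(x, alpha=0.05, N=20000, eps=0.1):
    """Rejection-ABC credible interval in a toy Gaussian location model
    (prior, summary statistic, and tolerance are illustrative choices)."""
    n = len(x)
    theta = rng.normal(0.0, 10.0, size=N)                  # prior draws
    pseudo = rng.normal(theta[:, None], 1.0, size=(N, n))  # pseudo data
    keep = np.abs(pseudo.mean(axis=1) - x.mean()) < eps    # summary = mean
    post = theta[keep]
    return np.quantile(post, alpha / 2), np.quantile(post, 1 - alpha / 2)

# the cost caveat in action: correcting [L(x), U(x)] requires a fresh
# ABC computation for every bootstrap replica of the original sample
x = rng.normal(0.0, 1.0, size=50)
L, U = abc_interval(x)
replicas = rng.normal(x.mean(), 1.0, size=(20, 50))
bounds = [abc_interval(y) for y in replicas]               # 20 extra ABC runs
```

Even in this one-dimensional toy, the correction multiplies the ABC budget by the number of replicas, which is the expense issue raised above.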

Bayesian inference and the parametric bootstrap

Posted in R, Statistics, University life on December 16, 2011 by xi'an

This paper by Brad Efron came to my attention when I was looking for references on Bayesian bootstrap to answer a Cross Validated question. After reading it more thoroughly, “Bayesian inference and the parametric bootstrap” puzzles me, which most certainly means I have missed the main point. Indeed, the paper relies on parametric bootstrap—a frequentist approximation technique mostly based on simulation from a plug-in distribution and a robust inferential method estimating distributions from empirical cdfs—to assess (frequentist) coverage properties of Bayesian posteriors. The manuscript mixes a parametric bootstrap simulation output for posterior inference—even though bootstrap produces simulations of estimators while the posterior distribution operates on the parameter space, those estimator simulations can nonetheless be recycled as parameter simulations by a genuine importance sampling argument—and the coverage properties of Jeffreys posteriors vs. the BCa [which stands for bias-corrected and accelerated, see Efron 1987] confidence density—which truly take place in different spaces. Efron however connects both spaces by taking advantage of the importance sampling connection and defines a corrected BCa prior to make the confidence intervals match. In my opinion, however, this does not define a prior in the Bayesian sense, since the correction seems to depend on the data. And I see no strong incentive to match the frequentist coverage, since this would furthermore define a new prior for each component of the parameter. This study of the frequentist properties of Bayesian credible intervals reminded me of the recent discussion paper by Don Fraser on the topic, which follows the same argument that Bayesian credible regions are not necessarily good frequentist confidence intervals.
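The reweighting idea can be sketched in a conjugate Gaussian toy model where every density is available in closed form; the prior, the sample size, and the variable names below are my own illustrative choices, not Efron's examples:

```python
import numpy as np

rng = np.random.default_rng(4)

# toy data and plug-in estimate
x = rng.normal(1.0, 1.0, size=30)
n, theta_hat = len(x), x.mean()

# parametric bootstrap: estimator replications theta*_b = mean(y_b),
# with y_b simulated from f(.|theta_hat); their density g is N(theta_hat, 1/n)
B = 20000
boot = rng.normal(theta_hat, 1.0, size=(B, n)).mean(axis=1)

def log_lik(th):      # log f(x|theta), up to a theta-free constant
    return -0.5 * n * (theta_hat - th) ** 2

def log_prior(th):    # N(0, 5^2) prior, up to a constant
    return -0.5 * th ** 2 / 25.0

def log_g(th):        # bootstrap (importance) density N(theta_hat, 1/n)
    return -0.5 * n * (th - theta_hat) ** 2

# importance weights recycle estimator simulations as parameter simulations;
# in this toy case log_lik and log_g cancel exactly, so the weights reduce
# to the prior evaluated at the bootstrap replications
logw = log_prior(boot) + log_lik(boot) - log_g(boot)
w = np.exp(logw - logw.max())
w /= w.sum()
post_mean = np.sum(w * boot)      # importance-sampling posterior mean
```

In richer models the importance density keeps likelihood-like tails, which is precisely where the infinite-variance worry voiced in the conclusion points below arises.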

The conclusion of the paper is made of several points, some of which may not be strongly supported by the previous analysis:

  1. “The parametric bootstrap distribution is a favorable starting point for importance sampling computation of Bayes posterior distributions.” [I am not so certain about this point given that the bootstrap is based on a plug-in estimate, hence fails to account for the variability of this estimate, and may thus induce infinite variance behaviour, as in the harmonic mean estimator of Newton and Raftery (1994). Because the tails of the importance density are those of the likelihood, the heavier tails of the posterior induced by the convolution with the prior distribution are likely to lead to this fatal misbehaviour of the importance sampling estimator.]
  2. “This computation is implemented by reweighting the bootstrap replications rather than by drawing observations directly from the posterior distribution as with MCMC.” [Computing the importance ratio requires the availability both of the likelihood function and of the likelihood estimator, which means a setting where Bayesian computations are not particularly hindered and do not necessarily call for advanced MCMC schemes.]
  3. “The necessary weights are easily computed in exponential families for any prior, but are particularly simple starting from Jeffreys invariant prior, in which case they depend only on the deviance difference.” [Still from a computational perspective, the ease of computing the importance weights is mirrored by the ease of handling the posterior distributions.]
  4. “The deviance difference depends asymptotically on the skewness of the family, having a cubic normal form.” [No relevant comment.]
  5. “In our examples, Jeffreys prior yielded posterior distributions not much different than the unweighted bootstrap distribution. This may be unsatisfactory for single parameters of interest in multi-parameter families.” [The frequentist coverage properties of Jeffreys priors have already been examined in the past and found lacking in multidimensional settings; this is, however, an assessment made from a frequentist perspective, and the use of Jeffreys priors is not justified on this particular ground.]
  6. “Better uninformative priors, such as the Welch and Peers family or reference priors, are closely related to the frequentist BCa reweighting formula.” [The paper only finds proximities in two examples and does not assess this relation in wider generality. Again, this is not particularly relevant from a Bayesian viewpoint.]
  7. “Because of the i.i.d. nature of bootstrap resampling, simple formulas exist for the accuracy of posterior computations as a function of the number B of bootstrap replications. Even with excessive choices of B, computation time was measured in seconds for our examples.” [This is not very surprising. It however assesses Bayesian procedures from a frequentist viewpoint, so this may be lost on both Bayesian and frequentist users…]
  8. “An efficient second-level bootstrap algorithm (“bootstrap-after-bootstrap”) provides estimates for the frequentist accuracy of Bayesian inferences.” [This is completely correct and why bootstrap is such an appealing technique for frequentist inference. I spent the past two weeks teaching non-parametric bootstrap to my R class and the students are now fluent with the concept, even though they are unsure about the meaning of estimation and testing!]
  9. “This can be important in assessing inferences based on formulaic priors, such as those of Jeffreys, rather than on genuine prior experience.” [Again, this is neither very surprising nor particularly appealing to Bayesian users.]
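The second-level idea of point 8 can be sketched in a conjugate Gaussian toy model (known unit variance, N(0, 5²) prior; all choices illustrative), using the closed-form posterior mean as a stand-in for a reweighted-bootstrap estimate:

```python
import numpy as np

rng = np.random.default_rng(5)

def posterior_mean(x, prior_var=25.0):
    """Conjugate posterior mean for a Gaussian location parameter with
    known unit variance and a N(0, prior_var) prior (illustrative model)."""
    n = len(x)
    return x.mean() * n / (n + 1.0 / prior_var)

# bootstrap-after-bootstrap: resample data from the plug-in density and
# track how the Bayesian point estimate varies across replicas, giving a
# frequentist accuracy assessment of a Bayesian inference
x = rng.normal(1.0, 1.0, size=30)
reps = rng.normal(x.mean(), 1.0, size=(500, 30))
freq_sd = np.array([posterior_mean(y) for y in reps]).std()
```

The second level simply wraps the first in another simulation loop, which is why the frequentist accuracy of a Bayesian estimate comes almost for free once the bootstrap machinery is in place.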

In conclusion, I found the paper quite thought-provoking and stimulating, definitely opening new vistas in a very elegant way. I however remain unconvinced by the simulation aspects from a purely Monte Carlo perspective.