Our main target is high-dimensional problems where it is indeed worth the computation to estimate the gradient (à la Spall — not by coordinate-wise finite differences, but with Spall's 2SPSA algorithm) and to use stochastic-gradient versions of HMC (where, incidentally, we do not need to compute the Hessian). The goal is to get an indication of the direction of the gradient in high dimensions with a small number of simulations.
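To illustrate why the simultaneous-perturbation idea is attractive here (a minimal sketch of the basic SPSA estimator, not the exact procedure from the paper; the quadratic test function and step size are our choices):

```python
import numpy as np

def spsa_gradient(f, theta, c=0.1, rng=None):
    """Two-evaluation simultaneous-perturbation gradient estimate.

    All coordinates are perturbed at once by a random +/-1 vector,
    so the cost is two evaluations of f regardless of dimension --
    this is the "small number of simulations" property we care about.
    """
    rng = np.random.default_rng() if rng is None else rng
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher perturbation
    diff = f(theta + c * delta) - f(theta - c * delta)
    return diff / (2.0 * c * delta)  # elementwise: one estimate per coordinate

# Toy check: for f(theta) = sum(theta^2) the true gradient is 2*theta,
# and averaging many SPSA estimates recovers it.
theta = np.array([1.0, -2.0, 3.0])
g = spsa_gradient(lambda t: np.sum(t ** 2), theta, c=1e-3,
                  rng=np.random.default_rng(0))
```

A single estimate is noisy (each coordinate picks up cross-terms from the others), but it is unbiased, and in high dimensions two simulations per estimate is far cheaper than the 2d simulations of coordinate-wise finite differences.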

We agree that epsilon should be estimated appropriately for each problem, perhaps as part of the Markov chain. For the purpose of comparing k-eps and SL gradient estimates, we chose to keep it fixed at a small fraction of the simulator noise at the MAP theta, hence 0.37. At this value the posterior approximation is good for ABC-MCMC and the chain mixes well. The point of this exercise was to show that gradients from k-eps have large variance at reasonable epsilon, even as we add more simulations. This matters for Hamiltonian ABC, since large gradient variance forces us to turn down the step size, reducing the benefit of using HMC in the first place. We agree that a kernel density model with epsilon as the bandwidth may give the desired unbiased, low-variance gradient estimates.
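To make the k-eps quantity concrete (an illustrative sketch, not the paper's code: the Gaussian kernel, the toy simulator, and the simulation count S are our assumptions), the epsilon-kernel log-likelihood estimate whose gradients we are discussing is roughly:

```python
import numpy as np

def kernel_eps_loglik(simulate, theta, y_obs, eps, S, rng):
    """Monte Carlo estimate of the epsilon-kernel ABC log-likelihood:
    log (1/S) sum_s N(y_obs; x_s, eps^2), with x_s ~ simulator(theta).
    """
    x = np.array([simulate(theta, rng) for _ in range(S)])
    log_k = -0.5 * ((y_obs - x) / eps) ** 2 - np.log(eps * np.sqrt(2 * np.pi))
    m = log_k.max()  # log-sum-exp over the S simulations, for stability
    return m + np.log(np.mean(np.exp(log_k - m)))

# Hypothetical simulator: Gaussian with mean theta and unit noise.
simulate = lambda theta, rng: theta + rng.standard_normal()
rng = np.random.default_rng(0)
ll = kernel_eps_loglik(simulate, theta=0.5, y_obs=0.0, eps=0.37, S=100, rng=rng)
```

Differencing this estimate across nearby theta values to get a gradient inherits the Monte Carlo noise of the kernel average, and that noise grows as eps shrinks relative to the simulator noise, which is the variance problem described above.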

As for the random seeds, common, persistent, or otherwise, we agree that this requires more clarification in the paper, and also more investigation to determine the limitations, if any, of controlling the random generator of a black-box simulator. We do not know for certain whether *all* random generators can be represented as deterministic functions of a fixed number of random uniforms. You are probably right to think of it as a random number of random uniforms.
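The representation we have in mind is something like the following sketch (the simulator itself and its two-uniform input are hypothetical); whether every black-box simulator admits such a form with u of *fixed* length is exactly the open question:

```python
import numpy as np

def simulate(theta, u):
    """Deterministic simulator: all randomness enters through the
    pre-drawn uniforms u (here turned Gaussian via Box-Muller)."""
    z = np.sqrt(-2.0 * np.log(u[0])) * np.cos(2.0 * np.pi * u[1])
    return theta + z  # Gaussian draw centered at theta

rng = np.random.default_rng(0)
u = rng.uniform(size=2)      # freeze the randomness once
x1 = simulate(0.5, u)
x2 = simulate(0.6, u)        # same u, different theta: common random numbers
```

With u held fixed, `simulate` is a smooth deterministic function of theta, which is what makes common-random-number gradient estimates behave well. A sequential simulator that consumes a data-dependent number of draws would instead need a variable-length u, i.e. the "random number of random uniforms" case.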

Thanks again for posting and apologies to your printer!
