## evaluating stochastic algorithms

**R**einaldo sent me this email a long while ago:

> Could you recommend me a nice reference about measures to evaluate stochastic algorithms (in particular focus in approximating posterior distributions).

and I hope he is still reading the ‘Og, despite my lack of prompt reply! I procrastinated and procrastinated in answering this question as I did not have a ready reply… We have indeed seen (almost suffered from!) a flow of MCMC convergence diagnostics in the 90’s. And then it dried out. Maybe because of the impossibility of being “really” sure, unless running one’s MCMC much longer than “necessary to reach” stationarity and convergence. The heat of the dispute between the “single chain school” of Geyer (1992, Statistical Science) and the “multiple chain school” of Gelman and Rubin (1992, Statistical Science) has long since evaporated. My feeling is that people (still) run their MCMC samplers several times and check for coherence between the outcomes. Possibly using different kernels on parallel threads. At best, but rarely, they run (one or another form of) tempering to identify the modal zones of the target. And instances where non-trivial control variates are available are fairly rare. Hence, a *non-sequitur* reply at the MCMC level, as there is no automated tool available, in my opinion. (Even though I did not check the latest versions of BUGS.)
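To make the "run several chains and check for coherence" habit concrete, here is a minimal sketch of the potential scale reduction factor in the spirit of Gelman and Rubin (1992). Everything in it is an assumption made for illustration: the toy standard normal target, the random-walk Metropolis sampler, the function names `rw_metropolis` and `gelman_rubin`, and the basic (non-split, non-rank-normalised) version of the statistic. A value close to 1 only says the chains agree with one another, not that they have found the whole target.

```python
import numpy as np

def rw_metropolis(log_target, x0, n_iter, scale, rng):
    """Toy random-walk Metropolis chain (illustration only)."""
    chain = np.empty(n_iter)
    x, lp = x0, log_target(x0)
    for t in range(n_iter):
        prop = x + scale * rng.normal()
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop) / target(x))
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain[t] = x
    return chain

def gelman_rubin(chains):
    """Basic potential scale reduction factor for an (m, n) array of m chains."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()       # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)    # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n    # pooled variance estimate
    return np.sqrt(var_plus / within)

rng = np.random.default_rng(0)
log_target = lambda x: -0.5 * x**2                   # standard normal, up to a constant
starts = [-10.0, -3.0, 3.0, 10.0]                    # deliberately dispersed starting points
chains = np.array([rw_metropolis(log_target, x0, 5000, 2.4, rng) for x0 in starts])

rhat = gelman_rubin(chains[:, 2500:])                # discard the first half as burn-in
print(f"R-hat = {rhat:.3f}")
```

On a multimodal target where every chain gets stuck in the same mode, this statistic would happily sit near 1, which is the whole point about not being “really” sure.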

**A**s it happened, Didier Chauveau from Orléans gave a talk today at Big’MC on convergence assessment based on entropy estimation, a joint work with Pierre Vandekerkhove. He mentioned SamplerCompare, an R package that appeared in 2010. Soon to come is their own EntropyMCMC package, using parallel simulation. And k-nearest neighbour estimation.

**I**f I re-interpret the question as focussed on ABC algorithms, it gets both more delicate and easier. Easier because each ABC distribution is different, so there is no reason to look at the unreachable original target. More delicate because there are several parameters to calibrate (tolerance, choice of summary statistics, …) on top of the number of MCMC simulations. In DIYABC, the outcome is always made of the superposition of several runs to check for stability (or lack thereof). But this tells us nothing about the distance to the true original target. The obvious answer is to use some basic bootstrapping, but it is impractical as it is generally much too costly.
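As an illustration of checking stability across repeated runs, here is a toy ABC rejection sketch. It is not what DIYABC does internally; the model (normal data with unit variance), the N(0, 10²) prior, the sample mean as summary, the two tolerances, and the name `abc_rejection` are all assumptions chosen to keep the example short.

```python
import numpy as np

def abc_rejection(y_obs, n_sim, eps, rng):
    """Toy ABC rejection sampler: N(theta, 1) data, N(0, 10^2) prior,
    sample mean as summary statistic, absolute distance with tolerance eps."""
    n = len(y_obs)
    s_obs = y_obs.mean()
    theta = rng.normal(0.0, 10.0, size=n_sim)               # draws from the prior
    y_sim = rng.normal(theta[:, None], 1.0, size=(n_sim, n))  # one pseudo-dataset per draw
    s_sim = y_sim.mean(axis=1)
    return theta[np.abs(s_sim - s_obs) <= eps]               # keep the close-enough draws

rng = np.random.default_rng(1)
y_obs = rng.normal(2.0, 1.0, size=50)                        # "observed" data, true mean 2

runs = {}
for eps in (1.0, 0.1):
    # Five independent runs per tolerance, summarised by their ABC posterior means
    runs[eps] = [abc_rejection(y_obs, 20_000, eps, rng).mean() for _ in range(5)]
    spread = np.ptp(runs[eps])
    print(f"eps={eps}: means={np.round(runs[eps], 3)}, range={spread:.3f}")
```

The spread between runs measures Monte Carlo variability for a given tolerance and summary. It says nothing about how far the ABC posterior sits from the true posterior, which is exactly the limitation noted above.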

February 20, 2014 at 11:41 am

Perhaps more studies should be developed, for example, when we are approximating a sequence of target distributions (e.g. SMC and its variants) using a fixed number of weighted samples (too costly in many cases). It is not clear to me that using only ordinary MSE is enough, as has appeared in many articles. Professor Gneiting wrote a very interesting collection of papers about prediction and combining predictive distributions. In this spirit (but outside the forecasting context), new approaches to evaluate stochastic algorithms, taking into account computational complexity, performance, and more, could be explored further…

Reinaldo Marques

February 20, 2014 at 12:23 am

Along a different thread, there’s the stuff that Chris Holmes talked about at MCMSki (I’m not sure if there’s a paper yet). He was focussing on mis-specified models, but in some sense the inference scheme is part of the “model”, so his methods give a way to explore the effects of perturbations in the computed posterior to a particular decision.

February 20, 2014 at 9:09 am

Thanks, Dan: I like this perspective as it replaces the (wrong, always wrong) model with the uses (usages?) one wants to make of the (wrong, always wrong) model.