precision in MCMC

[Figures: presisio21, presisio22]

While browsing Images des Mathématiques, I came across this article [in French] that studies the impact of round-off errors on number representations in a dynamical system, and I checked how much this was the case for MCMC algorithms like the slice sampler (recycling some R code from Monte Carlo Statistical Methods), simply by adding a few signif(…,dig=n) calls in the original R code and letting the precision n vary.
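For readers who want to try this kind of experiment, here is a minimal sketch in Python rather than in the original R (which is not reproduced in this post). Everything specific in it is an assumption made for illustration: the target is taken to be a N(-3,1) density truncated to [0,1], the helper `signif` is a Python stand-in for R's `signif`, and the rounding is applied to both the auxiliary uniform and the new state, which may not be where the original code rounded.

```python
import math
import random


def signif(x, n):
    """Round x to n significant digits (Python stand-in for R's signif)."""
    if x == 0.0 or not math.isfinite(x):
        return x
    return float(f"{x:.{n - 1}e}")


def rounded_slice_sampler(n_iter, digits, mu=-3.0, seed=1):
    """Slice sampler for N(mu, 1) truncated to [0, 1], with rounded draws.

    digits=None keeps full double precision; otherwise every simulated
    value is rounded to `digits` significant digits (assumed placement).
    """
    rng = random.Random(seed)
    rnd = (lambda v: v) if digits is None else (lambda v: signif(v, digits))
    x = 0.5
    chain = []
    for _ in range(n_iter):
        # Vertical step: u | x ~ Uniform(0, f(x)), f(x) = exp(-(x - mu)^2 / 2).
        # 1 - random() lies in (0, 1], so u is never exactly zero.
        u = rnd((1.0 - rng.random()) * math.exp(-0.5 * (x - mu) ** 2))
        # Horizontal step: the slice {f >= u} within [0, 1] is [0, upper].
        # Rounding u upwards can push the bound below 0, hence the clamp.
        upper = min(1.0, max(0.0, mu + math.sqrt(-2.0 * math.log(u))))
        x = rnd(upper * rng.random())
        chain.append(x)
    return chain


if __name__ == "__main__":
    for digits in (2, 3, None):
        chain = rounded_slice_sampler(10_000, digits)
        print(digits, sum(chain) / len(chain), len(set(chain)))
```

The last column printed is the number of distinct values visited by the chain, which is one crude way of seeing how coarse the low-precision versions are; histograms of the three chains would be the more natural comparison.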

[Figures: presisio31, presisio32]

“…if one simulates trajectories over very long time intervals, too long relative to the chosen numerical precision, then quite often the simulation results will be completely different from what happens in reality…” Pierre-Antoine Guihéneuf

Rather unsurprisingly (!), using a low enough precision (like two digits on the first row) has a visible impact on the simulation of a truncated normal. Moving to three digits seems to be sufficient in this example… One thing this tiny experiment reminds me of is the lumpability property of Kemeny and Snell, a restriction on Markov chains under which aggregated (or discretised) versions remain ergodic or even Markov. Also, in 2000, Laird Breyer, Gareth Roberts and Jeff Rosenthal wrote a Statistics & Probability Letters paper on the impact of round-off errors on geometric ergodicity. However, I presume [maybe foolishly!] that the result stated in the original paper, namely that there exist infinitely many numbers of precision digits for which the dynamical system degenerates into a small region of the space, does not hold for MCMC. Maybe foolishly so, because the above statement means that running a dynamical system for “too” long given the chosen precision kills the intended stationary properties of the system, which I interpret as getting non-ergodic behaviour when exceeding the period of the uniform generator. More or less.
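To make the degeneracy of a rounded deterministic system concrete, here is a toy sketch in Python. It is not taken from the article: the logistic map x → 4x(1−x) is merely a convenient assumed example, and `signif` and `find_cycle` are hypothetical helper names. Since a rounded map lives on a finite set of values, its trajectory must eventually repeat, and the sketch reports when that happens.

```python
def signif(x, n):
    """Round x to n significant digits (Python stand-in for R's signif)."""
    if x == 0.0:
        return 0.0
    return float(f"{x:.{n - 1}e}")


def find_cycle(x0, digits, max_iter=100_000):
    """Iterate x -> signif(4 x (1 - x), digits) until a state repeats.

    Returns (tail, period): the number of steps before the trajectory
    enters its cycle, and the length of that cycle.
    """
    seen = {}  # state -> first step at which it was visited
    x = signif(x0, digits)
    for step in range(max_iter):
        if x in seen:
            return seen[x], step - seen[x]
        seen[x] = step
        x = signif(4.0 * x * (1.0 - x), digits)
    raise RuntimeError("no repeated state within max_iter steps")


if __name__ == "__main__":
    for digits in (2, 3, 4, 5, 6):
        print(digits, find_cycle(0.3, digits))
```

With two digits and a start at 0.3, the rounded trajectory reaches 0.51, which maps to 0.9996 and hence rounds to 1, after which it is stuck at 0: `find_cycle(0.3, 2)` returns `(8, 1)`. An MCMC chain differs in that fresh uniforms are injected at every step, which is why the analogy with the period of the generator is only loose.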

[Figures: presisio91, presisio92]

4 Responses to “precision in MCMC”

  1. This is one of those things that I probably failed to get across the other week at that Big Models meeting. A lot of the time (such as in GP regression with squared exponential covariance functions), you’ll really struggle to get even two correct decimal places for the intermediate calculations. This, to me, kills any idea that MCMC (or any other computational method) will target the correct posterior. It may not even be close.

    I assume that there is a core of people in the ML and BNP communities having conversations about these sorts of things (given how unavoidable they are when combining Gaussian processes with big data). To some extent this paper
    http://arxiv.org/pdf/1501.06195v1.pdf
    will solve the problem, but it’s focussing more on the question of “how much information do we need to solve the problem” rather than “how big can a problem in this class be and still be solved on a computer?”, which is just as critical.

    • Which is probably why one can solve almost anything with linear regression: the model error of the approximate model is smaller than the numerical error of the correct model.

      On a slightly more serious note, I have always wondered whether single precision floating point might not be as good or as bad as double precision.
