Archive for coupling from the past

MCMC for conditional Bernoullis

Posted in Books, Statistics, University life with tags , , , , , on February 22, 2021 by xi'an

Jeremy Heng, Pierre Jacob [currently in Paris!] and Nianqiao Ju are in a recent arXival considering the simulation of a conditional Bernoulli, namely generating a vector of N Bernoullis with different probabilities under the constraint that their sum is fixed as I. Rather than going for a perfect simulator, with cost O(NI), they opt for the simplest of MCMC samplers, where a 0 and a 1 entries are exchanged at random. In connection with a recent spate of MCMC works using coupling, they establish convergence in O(N log N) steps, even when the probabilities are arbitrarily close to zero and one. Including the case when they are Uniformly generated. From a mundane perspective, I wonder at the appeal of using the probabilities to select the exchange pair. I realise sorting the probabilities is already of order O(N log N) avoiding selecting highly probable 1’s and highly probable 0’s should speed up converge, unless the gain is negligible. And to link MCMC and exact simulation in this setting, what would the cost of perfect sampling by sampling from the past be? Presumably much higher since there is little chance a total ordering can be found on the starting states.

bandits for doubly intractable posteriors

Posted in Statistics with tags , , , , , , , , on April 17, 2019 by xi'an

Last Friday, Guanyang Wang arXived a paper on the use of multi-armed bandits (hence the reference to the three bandits) to handle intractable normalising constants. The bandit compares or mixes Møller et al. (2006) auxiliary variable solution with Murray et al. (2006) exchange algorithm. Which are both special cases of pseudo-marginal MCMC algorithms. In both cases, the auxiliary variables produce an unbiased estimator of the ratio of the constants. Rather than the ratio of two unbiased estimators as in the more standard pseudo-marginal MCMC. The current paper tries to compare the two approaches based on the variance of the ratio estimate, but cannot derive a general ordering. The multi-armed bandit algorithm exploits both estimators of the acceptance ratio to pick the one that is almost the largest, almost because there is a correction for validating the step by detailed balance. The bandit acceptance probability is the maximum [over the methods] of the minimum [over the time directions] of the original acceptance ratio. While this appears to be valid, note that the resulting algorithm implies four times as many auxiliary variates as the original ones, which makes me wonder at the gain when compared with a parallel implementation of these methods, coupled at random times. (The fundamental difficulty of simulating from likelihoods with an unknown normalising constant remains, see p.4.)

spacings on a torus

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , , , on March 22, 2018 by xi'an

While in Brussels last week I noticed an interesting question on X validated that I considered in the train back home and then more over the weekend. This is a question about spacings, namely how long on average does it take to cover an interval of length L when drawing unit intervals at random (with a torus handling of the endpoints)? Which immediately reminded me of Wilfrid Kendall (Warwick) famous gif animation of coupling from the past via leaves covering a square region, from the top (forward) and from the bottom (backward)…

The problem is rather easily expressed in terms of uniform spacings, more specifically on the maximum spacing being less than 1 (or 1/L depending on the parameterisation). Except for the additional constraint at the boundary, which is not independent of the other spacings. Replacing this extra event with an independent spacing, there exists a direct formula for the expected stopping time, which can be checked rather easily by simulation. But the exact case appears to be add a few more steps to the draws, 3/2 apparently. The following graph displays the regression of the Monte Carlo number of steps over 10⁴ replicas against the exact values:

unbiased MCMC

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , on August 25, 2017 by xi'an

Two weeks ago, Pierre Jacob, John O’Leary, and Yves F. Atchadé arXived a paper on unbiased MCMC with coupling. Associating MCMC with unbiasedness is rather challenging since MCMC are rarely producing simulations from the exact target, unless specific tools like renewal can be produced in an efficient manner. (I supported the use of such renewal techniques as early as 1995, but later experiments led me to think renewal control was too rare an occurrence to consider it as a generic convergence assessment method.)

This new paper makes me think I had given up too easily! Here the central idea is coupling of two (MCMC) chains, associated with the debiasing formula used by Glynn and Rhee (2014) and already discussed here. Having the coupled chains meet at some time with probability one implies that the debiasing formula does not need a (random) stopping time. The coupling time is sufficient. Furthermore, several estimators can be derived from the same coupled Markov chain simulations, obtained by starting the averaging at a later time than the first iteration. The average of these (unbiased) averages results into a weighted estimate that weights more the later differences. Although coupling is also at the basis of perfect simulation methods, the analogy between this debiasing technique and perfect sampling is hard to fathom, since the coupling of two chains is not a perfect sampling instant. (Something obvious only in retrospect for me is that the variance of the resulting unbiased estimator is at best the variance of the original MCMC estimator.)

When discussing the implementation of coupling in Metropolis and Gibbs settings, the authors give a simple optimal coupling algorithm I was not aware of. Which is a form of accept-reject also found in perfect sampling I believe. (Renewal based on small sets makes an appearance on page 11.) I did not fully understood the way two random walk Metropolis steps are coupled, in that the normal proposals seem at odds with the boundedness constraints. But coupling is clearly working in this setting, while renewal does not. In toy examples like the (Efron and Morris!) baseball data and the (Gelfand and Smith!) pump failure data, the parameters k and m of the algorithm can be optimised against the variance of the averaged averages. And this approach comes highly useful in the case of the cut distribution,  a problem which I became aware of during MCMskiii and on which we are currently working with Pierre and others.

Le Monde puzzle [#947]

Posted in Books, Kids, pictures, R, Statistics with tags , , , , , , , on February 2, 2016 by xi'an

Another boardgame in Le Monde mathematical puzzle :

Given an 8×8 chequerboard,  consider placing 2×2 tiles over this chequerboard until (a) the entire surface is covered and (b) removing a single 2×2 tile exposes some of the original chequerboard. What is the maximal number of 2×2 tiles one can set according to this scheme? And for a 10×10 chequerboard?

This puzzle reminded me of Wilfrid Kendall’s introduction to perfect sampling  with leaves seen through a ceiling window falling from above, until sky was no longer visible. The representation was used by Wilfrid to explain that coupling from the past did not need to go all the way back to infinity:

Defining a random coverage of the chequerboard by those 2×2 tiles amounts to selecting a random permutation þ of 1:(n-1)² and finding the subvector of þ producing a full coverage

 path=sample(1:(n-1)^2) #random permutation
 path=path+((path-1)%/%(n-1)) #account for border shift
 while (min(grid)==0){ #all entries covered

Then removing superfluous tiles:

for (k in sample(1:i)){
 if (min(krid)>0){ #off used below
   off[k]=FALSE; grid=krid} #useless tile

And checking the remaining ones are truly essential:

for (k in (1:i)[off]){
sky=(mingrid>0) #rejection of the grid

leads to the maximum number of tiles to be [at least] M=16,25,37,54 for n=6,8,10,12…