## bounded target support [#2]

In a sort of echo from an earlier post, I received this emailed question from Gabriel:

I am contacting you in connection with my internship and your book «Le choix bayésien» where I cannot find an answer to my question. Given a constrained parameter space and an unconstrained Markov chain, is it correct to subsample the chain in order to keep only those points that satisfy the constraint?

To which I replied that this would induce a bias in the outcome, even though this is a marginally valid argument (if  the Markov chain is in its stationary regime, picking one value at random from those satisfying the constraint is akin to accept-reject). The unbiased approach is to resort to Metropolis-Hastings steps in Gabriel’s Gibbs sampler to check whether or not each proposed move stays within the constrained space. If not, one need to replicate the current value of the chain…

Update: Following comments by Ajay and David, I withdraw the term “bias”. The method works as a low key accept-reject but can be very inefficient, to the extent of having no visit to the constrained set. (However, in the event of a practically/numerically disconnected support with a huge gap between the connected components, it may also be more efficient than a low energy Metropolis-Hastings algorithm. Mileage may vary!)

### 10 Responses to “bounded target support [#2]”

1. Hi Prof. Robert:

As you mentioned in the post:
“it may also be more efficient than a low energy Metropolis-Hastings algorithm”.
Actually, I did come across this kind of test results.
But, I very confused and curious about the theoretical basis for it being more efficient?

Best.
Thanks.

• Imposing the constraint by a specifically designed proposal may slow down the exploration of the support of the distribution in some cases, while a broader support for the proposal has a higher potential to reach faraway regions in particular when the support is not connected. I do not have a particular theoretical result in mind connected with this intuition.

• Very helpful. Thanks a lot!

2. Dear Prof. Robert:

Thanks for this post, which is very helpful to me.

I study MCMC rendering algorithms in computer graphics.
I met the situation that excluding the samples, which are impossible values of the stationary of Markov Chain, makes algorithms more efficient.
Considering that the probability of proposals that are impossible values (I mean, out of the bounded target support) is over 50% in MCMC rendering algorithms, which causes rendering inefficient, I believe that something needs to be done to counteract this situation, so that we can improve rendering efficiency.
But, you said that “there is no mathematical reason for doing so!”

I don’t quite understand what exactly the “mathematical reason” is.
I wonder if there is anything can be done to provide remedies for impossible values/proposals?

By the way, sample space of rendering algorithm is infinite-dimension.

Thanks.
Best.
Libing Zeng

• I wrote mathematical when meaning that proposing values outside the support of the target does not invalidate the Markov chain as converging to the true target. No mention is made of efficiency in this statement. If a manageable proposal restricted to the exact support is available and improves convergence there is no reason not to use it.

Yes, you’re right.
It does converge to the true target.

One of the major research directions of MCMC rendering algorithm is to find better proposal strategies, so that to reduce the percentage of proposals outside the support of the target.
Though lots of efforts have been put, the percentage is still larger than 50%.

So, If we can find some variations of M-H sampling to reduce the negative impact on efficiency, things could be better.
I know M-H with delayed rejection does help.

From the perspective of efficiency, at least for MCMC rendering algorithms, if we can do anything as remedies for the bad proposals?

Best.

3. Dear Christian,

Thanks for this post, very interesting way to start the day for me!

Although this is not quite the answer to the question (and perhaps exactly what you say at the begining of your answer – apologies if it is), you could base estimates on the basis of the samples in the right set (lets call it A).

If one adopts the estimate, for a one-dimensional test function $\varphi$ and N MCMC samples:

$\frac{\sum_{i=1}^N \varphi(X_i)\mathbb{I}_A(X_i)}{\sum_{i=1}^N \mathbb{I}_A(X_i)}$

Under minimal assumptions and for an ergodic chain (for a probability $\pi$), this would converge to

$\pi(\varphi\mathbb{I}_A)/\pi(\mathbb{I}_A).$

Perhaps this is the point you are making at the beginning of your response, however.

This is not quite the question, but, if one wanted just to use the sub-samples for estimating expectations w.r.t. $\pi$ you could do so in an asymptotically consistent sense. This `idea’ however, does not say anything about the variance of the estimate or if its as good as the approach you suggest!

Thanks,

Ajay

4. I don’t see where this induces a bias. Are you saying that if a chain produces samples from f defined on S and T is a subset of S, then those samples restricted to T are not unbiased on f restricted to T?

• Take as a target the standard normal distribution restricted to [3,4]: a regular MCMC will not visit this interval unless the target is also restricted. This is not a bias, granted, but the algorithm does not work.

• Right, I agree with that, it can be very inefficient especially when the target is in the tails of the original distribution.

But it can be a sensible approach if one is unable or unwilling to change the code for the original sampler and the target has reasonable coverage.

This site uses Akismet to reduce spam. Learn how your comment data is processed.