Weak consistency of MCMC methods

On the flight back from Bristol, I fought with this arXiv paper by Kengo Kamatani, “Weak consistency of MCMC methods”.

We study the asymptotic behavior of non-regular, weakly consistent MCMC. It is related to the question of whether or not the consistency of MCMC reflects the real behavior of MCMC. We apply weak consistency to a simple mixture model. There is a natural Gibbs sampler which works poorly in simulation. As an alternative, we propose a Metropolis-Hastings (MH) algorithm. We show that the MH algorithm is consistent while the Gibbs sampler is not, though it is weakly consistent. The key fact is that the two MCMC processes tend to a diffusion process and an AR process, respectively. These results follow from the weak convergence properties of MCMC, which are difficult to obtain from the Harris recurrence approach.

I found the paper quite hard to read, and even focusing on the mixture application did not give me a proper sense of what “consistency” and “works poorly” mean. Part of the difficulty lies with the maths, as the author considers the asymptotic behaviour of the whole (MCMC) Markov chain when the number of observations (in the mixture) goes to infinity. (Another part comes from confusing notations with multiple meanings, including typos in Definition 2.1 of weak consistency and in Lemma 3.3.) Anyway, the paper establishes [in a way I simply cannot check!] a dominance (consistency versus weak consistency) of an independent Metropolis-Hastings algorithm over the regular Gibbs sampler for mixtures where only the weight is unknown… If I understand the paper correctly, the problem with the Gibbs sampler (which is uniformly ergodic in this setting, even though convergence may be slow, as discussed in our paper with Jim Hobert and Vivek Roy) is that it is not consistent on the boundaries, namely when the weight is either zero or one. The paper concludes with a numerical comparison between the MSEs of the Gibbs and Metropolis-Hastings estimators of the Bayes estimator. (It is unclear how this true Bayes estimator is obtained, as I am not aware of a manageable formula for large sample sizes.) This comparison does not seem to relate to the earlier theoretical developments, as this error is connected with the autocorrelation in the Markov chain, not with its convergence properties… I would thus be quite interested in other readers’ comments on this rather puzzling paper.
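To make the comparison concrete, here is a minimal sketch of the two samplers being contrasted, on a toy version of the setting: a two-component normal mixture with known components where only the weight is unknown. The data-augmentation Gibbs sampler alternates latent allocations and a Beta update, while the independent Metropolis-Hastings sampler targets the posterior of the weight directly. All specifics (the N(0,1) and N(3,1) components, the uniform prior, the uniform independent proposal) are my own illustrative choices, not the paper's exact model or tuned proposal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: two-component normal mixture with known components
# N(0,1) and N(3,1); only the weight p is unknown (illustrative choices).
n, p_true = 200, 0.3
lab = rng.random(n) < p_true
x = np.where(lab, rng.normal(0.0, 1.0, n), rng.normal(3.0, 1.0, n))

def f0(v):  # N(0,1) density
    return np.exp(-0.5 * v**2) / np.sqrt(2 * np.pi)

def f1(v):  # N(3,1) density
    return np.exp(-0.5 * (v - 3.0)**2) / np.sqrt(2 * np.pi)

def log_post(p):
    """Log-posterior of the weight p under a uniform prior on (0,1)."""
    return np.sum(np.log(p * f0(x) + (1.0 - p) * f1(x)))

def gibbs(T):
    """Data-augmentation Gibbs: sample allocations, then p ~ Beta."""
    p, out = 0.5, np.empty(T)
    for t in range(T):
        w = p * f0(x) / (p * f0(x) + (1.0 - p) * f1(x))  # P(z_i = comp 0 | p)
        z = rng.random(n) < w
        p = rng.beta(1 + z.sum(), 1 + n - z.sum())        # uniform prior on p
        out[t] = p
    return out

def indep_mh(T):
    """Independent MH with a uniform proposal on (0,1) for the weight."""
    p, out = 0.5, np.empty(T)
    lp = log_post(p)
    for t in range(T):
        q = rng.uniform()
        lq = log_post(q)
        if np.log(rng.random()) < lq - lp:  # uniform proposal cancels out
            p, lp = q, lq
        out[t] = p
    return out

g, m = gibbs(5000), indep_mh(5000)
print(g.mean(), m.mean())  # both chains target the same posterior mean
```

With well-separated components the two chains agree closely here; the paper's point concerns the boundary cases (weight near zero or one), where the Gibbs chain's behaviour degenerates and the asymptotic analysis distinguishes the two algorithms.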
