Archive for Markov kernel

optimal choice among MCMC kernels

Posted in Statistics on March 14, 2019 by xi'an

Last week in Siem Reap, Florian Maire [who, I discovered, originates from a Norman town less than 10km from my hometown!] presented an arXived joint work with Pierre Vandekerkhove at the Data Science & Finance conference in Cambodia. It considers the following problem: given a large collection of MCMC kernels, how to pick the best one and how to define what best means. Going by mixtures is a default exploration of the collection, as shown in Tierney (1994) for instance, since the mixture improves on each individual kernel (esp. when each kernel is not irreducible on its own!). This paper considers a move to local weights in the mixture, weights that are not estimated from earlier simulations, contrary to what I first understood.
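
To make the parenthetical remark about irreducibility concrete, here is a minimal sketch on a toy 2×2 state space with a uniform target. The two coordinate-wise kernels, the equal weights, and all function names are my own illustrative assumptions, not taken from the paper or from Tierney (1994). Each kernel leaves the target invariant but can only move along one coordinate, while their fixed-weight mixture reaches every state.

```python
from itertools import product

# Toy state space: the four points of a 2x2 grid, with a uniform target.
states = list(product([0, 1], repeat=2))

def coordinate_kernel(axis):
    """Transition matrix that resamples coordinate `axis` uniformly
    and leaves the other coordinate untouched."""
    other = 1 - axis
    return [[0.5 if s[other] == t[other] else 0.0 for t in states]
            for s in states]

def mix(K1, K2, w):
    """Fixed-weight mixture w*K1 + (1-w)*K2 of two transition matrices."""
    return [[w * a + (1 - w) * b for a, b in zip(r1, r2)]
            for r1, r2 in zip(K1, K2)]

def is_irreducible(K):
    """Check that every state can be reached from every other state."""
    n = len(K)
    for start in range(n):
        seen, frontier = {start}, [start]
        while frontier:
            i = frontier.pop()
            for j in range(n):
                if K[i][j] > 0 and j not in seen:
                    seen.add(j)
                    frontier.append(j)
        if len(seen) < n:
            return False
    return True

def preserves_uniform(K):
    """Check that the uniform distribution is invariant (columns sum to one)."""
    n = len(K)
    return all(abs(sum(K[i][j] for i in range(n)) - 1.0) < 1e-12
               for j in range(n))

K1 = coordinate_kernel(0)   # only moves the first coordinate
K2 = coordinate_kernel(1)   # only moves the second coordinate
K_mix = mix(K1, K2, 0.5)    # equal-weight mixture (an arbitrary choice)

print(is_irreducible(K1), is_irreducible(K2), is_irreducible(K_mix))
```

The same invariance check passes for `K1`, `K2` and `K_mix`, so in this toy case the mixture gains irreducibility without losing the stationary distribution.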

As made clearer in the paper, the focus is on filamentary distributions that are concentrated near lower-dimensional sets or manifolds, since the components of the kernel collection can then be restricted to the directions of these manifolds… Including an interesting case of a 2-D highly peaked target where converging means mostly simulating in x¹ and covering the target means mostly simulating in x². Exhibiting a schizophrenic tension between the two goals. Locally dependent weights mean a correction by a Metropolis step, with cost O(n). What of Rao-Blackwellisation of these mixture weights, from weight × transition to full mixture, as in our PMC paper? Unclear to me as well [during the talk] was the use in the mixture of basic Metropolis kernels, which are not absolutely continuous because of the Dirac mass component. But this is clarified by Section 5 in the paper. A surprising result from the paper (Corollary 1) is that the use of local weights ω(i,x) that depend on the current value of the chain does not jeopardize the stationary measure π(.) of the mixture chain. Which may be due to the fact that all components of the mixture are already π-invariant. Or that the index of the kernel constitutes an auxiliary (if ancillary) variate. (Algorithm 1 in the paper reminds me of delayed acceptance. Making me wonder if computing time should be accounted for.) A final question I briefly discussed with Florian is the extension to weights that are automatically constructed from the simulations and the target.
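
The need for a Metropolis correction with local weights can be checked on a toy example. The sketch below is not Algorithm 1 of the paper: it is a generic construction under my own assumptions (a four-state target `pi`, two symmetric "flip one coordinate" proposals, and arbitrary weights ω(i,x) stored in `weights`). It compares a naive scheme, which picks a π-invariant Metropolis kernel with state-dependent probability, with a corrected scheme where the weights enter the acceptance ratio min{1, π(y)ω(i,y)/π(x)ω(i,x)}.

```python
from itertools import product

# Toy setting (all values are illustrative assumptions).
states = list(product([0, 1], repeat=2))       # (0,0), (0,1), (1,0), (1,1)
index = {s: k for k, s in enumerate(states)}
pi = [0.1, 0.2, 0.3, 0.4]                      # target distribution
w0 = [0.9, 0.5, 0.2, 0.1]                      # omega(0, x): prob. of kernel 0 at x
weights = [w0, [1 - w for w in w0]]            # omega(1, x) = 1 - omega(0, x)

def flip(k, axis):
    """Index of the state obtained by flipping coordinate `axis` of state k."""
    s = list(states[k])
    s[axis] = 1 - s[axis]
    return index[tuple(s)]

def transition_matrix(corrected):
    """Transition matrix of the locally weighted mixture.

    corrected=False: plain Metropolis ratio pi(y)/pi(x) for the chosen kernel.
    corrected=True:  the local weights are included in the acceptance ratio.
    """
    n = len(states)
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for i in (0, 1):
            y = flip(x, i)                     # symmetric, deterministic proposal
            ratio = pi[y] / pi[x]
            if corrected:
                ratio *= weights[i][y] / weights[i][x]
            P[x][y] += weights[i][x] * min(1.0, ratio)
        P[x][x] = 1.0 - sum(P[x])              # rejection mass stays at x
    return P

def invariance_gap(P):
    """Largest absolute difference between pi P and pi."""
    n = len(P)
    return max(abs(sum(pi[i] * P[i][j] for i in range(n)) - pi[j])
               for j in range(n))

print("naive gap    :", invariance_gap(transition_matrix(corrected=False)))
print("corrected gap:", invariance_gap(transition_matrix(corrected=True)))
```

In the corrected version, the flow from x to y through kernel i is min{π(x)ω(i,x), π(y)ω(i,y)}, which is symmetric in (x,y), so detailed balance holds kernel by kernel. This is one way to read the remark that the kernel index acts as an auxiliary variate.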

A chance non-typo

Posted in Books, Statistics, University life on March 18, 2011 by xi'an

A few days ago, my colleague Medhi Dafal in the admin branch of the university came across Monte Carlo Statistical Methods on another colleague’s desk and took it with him. Yesterday morning, when I came to his office to burden him with another admin nightmare, he opened the book [at random] and pointed out a formula, pretending he had trouble with it! This was the residual distribution in the Markov chapter:

\tilde K(x,\cdot)=\dfrac{K(x,\cdot) - \epsilon\mathbb{I}_C(x)\nu(\cdot)}{1-\epsilon\mathbb{I}_C(x)}

When I first saw it, I thought this was indeed a typo because of the indicator in the denominator. Now that I look at it more carefully, I realise this is correct: C is a small set, meaning the minorisation K(x,·) ≥ εν(·) only holds for x in C, hence the residual kernel is the standard kernel outside C. Still, for a few minutes, I had this weird impression he had found a new typo completely at random! Which would have been a significant item of information about the frequency of typos in this book… Phew!
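
For the sceptical reader, the formula can be checked numerically on a finite state space. The three-state matrix `K`, the set `C` and the choice of εν as the column-wise minimum of K over C are my own illustrative assumptions, not an example from the book.

```python
# Toy three-state kernel and candidate small set (illustrative assumptions).
K = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.0, 0.1, 0.9]]
C = {0, 1}
n = len(K)

# Minorising measure eps * nu: the column-wise minimum of K(x, .) over x in C,
# so that K(x, .) >= eps * nu(.) holds for every x in C by construction.
eps_nu = [min(K[x][y] for x in C) for y in range(n)]
eps = sum(eps_nu)                      # total mass of the minorising measure

def residual_row(x):
    """Row x of the residual kernel, with the indicator of C in both the
    numerator and the denominator, as in the formula above."""
    in_C = 1.0 if x in C else 0.0
    return [(K[x][y] - in_C * eps_nu[y]) / (1.0 - eps * in_C) for y in range(n)]

K_tilde = [residual_row(x) for x in range(n)]

for row in K_tilde:
    print([round(p, 4) for p in row])
```

Outside C the residual row is the original row of K, and inside C it is still a probability vector. Dropping the indicator would be the actual typo, since for state 2 the numerator K(2,0) − εν(0) is negative.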