**W**hen playing with Peter Rossi’s bayesm R package during Jean-Michel Marin’s visit to Paris last week, we came up with the above Gibbs outcome. The setting is a Gaussian mixture model with three components in dimension 5, and the prior distributions are standard conjugate. In this case, with 500 observations and 5000 Gibbs iterations, the Markov chain (for one component of one mean of the mixture) has two highly distinct regimes: one that revolves around the true value of the parameter, 2.5, and one that explores a much broader area (which is associated with a much smaller value of the component weight). What we found amazing is the ability of the Gibbs sampler to entertain both regimes within the same run.
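For readers who want to experiment with this kind of behaviour, here is a minimal sketch of a conjugate Gibbs sampler for a Gaussian mixture. It is not the bayesm implementation and not the experiment above: it assumes a much simpler setting (one dimension, two components, unit variances, N(0, 10²) priors on the means and a uniform prior on the weight), and the names `gibbs_two_gaussians` and `simulate_data` are made up for the illustration.

```python
import math
import random


def gibbs_two_gaussians(data, n_iter=1000, prior_sd=10.0, seed=1):
    """Gibbs sampler for p*N(mu0, 1) + (1-p)*N(mu1, 1).

    Assumed priors: mu_j ~ N(0, prior_sd^2) and p ~ Beta(1, 1).
    Returns a list of (p, mu0, mu1) draws, one per iteration.
    """
    rng = random.Random(seed)
    mu = [min(data), max(data)]   # crude, well-separated starting values
    p = 0.5
    draws = []
    for _ in range(n_iter):
        # Step 1: allocate each observation to a component given (p, mu).
        counts = [0, 0]
        sums = [0.0, 0.0]
        for x in data:
            log_w0 = math.log(p) - 0.5 * (x - mu[0]) ** 2
            log_w1 = math.log(1.0 - p) - 0.5 * (x - mu[1]) ** 2
            prob0 = 1.0 / (1.0 + math.exp(min(log_w1 - log_w0, 700.0)))
            j = 0 if rng.random() < prob0 else 1
            counts[j] += 1
            sums[j] += x
        # Step 2: weight given the allocations, Beta(1 + n0, 1 + n1).
        p = rng.betavariate(1 + counts[0], 1 + counts[1])
        p = min(max(p, 1e-12), 1.0 - 1e-12)   # keep the logs finite
        # Step 3: means given the allocations (conjugate normal update).
        for j in range(2):
            precision = counts[j] + 1.0 / prior_sd ** 2
            mu[j] = rng.gauss(sums[j] / precision, math.sqrt(1.0 / precision))
        draws.append((p, mu[0], mu[1]))
    return draws


def simulate_data(n=300, weight=0.3, means=(0.0, 2.5), seed=0):
    """Draw n points from weight*N(means[0], 1) + (1-weight)*N(means[1], 1)."""
    rng = random.Random(seed)
    return [rng.gauss(means[0] if rng.random() < weight else means[1], 1.0)
            for _ in range(n)]


data = simulate_data()
draws = gibbs_two_gaussians(data, n_iter=600)
kept = draws[200:]                                # discard a burn-in period
mu1_mean = sum(d[2] for d in kept) / len(kept)    # average of the second mean
```

Plotting the second or third coordinate of `draws` against the iteration index gives the kind of trace discussed here, although this toy version is far too well-behaved to reproduce the two regimes.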

## Archive for Markov chain Monte Carlo

## amazing Gibbs sampler

Posted in Books, pictures, R, Statistics, University life with tags bayesm, convergence assessment, Gibbs sampler, Jean-Michel Marin, Markov chain Monte Carlo, mixtures, R on February 19, 2015 by xi'an

## this issue of Series B

Posted in Books, Statistics, Travel, University life with tags bag of little bootstraps, Bayesian bridge, Bayesian lasso, JRSSB, marginal likelihood, Markov chain Monte Carlo, normalising constant, Series B, simulation, untractable normalizing constant, Wasserman's paradox on September 5, 2014 by xi'an

**T**he September issue of [JRSS] Series B I received a few days ago is of particular interest to me. (And not as an ex-co-editor, since I was never involved in any of those papers!) To wit: a paper by Hani Doss and Aixin Tan on evaluating normalising constants based on MCMC output, a preliminary version of which I had seen at a previous JSM meeting; a paper by Nick Polson, James Scott and Jesse Windle on the Bayesian bridge, connected with Nick’s talk in Boston earlier this month; and yet another paper by Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar and Michael Jordan on the bag of little bootstraps, a presentation of which I heard Michael deliver a few times when he was in Paris. (Obviously, this does not imply any negative judgement on the other papers of this issue!)

For instance, Doss and Tan consider the multiple mixture estimator [my wording, the authors do not give the method a name, referring to Vardi (1985) but missing the connection with Owen and Zhou (2000)] of k ratios of normalising constants, namely

where the z’s are the normalising constants, with possibly different numbers of iterations for each Markov chain. An interesting starting point (that Hans Künsch had mentioned to me a while ago but that I had since forgotten) is that the problem was reformulated by Charlie Geyer (1994) as a quasi-likelihood estimation where the ratios of all z’s relative to one reference density are the unknowns. This is doubly interesting, actually, because it recasts the constant estimation problem in a statistical light and thus somewhat relates to the infamous “paradox” raised by Larry Wasserman a while ago. The novelty in the paper is (a) to derive an optimal estimator of the ratios of normalising constants in the Markov case, essentially accounting for possibly different lengths of the Markov chains, and (b) to estimate the variance matrix of the ratio estimate by regeneration arguments. Regeneration is a favourite tool of mine, at least in theory, since practically useful minorising conditions are hard to come by, if available at all.
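To make the quasi-likelihood idea concrete, here is a minimal sketch of Geyer's reverse logistic regression in the simplest case of two densities. It is not the Doss and Tan estimator: it assumes iid samples rather than Markov chains, and the two unnormalised densities (`log_h1`, `log_h2`) are toy Gaussian choices with known constants, so that the true ratio z2/z1 equals 2. With ζ = η2 − η1, the quasi-likelihood score reduces to n2 minus the sum over the pooled sample of the "probability" that each point came from the second density, and the root is found by bisection.

```python
import math
import random


def log_h1(x):
    return -0.5 * x * x      # unnormalised N(0, 1), z1 = sqrt(2*pi)


def log_h2(x):
    return -x * x / 8.0      # unnormalised N(0, 4), z2 = 2*sqrt(2*pi)


def ratio_z2_over_z1(sample1, sample2, lo=-20.0, hi=20.0, tol=1e-9):
    """Estimate z2/z1 by maximising the two-density quasi-likelihood."""
    n1, n2 = len(sample1), len(sample2)
    # log h2(x) - log h1(x) over the pooled sample (non-negative for this toy pair)
    diffs = [log_h2(x) - log_h1(x) for x in sample1 + sample2]

    def score(zeta):
        # Derivative of the quasi-log-likelihood in zeta = eta2 - eta1.
        return n2 - sum(1.0 / (1.0 + math.exp(-(d + zeta))) for d in diffs)

    # The score is decreasing in zeta, so bisection finds its unique root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    zeta = 0.5 * (lo + hi)
    # At the solution, zeta = log(z1/z2) + log(n2/n1).
    return math.exp(-zeta) * n2 / n1


rng = random.Random(0)
sample1 = [rng.gauss(0.0, 1.0) for _ in range(4000)]
sample2 = [rng.gauss(0.0, 2.0) for _ in range(4000)]
estimate = ratio_z2_over_z1(sample1, sample2)   # should be close to 2
```

With Markov chain output in place of the iid samples the estimator is computed in the same way; what changes is its variance, which is where the regeneration argument comes in.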

## new MCMC algorithm for Bayesian variable selection

Posted in pictures, Statistics, Travel, University life with tags Bayesian model choice, Bayesian variable selection, Hastings-Metropolis sampler, Langevin diffusion, Langevin MCMC algorithm, Markov chain Monte Carlo, Monte Carlo Statistical Methods, shrinkage estimation, simulation, variable dimension models on February 25, 2014 by xi'an

**U**nfortunately, I will miss the upcoming Bayes in Paris seminar next Thursday (27th February), as I will be flying to Montréal and then Québec at the time (despite having omitted to book a flight till now!). Indeed, Amandine Schreck will give a talk at 2pm in room 18 of ENSAE, Malakoff, on *A shrinkage-thresholding Metropolis adjusted Langevin algorithm for Bayesian variable selection*, a work written jointly with Gersende Fort, Sylvain Le Corff, and Eric Moulines, and arXived at the end of 2013 (which may explain why I missed it!). Here is the abstract:

This paper introduces a new Markov Chain Monte Carlo method to perform Bayesian variable selection in high dimensional settings. The algorithm is a Hastings-Metropolis sampler with a proposal mechanism which combines (i) a Metropolis adjusted Langevin step to propose local moves associated with the differentiable part of the target density with (ii) a shrinkage-thresholding step based on the non-differentiable part of the target density which provides sparse solutions such that small components are shrunk toward zero. This allows to sample from distributions on spaces with different dimensions by actually setting some components to zero. The performances of this new procedure are illustrated with both simulated and real data sets. The geometric ergodicity of this new transdimensional Markov Chain Monte Carlo sampler is also established.

(I will definitely get a look at the paper over the coming days!)
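As a rough illustration of the proposal mechanism described in the abstract, the sketch below combines a Langevin drift step with a thresholding step. It is only a guess at the general shape: the abstract does not specify the operator, so a plain soft-thresholding map is assumed, the names `soft_threshold` and `propose` are hypothetical, and the Hastings-Metropolis acceptance step (which has to account for the point masses at zero) is deliberately left out.

```python
import random


def soft_threshold(u, gamma):
    """Shrink u toward zero by gamma; return exactly 0 when |u| <= gamma."""
    if u > gamma:
        return u - gamma
    if u < -gamma:
        return u + gamma
    return 0.0


def propose(x, grad_log_smooth, step, gamma, rng):
    """Langevin drift plus Gaussian noise, followed by soft-thresholding.

    grad_log_smooth(x) returns the gradient of the differentiable part of
    the log target; step and gamma are tuning parameters (assumed names).
    """
    grad = grad_log_smooth(x)
    return [
        soft_threshold(xi + 0.5 * step ** 2 * gi + step * rng.gauss(0.0, 1.0), gamma)
        for xi, gi in zip(x, grad)
    ]


# Toy smooth part: Gaussian log-density centred at m, with gradient m - x.
m = [5.0, 0.0, 0.0]


def grad_toy(x):
    return [mi - xi for mi, xi in zip(m, x)]


rng = random.Random(0)
proposals = [propose([0.0, 0.0, 0.0], grad_toy, 0.5, 0.5, rng) for _ in range(200)]
```

Many of the proposed vectors have components that are exactly zero, which is how a move of this type can change the dimension of the model.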

## sampling from time-varying log-concave distributions

Posted in Statistics, University life with tags convexity, log-concave functions, Markov chain Monte Carlo, random walk, sequential Monte Carlo, Wharton Business School on October 2, 2013 by xi'an

**S**asha Rakhlin from Wharton sent me this paper he wrote (and arXived) with Hariharan Narayanan on a specific Markov chain algorithm that handles sequential Monte Carlo problems for log-concave targets. By relying on novel (by my standards) mathematical techniques, they manage to obtain geometric ergodicity results for random-walk based algorithms and log-concave targets. One of the new tools is the notion of self-concordant barrier, a sort of convex potential function **F** associated with a reference convex set and with Lipschitz properties. The second tool is a Gaussian distribution based on the metric induced by **F**. The third is the Dikin walk Markov chain, which uses this Gaussian as proposal and moves almost like the Metropolis-Hastings algorithm, except that it rejects with at least a probability of ½. The scale (or step size) of the Gaussian proposal is determined by the regularity of the log-concave target. In that setting, the total variation distance between the target at the *t*-th level and the distribution of the Markov chain can be fairly precisely approximated. Which leads in turn to a scaling of the number of random walk steps that are necessary to ensure convergence. Depending on the pace of the moving target, a single step of the random walk may be sufficient, which is quite an interesting feature.
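Here is a minimal one-dimensional sketch of a lazy Dikin-type walk, to fix ideas. It rests on several assumptions that are not taken from the paper: the convex set is the interval (0, 1), the barrier is the logarithmic barrier F(x) = −log x − log(1 − x), the step-size constant `r` is arbitrary, and the accept/reject step is a standard Metropolis-Hastings filter rather than the exact rule used by the authors.

```python
import math
import random


def barrier_hessian(x):
    """Second derivative of F(x) = -log(x) - log(1 - x) on (0, 1)."""
    return 1.0 / x ** 2 + 1.0 / (1.0 - x) ** 2


def log_proposal(x, y, r):
    """Log density (up to a constant) at y of the N(x, r^2 / F''(x)) proposal."""
    h = barrier_hessian(x)
    return 0.5 * math.log(h) - 0.5 * h * (y - x) ** 2 / r ** 2


def dikin_walk(log_target, x0=0.5, n_steps=50000, r=0.5, seed=0):
    """Lazy walk on (0, 1) whose proposal scale follows the barrier metric."""
    rng = random.Random(seed)
    x = x0
    path = []
    for _ in range(n_steps):
        if rng.random() >= 0.5:   # with probability 1/2 the chain does not move
            # Gaussian proposal, narrower when x is close to the boundary.
            y = rng.gauss(x, r / math.sqrt(barrier_hessian(x)))
            if 0.0 < y < 1.0:     # proposals outside the set are rejected
                log_ratio = (log_target(y) + log_proposal(y, x, r)
                             - log_target(x) - log_proposal(x, y, r))
                if rng.random() < math.exp(min(0.0, log_ratio)):
                    x = y
        path.append(x)
    return path


# Uniform target on (0, 1), the simplest log-concave example.
path = dikin_walk(lambda x: 0.0)
path_mean = sum(path) / len(path)
```

The point of the barrier is visible in the proposal scale: it shrinks automatically near 0 and 1, so the walk rarely wastes proposals outside the set.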

## intrinsic quantity for a Markov chain?

Posted in Statistics with tags þ, bootstrap, convergence assessment, CREST, effective sample size, Gainesville, Markov chain Monte Carlo, renewal process on February 6, 2013 by xi'an

**I** was attending a lecture this morning at CREST by Patrice Bertail where he was using estimated renewal parameters on a Markov chain to build (asymptotically) convergent bootstrap procedures. Estimating renewal parameters is obviously of interest in MCMC algorithms as they can be used to assess the convergence of the associated Markov chain. That is, if the estimation does not induce a significant bias. Another question that came to me during the talk is the following: since those convergence assessment techniques formally hold for any small set, choosing the small set in order to maximise the renewal rate also maximises the number of renewal events and hence the number of terms in the control sequence. Thus, the maximal renewal rate þ is definitely a quantity of interest. Now, is this quantity þ an intrinsic parameter of the chain, i.e. a quantity that drives its mixing and/or converging behaviour(s)? For instance, an iid sequence has a renewal rate of 1, because the whole set is a “small” set. Informally, the time between two consecutive renewal events is akin to the time between two simulations from the target and stationary distribution, according to Kac’s representation we used in our AAP paper with Jim Hobert. So it could be that þ is directly related with the effective sample size of the chain, hence the autocorrelation. *(A quick web search did not produce anything relevant.)* Too bad this question did not pop up last week when I had the opportunity to discuss it with Sean Meyn in Gainesville!
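The link between renewal rate and stationary distribution can be checked numerically in the simplest possible case, where the small set is a single state (an atom) and Kac's theorem says that the mean return time to that state is the inverse of its stationary probability. The sketch below uses a toy two-state chain with made-up transition probabilities `a` and `b`; it says nothing about the maximal renewal rate þ for a general chain.

```python
import random


def mean_return_time(a, b, n_steps=200000, seed=0):
    """Two-state chain with P(0 -> 1) = a and P(1 -> 0) = b.

    Returns the average number of steps between successive visits to
    state 0, each visit being a renewal event.
    """
    rng = random.Random(seed)
    state = 0
    last_visit = 0
    gaps = []
    for t in range(1, n_steps + 1):
        if state == 0:
            state = 1 if rng.random() < a else 0
        else:
            state = 0 if rng.random() < b else 1
        if state == 0:
            gaps.append(t - last_visit)
            last_visit = t
    return sum(gaps) / len(gaps)


a, b = 0.3, 0.1
stationary_prob_0 = b / (a + b)        # stationary probability of state 0
observed = mean_return_time(a, b)      # Kac: close to 1 / stationary_prob_0
renewal_rate = 1.0 / observed          # close to stationary_prob_0
```

Here the stationary probability of state 0 is 0.25, so renewals occur about once every four steps on average.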

## Reading classics (#5)

Posted in Books, Statistics, University life with tags Biometrika, David Cox, Hastings, Markov chain Monte Carlo, Metropolis-Hastings, Mike Titterington, Monte Carlo Statistical Methods, numerical integration, Peskun ordering, simulation on December 14, 2012 by xi'an

**T**his week, my student Dona Skanji gave a presentation of the paper of Hastings “Monte Carlo sampling methods using Markov chains and their applications”, which set the rules for running MCMC algorithms, much more so than the original paper by Metropolis et al. ~~which presented an optimisation device~~, even though the latter clearly stated the Markovian principle of those algorithms and their use for integration. (This is definitely a classic, selected in the book *Biometrika: One hundred years*, by Mike Titterington and David Cox.) Here are her slides (the best Beamer slides so far!):

**G**iven that I had already taught my lectures on Markov chains and on MCMC algorithms, the preliminary part of Dona’s talk was easier to compose, and understanding the principles of the method was certainly more straightforward than for the other papers in the series. I think she nonetheless did a rather good job in summing up the paper, running this extra simulation for the Poisson distribution (with the interesting “mistake” of including the burn-in time in the representation of the output and concluding that convergence was poor) and mentioning the Gibbs extension. I led the discussion of the seminar towards irreducibility conditions and Peskun’s ordering of Markov chains, which maybe could have been mentioned by Dona since she was aware Peskun was Hastings’ student.
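For the record, a simulation of this kind takes only a few lines. The sketch below is a random-walk Metropolis-Hastings sampler for a Poisson target, with the burn-in discarded before summarising the output. It is not Dona's code, and the ±1 proposal is an assumption that is not necessarily the one used in the paper; the name `mh_poisson` and the burn-in length are illustrative.

```python
import random


def mh_poisson(lam, n_iter=50000, x0=0, seed=0):
    """Metropolis-Hastings chain targeting Poisson(lam) with +/-1 proposals."""
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_iter):
        if rng.random() < 0.5:
            y = x + 1
            ratio = lam / (x + 1)   # pi(x + 1) / pi(x)
        else:
            y = x - 1
            ratio = x / lam         # pi(x - 1) / pi(x), zero when x == 0
        if rng.random() < ratio:    # symmetric proposal: accept w.p. min(1, ratio)
            x = y
        chain.append(x)
    return chain


chain = mh_poisson(4.0)
burn_in = 1000
kept = chain[burn_in:]              # drop the burn-in before summarising
kept_mean = sum(kept) / len(kept)   # should be close to lam = 4
```

Starting the chain far from the bulk of the target (say `x0=50`) and plotting `chain` without removing the first iterations reproduces the visual impression of poor convergence mentioned above.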