## more of the same!

Posted in Books, pictures, Statistics, University life on December 10, 2015 by xi'an

Daniel Seita, Haoyu Chen, and John Canny arXived last week a paper entitled “Fast parallel SAME Gibbs sampling on general discrete Bayesian networks”. The distributions of the observables are defined by full conditional probability tables on the nodes of a graphical model. The distributions on the latent or missing nodes of the network are multinomial, with Dirichlet priors. To derive the MAP in such models, although this goal is not explicitly stated in the paper till the second page, the authors refer to the recent paper by Zhao et al. (2015), discussed on the ‘Og just as recently, which applies our SAME methodology. Since the paper is mostly computational (and submitted to ICLR 2016, which takes place juuust before AISTATS 2016), I do not have much to comment about it. Except to notice that the authors mention our paper as “Technical report, Statistics and Computing, 2002”. I am not sure the editor of Statistics and Computing will appreciate! The proper reference is Statistics and Computing, 12:77-84, 2002.

“We argue that SAME is beneficial for Gibbs sampling because it helps to reduce excess variance.”

Still, I am a wee bit surprised at both the above statement and the comparison with a JAGS implementation. Because SAME augments the number of latent vectors as the number of iterations increases, it should be slower, by a mere curse of dimension, than a regular Gibbs sampler with a single latent vector. And because I do not get the connection with JAGS either: SAME could be programmed in JAGS, couldn’t it? If the authors mean a regular Gibbs sampler with no latent vector augmentation, the comparison makes little sense as one algorithm aims at the MAP (with a modest five replicas), while the other encompasses the complete posterior distribution. But this sounds unlikely when considering that the larger the number m of replicas, the better their alternative to JAGS. It would thus be interesting to understand what the authors mean by JAGS in this setup!
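To make the cost argument above concrete, here is a minimal Python sketch of a SAME-type Gibbs sampler on a toy model. Everything in it is an assumption of mine for illustration and has nothing to do with the Bayesian networks or the code of the paper: the model is the mixture 0.5 N(0,1) + 0.5 N(θ,1) with a N(0,τ²) prior on θ, the function names `simulate_data` and `same_gibbs` are made up, and the number m of replicas is kept fixed rather than increased along iterations. Each iteration simulates m independent copies of the allocation vector given θ, then θ given all copies, so the chain targets the marginal posterior raised to the power m.

```python
import math
import random
import statistics


def simulate_data(n, theta, rng):
    """Draw n points from the toy mixture 0.5 N(0,1) + 0.5 N(theta,1)."""
    return [rng.gauss(theta if rng.random() < 0.5 else 0.0, 1.0) for _ in range(n)]


def same_gibbs(y, m, n_iter, tau=10.0, seed=0):
    """Gibbs sampler on theta and m replicas of the latent allocations.

    With m = 1 this is the regular Gibbs sampler on the posterior;
    with m > 1 the theta-marginal is the posterior to the power m,
    which concentrates around the MAP as m grows.
    """
    rng = random.Random(seed)
    theta = statistics.fmean(y)  # crude starting value
    draws = []
    for _ in range(n_iter):
        count = 0    # allocations to the N(theta,1) component, over all replicas
        total = 0.0  # sum of the observations allocated to that component
        for _ in range(m):  # m replicas: the cost per iteration grows linearly in m
            for yi in y:
                # log odds of P(z_i = 1 | theta, y_i) under equal weights
                log_odds = theta * yi - 0.5 * theta ** 2
                if rng.random() < 1.0 / (1.0 + math.exp(-log_odds)):
                    count += 1
                    total += yi
        # theta | all replicas: the N(0, tau^2) prior is also replicated m times
        precision = count + m / tau ** 2
        theta = rng.gauss(total / precision, 1.0 / math.sqrt(precision))
        draws.append(theta)
    return draws
```

Calling `same_gibbs(y, 1, 300)` and `same_gibbs(y, 10, 300)` on the same sample shows both sides of the argument: the draws of θ are more concentrated with ten replicas, but each iteration costs ten times as many latent simulations.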

## another vanilla Rao-Blackwellisation

Posted in Statistics, University life on September 16, 2013 by xi'an

In the latest issue of Statistics and Computing (2013, Issue 23, pages 577-587), Iliopoulos and Malefaki published a paper that relates to our vanilla Rao-Blackwellisation paper with Randal Douc. The idea is to derive another approximation to the ideal importance sampling weight using the “accepted” Markov chain. (With Randal, we had a Bernoulli factory representation.) The density g(x) of the accepted chain being unknown, it is represented as the expectation under π of the function

$\min\left\{q(z|x)/\pi(z),q(x|z)/\pi(x)\right\}$

and hence estimated by a self-normalised average based on the whole Markov chain. This means the resulting importance estimate uses twice the output of the algorithm and that it is biased and of order O(n²) in computing cost, thus the same order as our original Rao-Blackwellised estimator (Robert & Casella, 1996)… This also means convergence and CLT are very hard to establish: the main convergence theorem of the paper holds only for finite state spaces, with a surprisingly smaller asymptotic variance for this self-normalised average than for the ideal importance sampling estimator in the independent Metropolis-Hastings case. (Both are biased by being self-normalised and the paper does not consider the magnitude of those biases.)
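As a rough illustration of the construction described above, here is a Python sketch based on my reading of it, not the authors' code. All choices are assumptions made for the example: a normalised N(0,1) target, an independent N(0,2²) proposal, and the hypothetical function names `metropolis_hastings`, `acceptance_probability` and `weighted_estimate`. For each accepted value x, the expectation under π of the min function is approximated by an average over the whole chain, and its inverse serves as importance weight in a self-normalised estimate. The double loop is what makes the cost quadratic in the chain length.

```python
import math
import random


def norm_pdf(x, mu=0.0, sd=1.0):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def target(x):
    """pi(x): a normalised N(0,1) density (assumption for this toy example)."""
    return norm_pdf(x)


def proposal(z, x):
    """q(z|x): an independent N(0, 2^2) proposal, hence free of x here."""
    return norm_pdf(z, 0.0, 2.0)


def metropolis_hastings(n, rng):
    """Return the full chain and the sub-chain of accepted values."""
    x = 0.0
    chain, accepted = [], [x]  # the starting value opens the accepted chain
    for _ in range(n):
        z = rng.gauss(0.0, 2.0)
        ratio = target(z) * proposal(x, z) / (target(x) * proposal(z, x))
        if rng.random() < ratio:
            x = z
            accepted.append(x)
        chain.append(x)
    return chain, accepted


def acceptance_probability(x, chain):
    """Chain average of min{q(z|x)/pi(z), q(x|z)/pi(x)} over z in the chain."""
    return sum(
        min(proposal(z, x) / target(z), proposal(x, z) / target(x)) for z in chain
    ) / len(chain)


def weighted_estimate(h, chain, accepted):
    """Self-normalised importance sampling estimate of the expectation of h under pi."""
    weights = [1.0 / acceptance_probability(x, chain) for x in accepted]
    return sum(w * h(x) for w, x in zip(weights, accepted)) / sum(weights)
```

For instance, `weighted_estimate(lambda x: x * x, chain, accepted)` approximates the second moment of the target, equal to one in this toy case, and can be compared with the plain average of x² over `chain`.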

Interestingly, the authors also ran a comparison with our parallelised Rao-Blackwellised version (with Pierre Jacob and Murray Smith), but conclude (P.58) that a larger CPU (should be GPU!!) time is required by the parallelisation, which does not really make sense: when compared with the plain Metropolis-Hastings implementation, run on a single processor, the parallel version only requires an extra random permutation per thread or per processor. I thus suspect a faulty implementation that induces this CPU time being linear in the size of the blocks, like maybe only saving one output per block… Also interestingly, the paper re-analyses the Pima Indian probit model Jean-Michel Marin and I (and many others) used as a benchmark in several of our papers. As in most standard examples, the outcome shows a mild reduction in variance when using this estimated importance sampling version. Maybe a comparison with the ideal importance sampler (i.e. the one that does not divide by the sum of the weights since it uses normalised versions of the target and importance densities) would have helped in the comparison.

## Special Issue of ACM TOMACS on Monte Carlo Methods in Statistics

Posted in Books, R, Statistics, University life on December 10, 2012 by xi'an

As posted here a long, long while ago, following a suggestion from the editor (and North America Cycling Champion!) Pierre L'Ecuyer (Université de Montréal), Arnaud Doucet (University of Oxford) and myself acted as guest editors for a special issue of ACM TOMACS on Monte Carlo Methods in Statistics. (Coincidentally, I am attending a board meeting for TOMACS tonight in Berlin!) The issue is now ready for publication (next February unless I am confused!) and is made of the following papers:

 * Massive parallelization of serial inference algorithms for a complex generalized linear model, by MARC A. SUCHARD, IVAN ZORYCH, PATRICK RYAN, DAVID MADIGAN
 * Convergence of a Particle-based Approximation of the Block Online Expectation Maximization Algorithm, by SYLVAIN LE CORFF and GERSENDE FORT
 * Efficient MCMC for Binomial Logit Models, by AGNES FUSSL, SYLVIA FRÜHWIRTH-SCHNATTER, RUDOLF FRÜHWIRTH
 * Adaptive Equi-Energy Sampler: Convergence and Illustration, by AMANDINE SCHRECK and GERSENDE FORT and ERIC MOULINES
 * Particle algorithms for optimization on binary spaces, by CHRISTIAN SCHÄFER
 * Posterior expectation of regularly paved random histograms, by RAAZESH SAINUDIIN, GLORIA TENG, JENNIFER HARLOW, and DOMINIC LEE
 * Small variance estimators for rare event probabilities, by MICHEL BRONIATOWSKI and VIRGILE CARON
 * Self-Avoiding Random Dynamics on Integer Complex Systems, by FIRAS HAMZE, ZIYU WANG, and NANDO DE FREITAS
 * Bayesian learning of noisy Markov decision processes, by SUMEETPAL S. SINGH, NICOLAS CHOPIN, and NICK WHITELEY

Here is the draft of the editorial that will appear at the beginning of this special issue. (All faults are mine, of course!)