Archive for parallelisation

MCMSki IV [day 2.5]

Posted in Mountains, pictures, Statistics, University life with tags , , , , , , , , , on January 8, 2014 by xi'an

ridge4Despite a good rest during the ski break, my cold did not get away (no magic left in this world!) and I thus had a low attention span to attend the Bayesian statistics and Population genetics session: while Jukka Corander mentioned the improvement brought by our AMIS algorithm, I had difficulties getting the nature of the model, if only because he used a blackboard-like font that made math symbols too tiny to read. (Nice fonts, otherwise!), Daniel Lawson (of vomiting Warhammer fame!) talked about the alluring notion of a statistical emulator, and Barbara Engelhardt talked about variable selection in a SNP setting. I did not get a feeling on how handling ten millions of SNPs was possible in towards a variable selection goal.  My final session of the day was actually “my” invited session on ABC methods, where Richard Everitt presented a way of mixing exact approximation with ABC and synthetic likelihood (Wood, Nature) approximations. The resulting MAVIS algorithm is  not out yet. The second speaker was Ollie Ratman, who spoke on his accurate ABC that I have discussed many times here. And Jean-Michel Marin managed to drive from Montpelier, just in time to deliver his talk on our various explorations of the ABC model choice problem.

After a quick raclette at “home”, we headed back to the second poster session, where I had enough of a clear mind and not too much of a headache (!) to have several interesting discussions, incl. a new parallelisation suggested  by Ben Calderhead, the sticky Metropolis algorithm of Luca Martino, the airport management video of Jegar Pitchforth, the mixture of Dirichlet distributions for extremes by Anne Sabourin, not mentioning posters from Warwick or Paris. At the end of the evening  I walked back to my apartment with the Blossom skis we had brought in the morning to attract registrations for the ski race: not enough to make up for the amount charged by the ski school. Too bad, especially given Anto’s efforts to get this amazing sponsoring!

O’Bayes 2013 [#3]

Posted in pictures, Running, Statistics, Travel, University life with tags , , , , , , , on December 23, 2013 by xi'an

IMG_2223A final day for this O’Bayes 2013 conference, where I missed the final session for travelling reasons. Several talks had highly attractive features (for me), from David Dunson’s on his recently arXived paper on parallel MCMC, that provides an alternative to the embarrassingly parallel algorithm I discussed a few weeks ago, to be discussed further in a future post, to Marty Wells hindered by poor weather and delivering by phone a talk on L1 shrinkage estimators (a bit of a paradox since, as discussed by Yuzo Maruyama, most MAP estimators cannot be minimax and, more broadly, since they cannot be expressed as resolutions of loss minimisation), to Malay Ghosh revisiting g-priors from an almost frequentist viewpoint,  to Gonzalo Garci-Donato presenting criteria for objective Bayesian model choice in a vision that was clearly the closest to my own perspective on the topic. Overall, when reflecting upon the diversity and high quality of the talks at this O’Bayes meeting, and also as the incoming chair-elect of the corresponding section of ISBA, I think that what emerges most significantly from those talks is an ongoing pondering on the nature of (objective Bayesian) testing, not only in the works extending the g-priors in various directions, but also in the whole debate between Bayes factors and information criteria, model averaging versus model selection. During the discussion on Gonzalo’s talk, David Draper objected to the search for an automated approach to the comparison of models, but I strongly lean towards Gonzalo’s perspective as we need to provide a reference solution able to tackle less formal and more realistic problems. I do hope to see more of those realistic problems tackled at O’Bayes 2015 (which location is not yet settled). In the meanwhile, a strong thank you! to the local organising committee and most specifically to Jim Berger!

Asymptotically Exact, Embarrassingly Parallel MCMC

Posted in Books, Statistics, University life with tags , , , , , , on November 26, 2013 by xi'an

Departamento de Matemática, Universidad Complutense de Madrid, 11/11/11Willie Neiswanger, Chong Wang, and Eric Xing (from CMU) recently arXived a paper entitled as above. The “embarrassing” in the title refers to the “embarrassingly simple” solution proposed therein, namely to solve the difficulty in handling very large datasets by running completely independent parallel MCMC samplers on parallel threads or computers and using the outcomes of those samplers as density estimates, pulled together as a product towards an approximation of the true posterior density. In other words, the idea is to break the posterior as

p(\theta|x) = \prod_{i=1}^m p_i(\theta|x)

and to use the estimate

\hat p(\theta|x) = \prod_{i=1}^m \hat p_i(\theta|x)

where the individual estimates are obtained by, say, non-parametric estimates. The method is then “asymptotically exact” in the weak (and unsurprising) sense of being converging in the number of MCMC iterations. Still, there is a theoretical justification that is not found in previous parallel methods that mixed all resulting samples without accounting for the subsampling. And I also appreciate the point that, in many cases, running MCMC samplers with subsamples produces faster convergence.

In the paper, the division of p into its components is done by partitioning the iid data into m subsets. And taking a power 1/m of the prior in each case. (Which may induce improperness issues.) However, the subdivision is arbitrary and can thus be implemented in other cases than the fairly restrictive iid setting. Because each (subsample)  non-parametric estimate involves T terms, the resulting overall estimate contains Tm terms and the authors suggest using an independent Metropolis-within-Gibbs sampler to handle this complexity. Which is necessary [took me a while to realise this!] for producing a final sample from the (approximate) true posterior distribution. As an aside, I wonder why the bandwidths are all equal across all subsamples, as they should depend on those. And as it would not make much of a difference. It would also be interesting to build a typology of cases where subsampling leads to subposteriors that are close to orthogonal, preventing the implementation of the method.

As it happened, I read this paper on the very day Nial Friel (University College Dublin) gave a seminar at the Big’MC seminar on the convergence of approximations to ergodic transition kernels, based on the recent results of Mitrophanov on the stability of Markov chains, where he introduced the topic by the case of datasets large enough to prevent the computation of the likelihood function.

Special Issue of ACM TOMACS on Monte Carlo Methods in Statistics

Posted in Books, R, Statistics, University life with tags , , , , , , , , , , , , on December 10, 2012 by xi'an

As posted here a long, long while ago, following a suggestion from the editor (and North America Cycling Champion!) Pierre Lécuyer (Université de Montréal), Arnaud Doucet (University of Oxford) and myself acted as guest editors for a special issue of ACM TOMACS on Monte Carlo Methods in Statistics. (Coincidentally, I am attending a board meeting for TOMACS tonight in Berlin!) The issue is now ready for publication (next February unless I am confused!) and made of the following papers:

* Massive parallelization of serial inference algorithms for a complex generalized linear model
MARC A. SUCHARD, IVAN ZORYCH, PATRICK RYAN, DAVID MADIGAN
*Convergence of a Particle-based Approximation of the Block Online Expectation Maximization Algorithm
SYLVAIN LE CORFF and GERSENDE FORT
* Efficient MCMC for Binomial Logit Models
AGNES FUSSL, SYLVIA FRÜHWIRTH-SCHNATTER, RUDOLF FRÜHWIRTH
* Adaptive Equi-Energy Sampler: Convergence and Illustration
AMANDINE SCHRECK and GERSENDE FORT and ERIC MOULINES
* Particle algorithms for optimization on binary spaces
CHRISTIAN SCHÄFER
* Posterior expectation of regularly paved random histograms
RAAZESH SAINUDIIN, GLORIA TENG, JENNIFER HARLOW, and DOMINIC LEE
* Small variance estimators for rare event probabilities
MICHEL BRONIATOWSKI and VIRGILE CARON
* Self-Avoiding Random Dynamics on Integer Complex Systems
FIRAS HAMZE, ZIYU WANG, and NANDO DE FREITAS
* Bayesian learning of noisy Markov decision processes
SUMEETPAL S. SINGH, NICOLAS CHOPIN, and NICK WHITELEY

Here is the draft of the editorial that will appear at the beginning of this special issue. (All faults are mine, of course!) Continue reading

multiple try/point Metropolis algorithm

Posted in Statistics, Travel with tags , , , , , on January 23, 2012 by xi'an

Among the arXiv documents I printed at the turn of the year in order to get a better look at them (in the métro if nowhere else!), there were two papers by Luca Martino and co-authors from Universidad Carlos III, Madrid, A multi-point Metropolis scheme with generic weight functions and Different acceptance functions for multiple try Metropolis schemes. The multiple-try algorithm sounds like another version of the delayed sampling algorithm of Tierney and Mira (1999) and Green and Mira (2001). I somehow missed it, even though it was introduced in Liu et al. (2000) and Quin and Liu (2001). Multiple-try Metropolis builds upon the idea that, instead of making one proposal at a time, it is feasible to build a sequence of proposals and to pick one among those, presumably rather likely and hence more open to being accepted. The sequence of proposals may depend upon the past propositions as well as on the current value, lending some degree of adaptability to the scheme. In the current implementation, the algorithm remains rather clumsy [in my opinion] in that (a) a fixed horizon N need be fixed and (b) an additional series of backward simulations need be produced simply to keep the balance equation happy… Hence a total number of simulations O(N) for one possible acceptance. The first note slightly extends Quin and Liu (2001) by using a fairly general weighting scheme. The second paper studies some particular choices for the weights in a much less adaptive scheme (where parallelisation would be an appropriate alternative, since each proposal in the multiple try only depends on the current value of the chain). But it does not demonstrate a more efficient behaviour than when using a cycle or a mixture of Metropolis-Hastings algorithms. The method seems to regain popularity, though, as Roberto Casarin, Radu Craiu and Fabrizio Leisen (also from Carlos III)  arXived a paper on a multiple-try algorithm, connected with population Monte Carlo, and more recently published it in Statistics and Computing.

2012, Turing year

Posted in R, Statistics, University life with tags , , , , , , on January 3, 2012 by xi'an

Buying the special issue of La Recherche on “La révolution des mathématiques”, I discovered that this is the Alan Turing Year in celebration of the 100th anniversary of Turing‘s birth. The math department at the University of Leeds has a webpage on all the events connected with this celebration. From all over the World. (There is even a Turing relay in Cambridge, unfortunately it is not open to the general public… Unless you are attending the Isaac Newton Institute at the time.) Quite fitting a tribute. (Given Turing’s contributions to Bayesian analysis, as depicted e.g. in the theory that would not die, ISBA could have included a special session in the ISBA 2012 meeting in Kyoto. I will certainly dedicate the session I co-organise there on parallel computing to his memory.)

JCGS 20th anniversary

Posted in R, Statistics, University life with tags , , , on March 22, 2011 by xi'an

For its 20th anniversary, JCGS offers free access to papers, including Andrew’s discussion paper Why tables are really much better than graphs. (Another serious ending for an April fool joke!) Incidentally (or rather coincidentally), I received today the great news that our Using parallel computation to improve Independent Metropolis-Hastings based estimation paper is accepted by JCGS. (First accepted paper ever for my PhD student Pierre!) Maybe making it into one of the 20th anniversary issues!