Archive for cloud computing

Parallel processing of independent Metropolis-Hastings algorithms

Posted in R, Statistics, University life with tags , , , , , , , , on October 12, 2010 by xi'an

With Pierre Jacob, my PhD student, and Murray Smith, from National Institute of Water and Atmospheric Research, Wellington, who actually started us on this project at the last and latest Valencia meeting, we have completed a paper on using parallel computing in independent Metropolis-Hastings algorithms. The paper is arXived and the abstract goes as follows:

In this paper, we consider the implications of the fact that parallel raw-power can be exploited by a generic Metropolis–Hastings algorithm if the proposed values are independent. In particular, we present improvements to the independent Metropolis–Hastings algorithm that significantly decrease the variance of any estimator derived from the MCMC output, for a null computing cost since those improvements are based on a fixed number of target density evaluations. Furthermore, the techniques developed in this paper do not jeopardize the Markovian convergence properties of the algorithm, since they are based on the Rao–Blackwell principles of Gelfand and Smith (1990), already exploited in Casella and Robert 91996), Atchadé and Perron (2005) and Douc and Robert (2010). We illustrate those improvement both on a toy normal example and on a classical probit regression model but insist on the fact that they are universally applicable.

I am quite excited about the results in this paper, which took advantage of (a) older works of mine on Rao-Blackwellisation, (b) Murray’s interests in costly likelihoods, and (c) our mutual excitement when hearing about GPU parallel possibilities from Chris Holmes’ talk in Valencia. (As well as directions drafted in an exciting session in Vancouver!) The (free) gains over standard independent Metropolis-Hastings estimates are equivalent to using importance sampling gains, while keeping the Markov structure of the original chain. Given that 100 or more parallel threads can be enhanced from current GPU cards, this is clearly a field with much potential! The graph below

gives the variance improvements brought by three Rao-Blackwell estimates taking advantage of parallelisation over the initial MCMC estimate (first entry) with the importance sampling estimate (last entry) using only 10 parallel threads.

JSM 2010 [day 1]

Posted in R, Statistics, University life with tags , , , , , , , , , , on August 2, 2010 by xi'an

The first day at JSM is always a bit sluggish, as people slowly drip in and get their bearings. Similar to last year in Washington D.C., the meeting takes place in a huge conference centre and thus there is no feeling of overcrowded [so far]. It may also be that the peripheric and foreign location of the meeting put some regular attendees off (not to mention the expensive living costs!).

Nonetheless, the Sunday afternoon sessions started with a highly interesting How Fast Can We Compute? How Fast Will We Compute? session organised by Mike West and featuring Steve Scott, Mark Suchard and Qanli Wang. The topic was on parallel processing, either via multiple processors or via GPUS, the later relating to the exciting talk Chris Holmes gave at the Valencia meeting. Steve showed us some code in order to explain how feasible the jump to parallel programming—a point demonstrated by Julien Cornebise and Pierre Jacob after they returned from Valencia—was, while stressing the fact that a lot of the processing in MCMC runs was opened to parallelisation. For instance, data augmentation schemes can allocate the missing data in a parallel way in most problems and the same for independent data likelihood computation. Marc Suchard focussed on GPUs and phylogenetic trees, both of high interest to me!, and he stressed the huge gains—of the order of hundreds in the decrease in computing time—made possible by the exploitation of laptop [Macbook] GPUs. (If I got his example correctly, he seemed to be doing an exact computation of the phylogeny likelihood, not an ABC approximation… Which is quite interesting, if potentially killing one of my main areas of research!) Qanli Wang linked both previous with the example of mixtures with a huge number of components. Plenty of food for thought.

I completed the afternoon session with the Student Paper Competition: Bayesian Nonparametric and Semiparametric Methods which was discouragingly empty of participants, with two of the five speakers missing and less than twenty people in the room. (I did not get the point about the competition as to who was ranking those papers. Not the participants apparently!)

If it ain’t broken, don’t fix it!

Posted in Linux with tags , , , , , , , , on September 6, 2009 by xi'an

I tend to minimise the instances when I have to change of Linux version on my computer (a Macbook Pro Santa Rosa), for this always is a traumatic event with potential consequences on my work, and therefore I had postponed leaving Kubuntu 7.10 (Gutsy Gibbon) in a typical procrastinating attitude,  i.e. till something stopped working! This happened yesterday with a failed attempt at upgrading R and I thus moved to Kubuntu 8.04 (Hardy Heron), using the upgrade option rather than re-installing the whole thing from scratch. The only manual steps were to change xorg.conf to accomodate the touchpad and to install madwif to get access to the wireless—the most frustrating part was finding a valid address! (Upgrading Ubuntu means that each intermediate version needs to be installed before moving to the next.) Now that everything works nicely (including suspend and hibernate, and my earlier heating problem seems to be resolved as well!), I am wondering whether or not I should move to the next version of Kubuntu 8.10 (Intrepid Ibex, terrible tee-shirt!) or even to the most recent  Kubuntu 9.04 (Jaunty Jackalope, terrible name: jaundiced jackal, jovial jabiru, jocular jellyfish, or jittery jaguar would have sounded way better! The next animal in the list of releases is a koala, although karmic would hardly have been my adjective of choice)… As an aside about this future release, an interesting announcement is that Ubuntu 9.10 would support cloud computing.