Archive for GPU

resampling and [GPU] parallelism

Posted in Statistics, University life on March 13, 2012 by xi'an

In a recent note posted on arXiv, Lawrence Murray compares the implementation of resampling schemes for parallel systems like GPUs. Given a system of weighted particles, (x_i, ω_i), there are several ways of drawing a sample according to those weights:

  1. regular multinomial resampling, where each point in the (new) sample is one of the (x_i, ω_i), with probability ω_i, meaning there is a uniform generated for each point;
  2. stratified resampling, where the weights are added, divided into equal pieces and a uniform is sampled on each piece, which means that points with large weights are sampled at least once and those with small weights at most once;
  3. systematic resampling, which is the same as the above except that the same uniform is used for each piece;
  4. Metropolis resampling, where a Markov chain converges to the distribution (ω_1,…, ω_P) on {1,…,P}.
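The first three schemes above can be sketched in a few lines of plain Python. This is only an illustrative sequential version, not Murray's GPU implementation; the function names and the helper `_cumulative` are made up for the example. Note that all three rely on the cumulative sum of the weights, which is a collective operation over the whole particle system.

```python
import bisect
import itertools
import random


def _cumulative(weights):
    """Normalised cumulative sums of the weights, ending exactly at 1.0."""
    total = sum(weights)
    cdf = list(itertools.accumulate(w / total for w in weights))
    cdf[-1] = 1.0  # guard against floating-point round-off
    return cdf


def _invert(cdf, u):
    """Index of the first particle whose cumulative weight exceeds u."""
    return min(bisect.bisect_right(cdf, u), len(cdf) - 1)


def multinomial(weights, rng):
    """One fresh uniform on [0, 1) per resampled point."""
    cdf = _cumulative(weights)
    return [_invert(cdf, rng.random()) for _ in weights]


def stratified(weights, rng):
    """One fresh uniform per stratum [k/n, (k+1)/n)."""
    cdf = _cumulative(weights)
    n = len(weights)
    return [_invert(cdf, (k + rng.random()) / n) for k in range(n)]


def systematic(weights, rng):
    """A single uniform, shifted into each stratum."""
    cdf = _cumulative(weights)
    n = len(weights)
    u = rng.random()
    return [_invert(cdf, (k + u) / n) for k in range(n)]
```

Each function returns the indices of the selected particles, e.g. `systematic([0.7, 0.1, 0.1, 0.1], random.Random(0))`; with these weights the first particle is selected two or three times under the systematic scheme.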

The first three resamplers are common in the particle system literature (incl. Nicolas Chopin’s PhD thesis), but difficult to adapt to GPUs (and I always feel uncomfortable with the fact that systematic resampling uses a single uniform!), while the last one is more unusual, but actually well-suited to a parallel implementation. While Lawrence Murray suggests using Raftery and Lewis’ (1992) assessment of the required number of Metropolis iterations to “achieve convergence”, I would instead suggest taking advantage of the toric nature of the space (as represented above) to run a random walk and wait for the equivalent of a complete cycle. In any case, this is a cool illustration of the new challenges posed by parallel implementations (like the development of proper random generators).
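To see why the Metropolis resampler fits parallel hardware, here is a minimal sequential sketch in Python. It assumes a uniform proposal on the particle indices and a fixed number of steps `n_steps` chosen by the user (how to choose it is exactly the convergence question raised above); the function name is hypothetical. Only ratios of weights are used, so no sum over all weights is needed, and the outer loop has independent iterations that could each run in their own thread.

```python
import random


def metropolis_resample(weights, n_steps, rng):
    """Pick one ancestor index per particle with a short Metropolis chain.

    The weights may be unnormalised: only ratios w_j / w_k are used.
    """
    n = len(weights)
    ancestors = []
    for i in range(n):  # independent iterations: one thread per particle on a GPU
        k = i  # each chain starts at its own particle
        for _ in range(n_steps):
            j = rng.randrange(n)  # uniform proposal on {0, ..., n-1}
            # accept with probability min(1, w_j / w_k)
            if rng.random() * weights[k] < weights[j]:
                k = j
        ancestors.append(k)
    return ancestors
```

With too few steps the output is biased towards the starting indices, which is the price paid for avoiding the collective sum.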

GPUs in computational statistics

Posted in pictures, Statistics, Travel on January 27, 2012 by xi'an

The workshop in Warwick yesterday went by very quickly! The room was packed. The first three talks were by myself, Christophe and Pierre, so less about GPUs and more about simulation techniques which could benefit from or even require implementation on GPUs. (I did manage to have complete slides this time… More seriously, Christophe’s talk set me thinking on the issue of estimating the likelihood function in ways different (?) from the one used in ABC.) The second half was more novel for me, in that the three talks went into the computing and computer aspects of GPUs, with Chris Holmes doing sparse [Lasso-like] regression on a scale otherwise [i.e. w/o GPUs] impossible, Chris [fourth Chris in the list of speakers!] Barnes explaining ABC for molecular biology and design (a point I plan to discuss in a later post), with even more details about the architecture and programming of GPUs, and Michael Stumpf delivering a grand finale, with essentially three talks in one: network analysis (incl. terrific movie bits incorporated within the beamer slides!), GPUs vs. CPUs and older alternatives, and random generators on GPUs, commenting on a recent paper by Salmon et al. (SC, 2011) and showing that true gains in efficiency from using GPUs demand a deep involvement with the hardware structure… A very exciting day followed by Stilton cheese tasting and haggis (if not poems) to celebrate Burns’ night!

Some hae meat and canna eat,
And some wad eat that want it;
But we hae meat, and we can eat,
And sae let the Lord be thankit.

English trip (1)

Posted in Statistics, Travel, University life on January 25, 2012 by xi'an

Today, I am attending a workshop on the use of graphics processing units in Statistics in Warwick, supported by CRiSM, presenting our recent work with Randal Douc, Pierre Jacob and Murray Smith. (I will use the same slides as in Telecom two months ago, hopefully avoiding the loss of integral and summation signs this time!) Pierre Jacob will talk about Wang-Landau.

Then, tomorrow, I am off to Cambridge to talk about ABC and model choice on Friday afternoon. (Presumably using the same slides as in Provo.)

The (1) in the title is in anticipation of a second trip to Oxford next month and another one to Bristol two months after! (The trip to Edinburgh does not count of course, since it is in Scotland!)

GPUs in Computational Statistics [Warwick, Jan. 25]

Posted in Statistics, Travel, University life on January 6, 2012 by xi'an

Next January 25, I will take part in a workshop at the University of Warwick (organised by CRiSM and CSC) on the theme of GPUs in Computational Statistics. Even though I have not directly worked on GPUs, I will talk about our joint work with Pierre Jacob and Murray Smith, while Pierre will talk about Parallel Wang-Landau. From there I will travel to Cambridge for a seminar on ABC model choice the next Friday.

Parallel processing of independent Metropolis-Hastings algorithms

Posted in R, Statistics, University life on October 12, 2010 by xi'an

With Pierre Jacob, my PhD student, and Murray Smith, from the National Institute of Water and Atmospheric Research, Wellington, who actually started us on this project at the last and latest Valencia meeting, we have completed a paper on using parallel computing in independent Metropolis-Hastings algorithms. The paper is arXived and the abstract goes as follows:

In this paper, we consider the implications of the fact that parallel raw-power can be exploited by a generic Metropolis–Hastings algorithm if the proposed values are independent. In particular, we present improvements to the independent Metropolis–Hastings algorithm that significantly decrease the variance of any estimator derived from the MCMC output, for a null computing cost since those improvements are based on a fixed number of target density evaluations. Furthermore, the techniques developed in this paper do not jeopardize the Markovian convergence properties of the algorithm, since they are based on the Rao–Blackwell principles of Gelfand and Smith (1990), already exploited in Casella and Robert (1996), Atchadé and Perron (2005) and Douc and Robert (2010). We illustrate those improvements both on a toy normal example and on a classical probit regression model but insist on the fact that they are universally applicable.
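The starting point of the abstract, namely that independent proposals allow all target evaluations to be done up front and in parallel, can be illustrated with a small Python sketch. It does not reproduce the Rao-Blackwellised estimators of the paper; it only runs a plain independent Metropolis–Hastings chain on precomputed proposals and compares its estimate with the self-normalised importance sampling estimate built from the same evaluations. The N(0, 1) target, the N(0, 2²) proposal and all function names are assumptions made for the example.

```python
import math
import random

PROPOSAL_SD = 2.0  # assumed proposal N(0, 2^2), wider than the target


def log_target(x):
    """Unnormalised log-density of the assumed toy target, N(0, 1)."""
    return -0.5 * x * x


def log_proposal(x):
    """Unnormalised log-density of the proposal, N(0, PROPOSAL_SD^2)."""
    return -0.5 * (x / PROPOSAL_SD) ** 2


def run_imh(n, rng):
    # Stage 1: proposals and log-weights do not depend on the chain,
    # so this loop is the part that could be spread over parallel threads.
    proposals = [rng.gauss(0.0, PROPOSAL_SD) for _ in range(n)]
    log_w = [log_target(y) - log_proposal(y) for y in proposals]

    # Stage 2: cheap sequential accept/reject pass over stored values.
    x, lw = proposals[0], log_w[0]
    chain = [x]
    for y, lwy in zip(proposals[1:], log_w[1:]):
        # accept with probability min(1, w(y) / w(x))
        if math.log(1.0 - rng.random()) < lwy - lw:
            x, lw = y, lwy
        chain.append(x)
    return proposals, log_w, chain


def estimates(h, proposals, log_w, chain):
    """Plain MCMC average and self-normalised importance sampling average."""
    mcmc = sum(h(x) for x in chain) / len(chain)
    w = [math.exp(lw) for lw in log_w]
    importance = sum(wi * h(y) for wi, y in zip(w, proposals)) / sum(w)
    return mcmc, importance
```

For instance, `estimates(lambda x: x * x, *run_imh(20000, random.Random(1)))` returns two estimates of E[X²] = 1 that use exactly the same target density evaluations.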

I am quite excited about the results in this paper, which took advantage of (a) older works of mine on Rao-Blackwellisation, (b) Murray’s interests in costly likelihoods, and (c) our mutual excitement when hearing about GPU parallel possibilities from Chris Holmes’ talk in Valencia. (As well as directions drafted in an exciting session in Vancouver!) The (free) gains over standard independent Metropolis-Hastings estimates are equivalent to those of importance sampling, while keeping the Markov structure of the original chain. Given that 100 or more parallel threads can be harnessed on current GPU cards, this is clearly a field with much potential!

The graph below gives the variance improvements brought by three Rao-Blackwell estimates taking advantage of parallelisation over the initial MCMC estimate (first entry), with the importance sampling estimate (last entry), using only 10 parallel threads.
