Archive for particle MCMC

a week in Warwick

Posted in Books, Kids, Running, Statistics, University life on October 19, 2014 by xi'an

Canadian geese, Warwick

This past week in Warwick has been quite enjoyable and profitable, from staying once again in a math house, to taking advantage of the new bike, to having several long discussions on prospective and exciting projects, to meeting some of the new postdocs and visitors, to attending Tony O'Hagan's talk on "wrong models". And then having Simo Särkkä, who was visiting Warwick this week, discuss his paper with me. And Chris Oates doing the same with his recent arXival with Mark Girolami and Nicolas Chopin (soon to be commented upon, of course!). And managing to run in dry conditions despite the heavy rains (but in pitch dark, as sunrise is now quite late, with the help of a headlamp and the beauty of a countryside starry sky). I also evaluated several students' projects, two of which led me to wonder when using RJMCMC is appropriate for comparing two models. In addition, I escaped one evening to visit old (1977!) friends in northern Birmingham, despite fairly dire London Midland performances between Coventry and Birmingham New Street, the only redeeming feature being that the connecting train there was also late by one hour! (Not to mention the weirdest taxi driver ever on my way back, trying to get my opinion on whether or not he should have an affair… which at least kept me awake for the whole trip!) Definitely looking forward to my next trip there at the end of November.

Bayesian inference for low count time series models with intractable likelihoods

Posted in Books, Statistics, Travel, University life on January 21, 2014 by xi'an

sunset over the Brisbane river, Australia, Aug. 17, 2012

Last evening, I read a nice paper with the above title by Drovandi, Pettitt and McCutchan, from QUT, Brisbane. Low count refers to observations taking only small integer values. The idea is to mix ABC with the unbiased estimators of the likelihood proposed by Andrieu and Roberts (2009) and with particle MCMC… and even with an RJMCMC version. The special feature that makes the proposal work is that the low count setting allows for a simulation of pseudo-observations (and auxiliary variables) that may sometimes allow for an exact constraint (the simulated observation being equal to the true observation), and that otherwise borrows Jasra et al.'s (2013) "alive particle" trick, which turns a negative binomial draw into an unbiased estimator of the ABC target… The current paper helped me realise how powerful this trick is. (The original paper was arXived at a time I was off, so I completely missed it…) The examples studied in the paper may sound a wee bit formal, but they could lead to a better understanding of the method since alternatives could be available (?). Note that none of those examples is ABC per se, in that the tolerance is always equal to zero.
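
To fix ideas, here is a minimal sketch of that trick (my own toy illustration, not code from the paper): with zero tolerance, pseudo-observations are simulated until N of them match the observed count exactly, and if this takes T draws in total, (N-1)/(T-1) is an unbiased estimator of the matching probability, hence of the intractable likelihood contribution. The Poisson simulator, the value of theta, and the choice N=10 below are all hypothetical.

```python
import numpy as np

def alive_particle_estimate(simulate, y_obs, N, rng):
    """Unbiased estimator of P(x = y_obs | theta) via the 'alive particle'
    (negative binomial) trick: keep simulating until N exact matches are
    obtained; if T draws were needed in total, return (N - 1) / (T - 1).
    Assumes the matching probability is strictly positive."""
    hits, T = 0, 0
    while hits < N:
        T += 1
        if simulate(rng) == y_obs:
            hits += 1
    return (N - 1) / (T - 1)

# purely illustrative: a Poisson model with rate theta, observed count y_obs
rng = np.random.default_rng(0)
theta, y_obs = 2.0, 3
est = alive_particle_estimate(lambda g: g.poisson(theta), y_obs, N=10, rng=rng)
print(est, "versus exact", np.exp(-theta) * theta**y_obs / 6)
```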

The paper also includes reversible jump implementations. While it is interesting to see that ABC (in the authors' sense) can be mixed with RJMCMC, it is delicate to get a feel for the precision of the results without a benchmark to compare against. I am also wondering about less costly alternatives like empirical likelihood and other ABC approaches. Since Chris is visiting Warwick at the moment, I am sure we can discuss this issue there next week.

resampling and [GPU] parallelism

Posted in Statistics, University life on March 13, 2012 by xi'an

In a recent note posted on arXiv, Lawrence Murray compares the implementation of resampling schemes for parallel systems like GPUs. Given a system of weighted particles, (x_i, ω_i), there are several ways of drawing a sample according to those weights:

  1. regular multinomial resampling, where each point in the (new) sample is one of the x_i's, with probability ω_i, meaning there is a uniform generated for each point;
  2. stratified resampling, where the weights are added, divided into equal pieces and a uniform is sampled on each piece, which means that points with large weights are sampled at least once and those with small weights at most once;
  3. systematic resampling, which is the same as the above except that the same uniform is used for each piece;
  4. Metropolis resampling, where a Markov chain converges to the distribution (ω_1,…, ω_P) on {1,…,P}.
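
As a toy illustration of the last two schemes (entirely mine, not Murray's GPU code), here is a minimal numpy sketch; the number of Metropolis sweeps B below is an arbitrary choice, which is precisely the tuning issue discussed next.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Systematic resampling: a single uniform shifts an evenly spaced grid
    over the cumulative weights (scheme 3 above)."""
    P = len(weights)
    positions = (rng.uniform() + np.arange(P)) / P
    cumsum = np.cumsum(weights)
    cumsum[-1] = 1.0                                  # guard against rounding error
    return np.searchsorted(cumsum, positions)

def metropolis_resample(weights, rng, B=20):
    """Metropolis resampling: each particle runs B steps of an independent
    Metropolis chain on the ancestor index, targeting the weight distribution
    (scheme 4); only weight ratios are needed, hence the GPU appeal."""
    P = len(weights)
    idx = np.arange(P)
    for _ in range(B):
        prop = rng.integers(0, P, size=P)             # uniform index proposals
        accept = rng.uniform(size=P) < weights[prop] / weights[idx]
        idx = np.where(accept, prop, idx)
    return idx

# toy usage with hypothetical weights
rng = np.random.default_rng(1)
w = rng.exponential(size=8); w /= w.sum()
print(systematic_resample(w, rng))
print(metropolis_resample(w, rng))
```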

The first three resamplers are common in the particle system literature (incl. Nicolas Chopin's PhD thesis), but difficult to adapt to GPUs (and I always feel uncomfortable with the fact that systematic resampling uses a single uniform!), while the last one is more unusual, but actually well suited to a parallel implementation. While Lawrence Murray suggests using Raftery and Lewis' (1992) assessment of the required number of Metropolis iterations to "achieve convergence", I would instead suggest taking advantage of the toric nature of the index space to run a random walk and wait for the equivalent of a complete cycle. In any case, this is a cool illustration of the new challenges posed by parallel implementations (like the development of proper random generators).

recent arXiv postings

Posted in Statistics, University life on October 17, 2011 by xi'an

Three interesting recent arXiv postings and not enough time to read them all and in the ‘Og bind them! (Of course, comments from readers welcome!)

Formulating a statistical inverse problem as one of inference in a Bayesian model has great appeal, notably for what this brings in terms of coherence, the interpretability of regularisation penalties, the integration of all uncertainties, and the principled way in which the set-up can be elaborated to encompass broader features of the context, such as measurement error, indirect observation, etc. The Bayesian formulation comes close to the way that most scientists intuitively regard the inferential task, and in principle allows the free use of subject knowledge in probabilistic model building. However, in some problems where the solution is not unique, for example in ill-posed inverse problems, it is important to understand the relationship between the chosen Bayesian model and the resulting solution. Taking emission tomography as a canonical example for study, we present results about consistency of the posterior distribution of the reconstruction, and a general method to study convergence of posterior distributions. To study efficiency of Bayesian inference for ill-posed linear inverse problems with constraint, we prove a version of the Bernstein-von Mises theorem for nonregular Bayesian models.

(Certainly unlikely to please the member of the audience in Zürich who questioned my Bayesian credentials for considering “true” models and consistency….)

Recently, Andrieu, Doucet and Holenstein (2010) introduced a general framework for using particle filters (PFs) to construct proposal kernels for Markov chain Monte Carlo (MCMC) methods. This framework, termed Particle Markov chain Monte Carlo (PMCMC), was shown to provide powerful methods for joint Bayesian state and parameter inference in nonlinear/non-Gaussian state-space models. However, the mixing of the resulting MCMC kernels can be quite sensitive, both to the number of particles used in the underlying PF and to the number of observations in the data. In this paper we suggest alternatives to the three PMCMC methods introduced in Andrieu et al. (2010), which are much more robust to a low number of particles as well as a large number of observations. We consider some challenging inference problems and show in a simulation study that, for problems where existing PMCMC methods require around 1000 particles, the proposed methods provide satisfactory results with as few as 5 particles.

(I have not read the paper in enough depth to be critical; however, "hard" figures like 5, or 10³, are always suspicious in that they cannot carry over to the general case…)
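
For readers unfamiliar with the framework, here is a bare-bones and entirely generic particle marginal Metropolis-Hastings sketch on a hypothetical toy state-space model (not the authors' improved samplers): the bootstrap particle filter supplies an unbiased likelihood estimate that is plugged into a random-walk Metropolis step, and mixing deteriorates as the number of particles P shrinks, which is the sensitivity the abstract refers to.

```python
import numpy as np

def bootstrap_pf_loglik(y, theta, P, rng):
    """Bootstrap particle filter estimate of the log-likelihood of a toy
    AR(1)-plus-noise model (purely hypothetical, for illustration only):
        x_t = theta * x_{t-1} + N(0, 1),   y_t = x_t + N(0, 1)."""
    x = rng.normal(size=P)                     # particles from a N(0, 1) prior
    loglik = 0.0
    for yt in y:
        x = theta * x + rng.normal(size=P)     # propagate through the dynamics
        logw = -0.5 * (yt - x) ** 2            # Gaussian obs. density, up to a theta-free constant
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())         # unbiased incremental likelihood term
        x = x[rng.choice(P, size=P, p=w / w.sum())]   # multinomial resampling
    return loglik

def pmmh(y, n_iter=2000, P=100, sigma_prop=0.1, seed=0):
    """Particle marginal Metropolis-Hastings: plug the particle filter
    estimate into a random-walk Metropolis step on theta (flat prior)."""
    rng = np.random.default_rng(seed)
    theta, ll, chain = 0.5, -np.inf, []
    for _ in range(n_iter):
        prop = theta + sigma_prop * rng.normal()
        ll_prop = bootstrap_pf_loglik(y, prop, P, rng)
        if np.log(rng.uniform()) < ll_prop - ll:
            theta, ll = prop, ll_prop
        chain.append(theta)
    return np.array(chain)

# hypothetical data simulated from the same toy model with theta = 0.7
rng = np.random.default_rng(42)
x, y = 0.0, []
for _ in range(50):
    x = 0.7 * x + rng.normal()
    y.append(x + rng.normal())
print(pmmh(np.array(y), n_iter=500, P=100).mean())
```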

In this paper we present an algorithm for rapid Bayesian analysis that combines the benefits of nested sampling and artificial neural networks. The blind accelerated multimodal Bayesian inference (BAMBI) algorithm implements the MultiNest package for nested sampling as well as the training of an artificial neural network (NN) to learn the likelihood function. In the case of computationally expensive likelihoods, this allows the substitution of a much more rapid approximation in order to increase significantly the speed of the analysis. We begin by demonstrating, with a few toy examples, the ability of a NN to learn complicated likelihood surfaces. BAMBI’s ability to decrease running time for Bayesian inference is then demonstrated in the context of estimating cosmological parameters from WMAP and other observations. We show that valuable speed increases are achieved in addition to obtaining NNs trained on the likelihood functions for the different model and data combinations. These NNs can then be used for an even faster follow-up analysis using the same likelihood and different priors. This is a fully general algorithm that can be applied, without any pre-processing, to other problems with computationally expensive likelihood functions.

(This is primarily an astronomy paper that uses a sample produced by the nested sampling algorithm MultiNest to build a neural network instead of the model likelihood. The algorithm thus requires the likelihood to be available at some stage.)
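
As a rough sketch of the general surrogate idea (not the BAMBI/MultiNest code, and with a hypothetical toy likelihood standing in for a cosmological one): collect (parameter, log-likelihood) pairs from an initial run, fit a small neural network to them, and reuse the fitted emulator in follow-up analyses where the expensive likelihood would otherwise be called.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical expensive log-likelihood (a banana-shaped toy); in BAMBI the
# training pairs would instead come from the nested sampling (MultiNest) run.
def expensive_loglik(theta):
    return -0.5 * (theta[0] ** 2 + 10.0 * (theta[1] - theta[0] ** 2) ** 2)

rng = np.random.default_rng(0)
train_theta = rng.uniform(-2, 2, size=(2000, 2))       # stand-in for sampler output
train_ll = np.array([expensive_loglik(t) for t in train_theta])

# Train a small neural network to emulate the log-likelihood surface.
surrogate = MLPRegressor(hidden_layer_sizes=(50, 50), max_iter=2000, random_state=0)
surrogate.fit(train_theta, train_ll)

# The emulator can then replace the expensive call in a follow-up analysis,
# e.g. with different priors, provided its accuracy is monitored.
test = rng.uniform(-2, 2, size=(5, 2))
print(np.c_[surrogate.predict(test), [expensive_loglik(t) for t in test]])
```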

workshop in Columbia

Posted in Statistics, Travel, University life on September 24, 2011 by xi'an

The workshop at Columbia University on Computational Methods in Applied Sciences is quite diverse in its topics. It reminded me of the conference on Efficient Monte Carlo at Sandbjerg Estate, Sønderborg, in 2008, celebrating the 70th birthday of Reuven Rubinstein, and included some colleagues I had not met since that meeting. Yesterday I thus heard (quite interesting) talks on domains somewhat far from my own, from Robert Adler on cohomology (giving a second look at the topic after the talk I heard at Wharton last year), to José Blanchet on simulation for infinite server queues (with a link to perfect sampling I could not exactly trace but that was certainly there). Several of the talks made me think of our Brownian motion confidence band paper with Wilfrid Kendall and Jean-Michel Marin, esp. Gennady Samorodnitsky's on the maximum of stochastic processes (and wonder whether we could have gone further in that direction). Pierre Del Moral presented a broad overview of the Feynman-Kac approach to particle methods, in particular particle MCMC, with applications to some financial objects. Paul Glasserman talked about robust MCMC, which I found quite an appealing concept in that it included uncertainties about the model itself, and linked with minimax concepts. And Paul Dupuis exposed a parallel tempering method linked with large deviations, whose paper I am definitely looking forward to. Now it is more than time to work on my own talk! (On a very personal basis, I sadly lost my sturdy Canon camera in the taxi from the airport! Will need a new one for the 'Og!)

WSC 2011, Phoenix

Posted in Statistics, Travel, University life on May 25, 2011 by xi'an

WSC stands for Winter Simulation Conference. I was not aware of this conference series, which has been running since 1971, but I have been invited to give a tutorial on simulation in statistics at the 2011 edition of WSC in Phoenix, Arizona. Since it means meeting a completely new community of people using simulation, mostly outside statistics, I accepted the invitation and will thus spend a few days in December in Arizona… (If the culture gap proves too wide, I can always bike, run, or even climb in South Mountain Park!) In connection with this tutorial, I was requested to write a short paper that I have just posted on arXiv. There is not much originality in the survey, as it is mostly inspired by older chapters written for handbooks. I wish I had had more space to cover particle MCMC, but 12 pages was the upper limit.
