Archive for Statistics and Computing

Statistics & Computing [toc]

Posted in Books, Statistics on June 29, 2016 by xi'an

The latest [June] issue of Statistics & Computing is full of interesting Bayesian and Monte Carlo entries, some of which are even open access!

 

more of the same!

Posted in Books, pictures, Statistics, University life on December 10, 2015 by xi'an

Daniel Seita, Haoyu Chen, and John Canny arXived last week a paper entitled “Fast parallel SAME Gibbs sampling on general discrete Bayesian networks“. The distributions of the observables are defined by full conditional probability tables on the nodes of a graphical model. The distributions on the latent or missing nodes of the network are multinomial, with Dirichlet priors. To derive the MAP in such models, although this goal is not explicitly stated in the paper till the second page, the authors refer to the recent paper by Zhao et al. (2015), discussed on the ‘Og just as recently, which applies our SAME methodology. Since the paper is mostly computational (and submitted to ICLR 2016, which takes place juuust before AISTATS 2016), I do not have much to comment about it. Except to notice that the authors mention our paper as “Technical report, Statistics and Computing, 2002”. I am not sure the editor of Statistics and Computing will appreciate! The proper reference is in Statistics and Computing, 12:77-84, 2002.

“We argue that SAME is beneficial for Gibbs sampling because it helps to reduce excess variance.”

Still, I am a wee bit surprised at both the above statement and at the comparison with a JAGS implementation. Because SAME augments the number of latent vectors as the number of iterations increases, it should be slower, by a mere curse of dimension, than a regular Gibbs sampler with a single latent vector. And because I do not get the connection with JAGS either: SAME could be programmed in JAGS, couldn’t it? If the authors mean a regular Gibbs sampler with no latent vector augmentation, the comparison makes little sense, as one algorithm aims at the MAP (with a modest five replicas) while the other encompasses the complete posterior distribution. But this sounds unlikely when considering that the larger the number m of replicas, the better their alternative fares against JAGS. It would thus be interesting to understand what the authors mean by JAGS in this setup!
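
Just to fix ideas on what the augmentation does, here is a minimal R sketch of the SAME mechanism on a toy Bernoulli model with missing entries (my own illustration, not the authors’ code nor their Bayesian-network setting; all settings below are made up): replicating the latent entries m times makes the Gibbs sampler target the posterior raised to the power m, which concentrates at the MAP as m grows.

```r
## toy SAME Gibbs sketch: x_i ~ Bernoulli(theta), Beta(a, b) prior on theta,
## with n_mis missing observations replicated m times (hypothetical example)
set.seed(1)
a <- 2; b <- 2                       # Beta prior hyperparameters
y <- rbinom(20, 1, 0.7)              # observed part of the data
n_mis <- 10                          # number of missing observations
m <- 5                               # number of SAME replicas of the latents
n_iter <- 5000
theta <- 0.5
draws <- numeric(n_iter)
for (t in 1:n_iter) {
  # one replicated latent vector per SAME copy, drawn from its conditional
  x_mis <- matrix(rbinom(m * n_mis, 1, theta), nrow = m)
  s_mis <- sum(x_mis)
  # conditional of theta: prior and observed likelihood are both raised to
  # the power m in the SAME target, the replicated latents enter additively
  a_post <- m * (a - 1) + m * sum(y) + s_mis + 1
  b_post <- m * (b - 1) + m * sum(1 - y) + (m * n_mis - s_mis) + 1
  theta <- rbeta(1, a_post, b_post)
  draws[t] <- theta
}
mean(draws[-(1:1000)])               # concentrates near the MAP as m grows
```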

Statistics and Computing special issue on BNP

Posted in Books, Statistics, University life on June 16, 2015 by xi'an

[verbatim from the call for papers:]

Statistics and Computing is preparing a special issue on Bayesian Nonparametrics, for publication by early 2016. We invite researchers to submit manuscripts for publication in the special issue. We expect that the focus theme will increase the visibility and impact of papers in the volume.

By making use of infinite-dimensional mathematical structures, Bayesian nonparametric statistics allows the complexity of a learned model to grow as the size of a data set grows. This flexibility can be particularly suited to modern data sets but can also present a number of computational and modelling challenges. In this special issue, we will showcase novel applications of Bayesian nonparametric models, new computational tools and algorithms for learning these models, and new models for the diverse structures and relations that may be present in data.

To submit to the special issue, please use the Statistics and Computing online submission system. To indicate consideration for the special issue, choose “Special Issue: Bayesian Nonparametrics” as the article type. Papers must be prepared in accordance with the Statistics and Computing journal guidelines.

Papers will go through the usual peer review process. The special issue website will be updated with any relevant deadlines and information.

Deadline for manuscript submission: August 20, 2015

Guest editors:
Tamara Broderick (MIT)
Katherine Heller (Duke)
Peter Mueller (UT Austin)

scalable Bayesian inference for the inverse temperature of a hidden Potts model

Posted in Books, R, Statistics, University life on April 7, 2015 by xi'an

Matt Moores, Tony Pettitt, and Kerrie Mengersen arXived a paper yesterday comparing different computational approaches to the processing of hidden Potts models and of the intractable normalising constant in the Potts model. This is a very interesting paper, first because it provides a comprehensive survey of the main methods used in handling this annoying normalising constant Z(β), namely pseudo-likelihood, the exchange algorithm, path sampling (a.k.a. thermodynamic integration), and ABC. A massive simulation experiment, with individual simulation times of up to 400 hours, leads to selecting path sampling (what else?!) as the (XL) method of choice, thanks to a pre-computation of the expectation of the sufficient statistic E[S(Z)|β]. I just wonder why the same was not done for ABC, as in the recent Statistics and Computing paper we wrote with Matt and Kerrie. As it happens, I was actually discussing at Columbia yesterday potential, if huge, improvements in processing Ising and Potts models by first approximating the distribution of S(X) for some or all β before launching ABC or the exchange algorithm. (In fact, this is a more generic desideratum for all ABC methods: simulating the summary statistics directly, if approximately, would bring huge gains in computing time, and thus possibly in final precision.) Simulating the distribution of the summary and sufficient Potts statistic S(X) reduces to simulating this distribution with a null correlation, as exploited in Cucala and Marin (2013, JCGS, Special ICMS issue). However, there does not seem to be an efficient way to do so, i.e. without resorting to simulating the entire grid X…
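
For the record, the pre-computation exploits the path-sampling identity d log Z(β)/dβ = E[S(Z)|β], so that once the curve E[S(Z)|β] is available the log-ratio of normalising constants reduces to a one-dimensional integral. A hedged R illustration, with a toy stand-in curve rather than anything taken from the paper:

```r
## hypothetical sketch: trapezoidal integration of a pre-computed E[S(Z)|beta]
## curve to recover log Z(1) - log Z(0); the curve below is a toy stand-in
beta_grid <- seq(0, 1, length.out = 101)
ES <- 50 * beta_grid^2 / (1 + beta_grid)          # stand-in for E[S(Z)|beta]
log_Z_ratio <- sum(diff(beta_grid) * (head(ES, -1) + tail(ES, -1)) / 2)
log_Z_ratio                                       # approximates log Z(1) - log Z(0)
```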

Bayesian computation: fore and aft

Posted in Books, Statistics, University life on February 6, 2015 by xi'an

With my friends Peter Green (Bristol), Krzysztof Łatuszyński (Warwick) and Marcello Pereyra (Bristol), we just arXived the first version of “Bayesian computation: a perspective on the current state, and sampling backwards and forwards”, whose first title was the title of this post. This is a survey of our own perspective on Bayesian computation, from what occurred in the last 25 years [a lot!] to what could occur in the near future [a lot as well!]. It was submitted to Statistics and Computing for the special 25th anniversary issue, as announced in an earlier post. Pulling strength and breadth from each other’s opinions, we have certainly attained more than the sum of our initial respective contributions, but we welcome comments about bits and pieces of importance that we missed, and even more about promising new directions that are not covered in this survey. (A warning that should go with most of my surveys is that my input in this paper will not differ by a large margin from ideas expressed here or in previous surveys.)

Pre-processing for approximate Bayesian computation in image analysis

Posted in R, Statistics, University life on March 21, 2014 by xi'an

With Matt Moores and Kerrie Mengersen, from QUT, we wrote this short paper just in time for the MCMSki IV Special Issue of Statistics & Computing. And arXived it as well. The global idea is to cut down on the cost of running an ABC experiment by removing the simulation of a humongous state-space vector, as in Potts and hidden Potts models, and replacing it by an approximate simulation of the 1-d sufficient (summary) statistic. In our case, we divide the 1-d parameter interval into a grid of values, simulate the distribution of the sufficient statistic for each of those parameter values, and compute the corresponding expectation and variance. The conditional distribution of the sufficient statistic is then approximated by a Gaussian with these two parameters, and those Gaussian approximations substitute for the true distributions within an ABC-SMC algorithm à la Del Moral, Doucet and Jasra (2012).
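
As a rough illustration of this pre-processing step, here is a hedged R sketch that grids the parameter range, estimates the mean and variance of the summary statistic by offline simulation, and then uses the Gaussian surrogate in place of pseudo-data, inside a plain rejection sampler rather than the ABC-SMC algorithm of the paper (the simulator and all settings below are toy stand-ins, not the Potts model):

```r
## offline stage: estimate mean and variance of the summary statistic on a grid
set.seed(2)
simulate_stat <- function(beta) rnorm(1, mean = 100 * beta, sd = 5 + 20 * beta)  # toy stand-in

beta_grid <- seq(0, 1, length.out = 50)
moments <- t(sapply(beta_grid, function(b) {
  s <- replicate(200, simulate_stat(b))       # pre-computation, done once
  c(mean = mean(s), var = var(s))
}))
mu_fun  <- approxfun(beta_grid, moments[, "mean"])   # interpolated moment maps
var_fun <- approxfun(beta_grid, moments[, "var"])

## online stage: plain ABC rejection with the Gaussian surrogate replacing
## the costly simulation of pseudo-data
s_obs <- 60                                   # observed summary statistic
eps   <- 2                                    # ABC tolerance
prior_draws <- runif(1e5)                     # uniform prior on [0, 1]
s_surrogate <- rnorm(length(prior_draws), mu_fun(prior_draws), sqrt(var_fun(prior_draws)))
posterior_sample <- prior_draws[abs(s_surrogate - s_obs) < eps]
length(posterior_sample); mean(posterior_sample)
```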

[figure: residuals]

Across 20 simulated images of 125 × 125 pixels, Matt’s algorithm took an average of 21 minutes per image, for between 39 and 70 SMC iterations, while resorting to pseudo-data and deriving the genuine sufficient statistic took an average of 46.5 hours, for 44 to 85 SMC iterations. On a realistic Landsat image, with a total of 978,380 pixels, the precomputation of the mapping function took 50 minutes, while the total CPU time on 16 parallel threads was 10 hours 38 minutes. By comparison, it took 97 hours for 10,000 MCMC iterations on this image, with a poor effective sample size of 390 values. Regular SMC-ABC algorithms cannot handle this scale: it takes 89 hours to perform a single SMC iteration! (Note that path sampling also operates in this framework, thanks to the same precomputation: in that case it took 2.5 hours for 10⁵ iterations, with an effective sample size of 10⁴…)

Since my student’s paper on Seaman et al. (2012) got promptly rejected by TAS for quoting too extensively from my post, we decided to include me as an extra author and submitted the paper to this special issue as well.

Statistics and Computing special MCMSk’issue [call for papers]

Posted in Books, Mountains, R, Statistics, University life on February 7, 2014 by xi'an

Following the exciting and innovative talks, posters and discussions at MCMski IV, the editor of Statistics and Computing, Mark Girolami (who also happens to be the new president-elect of the BayesComp section of ISBA, which is taking over the management of future MCMski meetings), kindly proposed to publish a special issue of the journal open to all participants in the meeting. Not only to speakers, mind, but to all participants.

So if you are interested in submitting a paper to this special issue of a computational statistics journal that is very close to our MCMski themes, I encourage you to do so. (Especially if you missed the COLT 2014 deadline!) The deadline for submissions is set at March 15 (a wee bit tight, but we would dearly like to publish the issue in 2014, namely the same year as the meeting). Submissions are to be made through the Statistics and Computing portal, with a mention that they are intended for the special issue.

An editorial committee chaired by Antonietta Mira and composed of Christophe Andrieu, Brad Carlin, Nicolas Chopin, Jukka Corander, Colin Fox, Nial Friel, Chris Holmes, Gareth Jones, Peter Müller, Geoff Nicholls, Gareth Roberts, Håvard Rue, Robin Ryder, and myself, will examine the submissions and get back to the authors within a few weeks. In a spirit similar to the JRSS Read Paper procedure, submissions will first be examined collectively, before being sent to referees. We plan to publish the reviews as well, in order to include a global set of comments on the accepted papers. We intend to do it in The Economist style, i.e. as a set of edited anonymous comments. Usual instructions for Statistics and Computing apply, with the additional requirements that the paper should be around 10 pages and include at least one author who took part in MCMski IV.
