Archive for sufficient statistics

Pre-processing for approximate Bayesian computation in image analysis

Posted in R, Statistics, University life on March 21, 2014 by xi'an

With Matt Moores and Kerrie Mengersen, from QUT, we wrote this short paper just in time for the MCMSki IV Special Issue of Statistics & Computing. And arXived it as well. The global idea is to cut down on the cost of running an ABC experiment by removing the simulation of a humongous state-space vector, as in Potts and hidden Potts models, and replacing it with an approximate simulation of the one-dimensional sufficient (summary) statistic. In this case, we partition the one-dimensional parameter interval into a grid and, for each grid value, simulate the distribution of the sufficient statistic in order to compute its expectation and variance. The conditional distribution of the sufficient statistic is then approximated by a Gaussian with these two parameters, and those Gaussian approximations substitute for the true distributions within an ABC-SMC algorithm à la Del Moral, Doucet and Jasra (2012).
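For readers curious about the mechanics, here is a minimal R sketch of the precomputation step, on a deliberately simple stand-in model (an iid Poisson sample, whose sufficient statistic is the sum) rather than a Potts field, and with a plain rejection sampler in place of the ABC-SMC algorithm of the paper; the grid, prior and tolerance below are arbitrary choices of mine, not values from the paper.

```r
## Gaussian surrogate for the sufficient statistic, precomputed on a parameter grid,
## then plugged into a basic ABC rejection sampler.
set.seed(1)
n    <- 100                                # size of the (pseudo-)data sample
grid <- seq(0.1, 10, by = 0.1)             # grid over the 1-d parameter
M    <- 200                                # simulations per grid point

## precomputation: mean and sd of the sufficient statistic at each grid value
stats <- sapply(grid, function(th) {
  s <- replicate(M, sum(rpois(n, th)))     # sufficient statistic = sum of the sample
  c(mean(s), sd(s))
})
mu_fun  <- approxfun(grid, stats[1, ])     # interpolated mapping theta -> mean
sig_fun <- approxfun(grid, stats[2, ])     # interpolated mapping theta -> sd

## observed data and its sufficient statistic
theta0 <- 3.2
s_obs  <- sum(rpois(n, theta0))

## ABC rejection using the Gaussian surrogate instead of simulating full pseudo-data
N   <- 1e5
eps <- 10                                  # tolerance, arbitrary here
prior_draws <- runif(N, 0.1, 10)
s_sim <- rnorm(N, mu_fun(prior_draws), sig_fun(prior_draws))
posterior <- prior_draws[abs(s_sim - s_obs) < eps]
summary(posterior)
```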

[Figure: residuals]

Across 20 simulated images of 125 × 125 pixels, Matt’s algorithm took an average of 21 minutes per image for between 39 and 70 SMC iterations, while resorting to pseudo-data and deriving the genuine sufficient statistic took an average of 46.5 hours for 44 to 85 SMC iterations. On a realistic Landsat image, with a total of 978,380 pixels, the precomputation of the mapping function took 50 minutes, while the total CPU time on 16 parallel threads was 10 hours 38 minutes. By comparison, it took 97 hours for 10,000 MCMC iterations on this image, with a poor effective sample size of 390 values. Regular SMC-ABC algorithms cannot handle this scale: it takes 89 hours to perform a single SMC iteration! (Note that path sampling also operates in this framework, thanks to the same precomputation: in that case it took 2.5 hours for 10⁵ iterations, with an effective sample size of 10⁴…)

Since my student’s paper on Seaman et al (2012) got promptly rejected by TAS for quoting too extensively from my post, we decided to include me as an extra author and submitted the paper to this special issue as well.

ABC with indirect summary statistics

Posted in Statistics, University life on February 3, 2014 by xi'an

After reading Drovandi and Pettitt’s Bayesian Indirect Inference, I checked (in the plane to Birmingham) Gleim and Pigorsch’s earlier Approximate Bayesian Computation with indirect summary statistics. The setting is indeed quite similar to the above, with a description of three ways of connecting indirect inference with ABC, albeit with a different range of illustrations. This preprint states most clearly its assumption that the generating model is a particular case of the auxiliary model, which sounds anticlimactic since the auxiliary model is used precisely because the original one is mostly out of reach! That was certainly the original motivation for using indirect inference.

The part of the paper that I find the most intriguing is the argument that the indirect approach leads to sufficient summary statistics, in the sense that they “are sufficient for the parameters of the auxiliary model and (…) sufficiency carries over to the model of interest” (p.31). Looking at the details in the Appendix, I found the argument lacking: the likelihood as a functional is shown to be a (sufficient) statistic, which seems both a tautology and beside the point, since this differs from the likelihood evaluated at the (auxiliary) MLE, which is the summary statistic used in fine.

“…we expand the square root of an innovation density h in a Hermite expansion and truncate the infinite polynomial at some integer K which, together with other tuning parameters of the SNP density, has to be determined through a model selection criterion (such as BIC). Now we take the leading term of the Hermite expansion to follow a Gaussian GARCH model.”

As in Drovandi and Pettitt, the performances of the ABC-I schemes are tested on a toy example, a very basic exponential iid sample with a conjugate prior and a gamma model as auxiliary. The authors use a standard ABC based on the first two moments as their benchmark; however, they do not calibrate those moments in the distance and end up with a poor performance of ABC (in a setting where there is a sufficient statistic!). The best choice in this experiment appears to be the solution based on the score, but the variances of the distances are not included in the comparison tables. The second implementation considered in the paper is a rather daunting continuous-time non-Gaussian Ornstein-Uhlenbeck stochastic volatility model à la Barndorff-Nielsen and Shephard (2001). The construction of the semi-nonparametric (why not semi-parametric?) auxiliary model is quite involved as well, as illustrated by the quote above. The approach provides an answer, with posterior ABC-IS distributions on all parameters of the original model, which raises the question of the validation of this answer in terms of the original posterior. Handling several approximation processes simultaneously would help in this regard.
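To make the calibration point concrete, here is a hedged R sketch of the toy setting as I read it: an iid exponential sample, with a standard ABC run on the first two empirical moments, once with a raw Euclidean distance and once with each component rescaled by its prior-predictive spread. Sample size, prior and acceptance fraction are my own arbitrary choices, not the paper's.

```r
## Moment-based ABC for an exponential iid sample: raw vs calibrated distance.
set.seed(42)
n <- 50
lambda0 <- 2
y <- rexp(n, rate = lambda0)
s_obs <- c(mean(y), mean(y^2))              # first two empirical moments

N <- 1e5
lambda <- rgamma(N, shape = 1, rate = 1)    # conjugate gamma prior
## prior predictive simulation of the two moments
sims <- t(sapply(lambda, function(l) { x <- rexp(n, l); c(mean(x), mean(x^2)) }))

## (a) raw Euclidean distance on the moments
d_raw <- sqrt((sims[, 1] - s_obs[1])^2 + (sims[, 2] - s_obs[2])^2)

## (b) calibrated distance: rescale each moment by its prior-predictive MAD
scale2 <- apply(sims, 2, mad)
d_cal <- sqrt(((sims[, 1] - s_obs[1]) / scale2[1])^2 +
              ((sims[, 2] - s_obs[2]) / scale2[2])^2)

## keep the 0.1% closest simulations in each case and compare posterior means
keep_raw <- lambda[rank(d_raw) <= N / 1000]
keep_cal <- lambda[rank(d_cal) <= N / 1000]
c(raw = mean(keep_raw), calibrated = mean(keep_cal), truth = lambda0)
```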

ABC model choice [slides]

Posted in pictures, Statistics, Travel, University life on November 7, 2011 by xi'an

Here are the slides for my talks both at CREST this afternoon (in ½ an hour!) and in Madrid [on Friday 11/11/11=1⁶, magical day of the year, especially since I will be speaking at 11:11 CET...] for the Workshop Métodos Bayesianos 11 (no major difference with the slides from Zürich, hey!, except for the quantile distribution example).

workshop in Columbia [talk]

Posted in Statistics, Travel, University life on September 25, 2011 by xi'an

Here are the slides of my talk yesterday at the Computational Methods in Applied Sciences workshop in Columbia:

The last section of the talk covers our new results with Jean-Michel Marin, Natesh Pillai and Judith Rousseau on the necessary and sufficient conditions for a summary statistic to be used in ABC model choice. (The paper is about to be completed.) This obviously comes as the continuation of our reflexions on ABC model choice started last January. The major message of the paper is that the statistics used for running model choice cannot have a mean value common to both models, which strongly implies using ancillary statistics with different means under each model. (I am afraid that, thanks to the mixture of no-jetlag fatigue, of slide inflation [95 vs. 40mn], and of asymptotics technicalities in the last part, the talk was far from comprehensible. I started on the wrong foot by not getting an XL [Xiao-Li's] comment on the measure-theory problem with the limit in ε going to zero. A pity, given the great debate we had in Banff with Jean-Michel, David Balding, and Mark Beaumont, years ago, and our more recent paper about the arbitrariness of the density value in the Savage-Dickey paradox. I then compounded the confusion by stating that the empirical mean was sufficient in the Laplace case… which is not even an exponential family. I hope I will be more articulate next week in Zürich, where at least I will not speak past my bedtime!)
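As a rough illustration of that message (not of the asymptotic results themselves), here is an R sketch contrasting two summary statistics for ABC model choice between a normal and a Laplace model with the same location and matched variance: the empirical mean, which has the same expectation under both models and hence cannot discriminate, versus a mean-absolute-deviation statistic, whose expectations differ between the two. All tuning choices below (prior, acceptance fraction, sample size) are mine.

```r
set.seed(7)
n <- 100
y <- rnorm(n)                                  # data generated from the normal model
N <- 2e4                                       # prior predictive simulations

## Laplace simulator as a difference of two exponentials (scale b)
rlaplace <- function(n, mu, b) mu + rexp(n, 1 / b) - rexp(n, 1 / b)

abc_counts <- function(stat, frac = 0.005) {
  m  <- sample(1:2, N, replace = TRUE)         # uniform prior over the two models
  mu <- rnorm(N, 0, 2)                         # common location prior
  s  <- numeric(N)
  for (i in 1:N) {
    ## model 1: N(mu, 1); model 2: Laplace(mu, 1/sqrt(2)), i.e. same variance
    x <- if (m[i] == 1) rnorm(n, mu[i], 1) else rlaplace(n, mu[i], 1 / sqrt(2))
    s[i] <- stat(x)
  }
  keep <- rank(abs(s - stat(y))) <= frac * N   # ABC acceptance on |s - s_obs|
  table(factor(m[keep], levels = 1:2))         # accepted simulations per model
}

abc_counts(mean)                                   # roughly 50/50: the mean cannot discriminate
abc_counts(function(x) mean(abs(x - median(x))))   # different expectations, so it can discriminate
```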

ABC and sufficient statistics

Posted in Statistics, University life on July 8, 2011 by xi'an

Chris Barnes, Sarah Filippi, Michael P.H. Stumpf, and Thomas Thorne posted a paper on arXiv on the selection of sufficient statistics for ABC model choice. This paper, called Considerate Approaches to Achieving Sufficiency for ABC model selection, was presented by Chris Barnes during ABC in London two months ago. (Note that all talks of the meeting are now available on Nature Precedings. A neat concept, by the way!) Their paper builds on our earlier warning about (unfounded) ABC model selection to propose a selection of summary statistics that partly alleviates the original problem. (The part about the discrepancy with the true posterior probability remains to be addressed, as does the issue of whether or not the selected collection of statistics provides a convergent model choice inference. We are currently working on it…) Their section “Resuscitating ABC model choice” states the goal of the paper quite clearly:

- this [use of inadequate summary statistics] mirrors problems that can also be observed in the parameter estimation context,
- for many important, and arguably the most important applications of ABC, this problem can in principle be avoided by using the whole data rather than summary statistics,
- in cases where summary statistics are required, we argue that we can construct approximately sufficient statistics in a disciplined manner,
- when all else fails, a change in perspective allows us to nevertheless make use of the flexibility of the ABC framework

The driving idea in the paper is to use an entropy approximation to measure the loss of information due to the use of a given set of summary statistics. The corresponding algorithm then proceeds from a starting pool of summary statistics to build sequentially a collection of the most informative summary statistics (which, in a sense, reminded me of a variable selection procedure based on the Kullback-Leibler divergence that we developed with Costas Goutis and Jérôme Dupuis). It is a very interesting advance on the issue of ABC model selection, even though it cannot eliminate all stumbling blocks. The interpretation that ABC should be processed as an inferential method in its own right, rather than as an approximation to Bayesian inference, is clearly appealing. (Fearnhead and Prangle, and Dean, Singh, Jasra and Peters could be quoted as well.)
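To fix ideas, here is a loose R sketch of what such a sequential (forward) selection of summary statistics might look like, scoring each candidate by an estimated Kullback-Leibler divergence between the ABC posteriors with and without it; this is my own crude rendering of the general idea, not the authors' exact entropy criterion, and the model, pool of candidate statistics and thresholds are all invented for illustration.

```r
set.seed(3)
n <- 100
y <- rnorm(n, mean = 1, sd = 2)                      # "observed" data

## pool of candidate summary statistics
pool <- list(mean = mean, sd = sd, max = max,
             q90 = function(x) quantile(x, 0.9), med = median)

N <- 5e4
theta <- cbind(mu = rnorm(N, 0, 5), sigma = rexp(N, 0.5))   # prior draws
sims <- t(apply(theta, 1, function(p) {
  x <- rnorm(n, p[1], p[2])
  sapply(pool, function(f) f(x))                     # all candidate statistics
}))
obs <- sapply(pool, function(f) f(y))

## ABC posterior sample of mu for a given subset of statistics (rescaled distance)
abc_post <- function(idx, frac = 0.002) {
  d <- sqrt(rowSums(scale(sims[, idx, drop = FALSE],
                          center = obs[idx],
                          scale  = apply(sims[, idx, drop = FALSE], 2, mad))^2))
  theta[rank(d) <= frac * N, "mu"]
}

## crude KL estimate between two samples via kernel density estimates on a grid
kl_hat <- function(a, b) {
  grid <- seq(min(a, b), max(a, b), length.out = 512)
  pa <- approx(density(a), xout = grid, rule = 2)$y + 1e-10
  pb <- approx(density(b), xout = grid, rule = 2)$y + 1e-10
  pa <- pa / sum(pa); pb <- pb / sum(pb)
  sum(pa * log(pa / pb))
}

## greedy forward selection: add the statistic that moves the ABC posterior most
selected <- integer(0)
repeat {
  remaining <- setdiff(seq_along(pool), selected)
  if (!length(remaining)) break
  gains <- sapply(remaining, function(j)
    if (length(selected)) kl_hat(abc_post(c(selected, j)), abc_post(selected))
    else kl_hat(abc_post(j), theta[, "mu"]))         # first step: gain over the prior
  if (max(gains) < 0.05) break                       # arbitrary stopping rule
  selected <- c(selected, remaining[which.max(gains)])
}
names(pool)[selected]
```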

