Archive for summary statistics

Pre-processing for approximate Bayesian computation in image analysis

Posted in R, Statistics, University life on March 21, 2014 by xi'an

With Matt Moores and Kerrie Mengersen, from QUT, we wrote this short paper just in time for the MCMSki IV Special Issue of Statistics & Computing, and arXived it as well. The global idea is to cut down on the cost of running an ABC experiment by removing the simulation of a humongous state-space vector, as in Potts and hidden Potts models, and replacing it with an approximate simulation of the 1-d sufficient (summary) statistic. In practice, we divide the 1-d parameter interval into a grid, simulate the distribution of the sufficient statistic at each grid value, and compute its expectation and variance. The conditional distribution of the sufficient statistic is then approximated by a Gaussian with these two parameters, and those Gaussian approximations substitute for the true distributions within an ABC-SMC algorithm à la Del Moral, Doucet and Jasra (2012).
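As a rough illustration of the pre-computation idea (my own sketch, not the authors' code, with a purely hypothetical stand-in for the Potts sampler), the following R snippet builds the mapping from the parameter to the mean and standard deviation of the sufficient statistic on a grid, then plugs the resulting Gaussian surrogate into a plain ABC rejection step:

## hypothetical stand-in for a Potts sampler returning the sufficient
## statistic S(z), i.e. the number of agreeing neighbour pairs
simulate_potts_stat <- function(beta, n = 125^2) {
  ## placeholder: replace with a Swendsen-Wang or Gibbs sampler
  rnorm(1, mean = n * beta, sd = sqrt(n))
}

## pre-computation: mean and sd of the statistic on a grid of beta values
beta_grid <- seq(0, 2, by = 0.05)
precomp <- t(sapply(beta_grid, function(b) {
  s <- replicate(50, simulate_potts_stat(b))
  c(mean = mean(s), sd = sd(s))
}))
mu_fun <- approxfun(beta_grid, precomp[, "mean"])
sd_fun <- approxfun(beta_grid, precomp[, "sd"])

## Gaussian surrogate for the distribution of the summary statistic
surrogate_stat <- function(beta) rnorm(1, mu_fun(beta), sd_fun(beta))

## basic ABC rejection using the surrogate (the SMC layer is omitted)
abc_surrogate <- function(s_obs, n_sim = 1e4, eps = 100) {
  beta_prop <- runif(n_sim, 0, 2)
  keep <- abs(sapply(beta_prop, surrogate_stat) - s_obs) < eps
  beta_prop[keep]
}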


Across 20 simulated images of 125 × 125 pixels, Matt's algorithm took an average of 21 minutes per image for between 39 and 70 SMC iterations, while resorting to pseudo-data and deriving the genuine sufficient statistic took an average of 46.5 hours for 44 to 85 SMC iterations. On a realistic Landsat image, with a total of 978,380 pixels, the precomputation of the mapping function took 50 minutes, while the total CPU time on 16 parallel threads was 10 hours 38 minutes. By comparison, it took 97 hours to run 10,000 MCMC iterations on this image, with a poor effective sample size of 390 values. Regular SMC-ABC algorithms cannot handle this scale: it takes 89 hours to perform a single SMC iteration! (Note that path sampling also operates in this framework, thanks to the same precomputation: in that case it took 2.5 hours for 10⁵ iterations, with an effective sample size of 10⁴…)

Since my student’s paper on Seaman et al (2012) got promptly rejected by TAS for quoting too extensively from my post, we decided to include me as an extra author and submitted the paper to this special issue as well.

JSM2014, Boston

Posted in pictures, Statistics, Travel, University life on December 3, 2013 by xi'an

I submitted my abstract for JSM2014, just in time! Thanks to Veronika Rockova, now at The Wharton School, for organising this IMS session on Advances in Model Selection (Wednesday, 8/6/2014, 8:30)!

Title:
Abstract:
Key words:

Maximum likelihood vs. likelihood-free quantum system identification in the atom maser

Posted in Books, Statistics, University life on December 2, 2013 by xi'an

This paper (arXived a few days ago) compares maximum likelihood with different ABC approximations in a quantum physics setting, for an atom maser model that essentially boils down to a hidden Markov model. (I mostly blanked out on the physics explanations, so I cannot say I understand the model at all.) While the authors (from the University of Nottingham, hence Robin's statue above…) do not consider the recent corpus of work by Ajay Jasra and coauthors (some of which was discussed on the 'Og), they get interesting findings for an equally interesting model. First, when comparing the Fisher informations on the sole parameter of the model, the "Rabi angle" φ, for two different sets of statistics, one gets to zero at a certain value of the parameter, while the (fully informative) other is maximal (Figure 6). This is quite intriguing, esp. given the shape of the information in the former case, which reminds me of (my) inverse normal distributions. Second, the authors compare different collections of summary statistics in terms of ABC distributions against the likelihood function. While most bring much more uncertainty into the analysis, the whole collection recovers the range and shape of the likelihood function, which is nice. Third, they also use a Kolmogorov-Smirnov distance to run their ABC, which is enticing, except that I cannot fathom from the paper when one would have enough of a sample (conditional on a parameter value) to rely on what is essentially an estimate of the sampling distribution. This seems to contradict the fact that they only use seven summary statistics. Or it may be that the "statistic" of waiting times happens to be a vector, in which case a Kolmogorov-Smirnov distance can indeed be adopted for the distance… The fact that the grouped seven-dimensional summary statistic provides the best ABC fit is somewhat of a surprise when considering that the problem involves a single parameter.
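To make the Kolmogorov-Smirnov option more concrete, here is a minimal R sketch (my own, not the authors' code) of an ABC rejection step in which the "statistic" is the whole vector of waiting times and the distance is the two-sample Kolmogorov-Smirnov statistic; simulate_waiting_times() is a purely hypothetical stand-in for the atom-maser simulator:

## hypothetical waiting-time simulator, standing in for the atom maser model
simulate_waiting_times <- function(phi, n = 500) {
  rexp(n, rate = abs(sin(phi)) + 0.1)
}

## ABC rejection with a two-sample Kolmogorov-Smirnov distance
abc_ks <- function(obs, n_sim = 1e4, eps = 0.1) {
  phi_prop <- runif(n_sim, 0, pi)
  dist <- sapply(phi_prop, function(phi) {
    sim <- simulate_waiting_times(phi, length(obs))
    ks.test(obs, sim)$statistic
  })
  phi_prop[dist < eps]
}

## obs <- simulate_waiting_times(phi = 0.8)   # pseudo-observed data
## post <- abc_ks(obs)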

“However, in practice, it is often difficult to find an s(.) which is sufficient.”

A point that irks me in most ABC papers is finding quotes like the above, since in most models it is easy to show that there cannot be a non-trivial sufficient statistic! As soon as one leaves the exponential family cocoon, one is doomed in this respect!

accelerated ABC

Posted in R, Statistics, Travel, University life on October 17, 2013 by xi'an

On the flight back from Warwick, I read a fairly recently arXived paper by Umberto Picchini and Julie Forman entitled "Accelerating inference for diffusions observed with measurement error and large sample sizes using Approximate Bayesian Computation: A case study", which relates to earlier ABC works (and the MATLAB abc-sde package) by the first author (earlier works I had missed). Among other things, the authors propose an acceleration device for ABC-MCMC: when a value is drawn from the proposal, the part of the Metropolis-Hastings acceptance probability that does not involve the pseudo-data can be computed and compared with a uniform variate before simulating the pseudo-data. In case of rejection, the pseudo-data need not be simulated; in case of acceptance, it is compared with the observed data as usual. This is interesting for two reasons: first, it always speeds up the algorithm; second, it shows the strict limitations of ABC-MCMC, since the rejection takes place without incorporating the information contained in the data. (Even when the proposal incorporates this information, the comparison with the prior does not go this way.) This also relates to one of my open problems, namely how to simulate directly summary statistics without simulating the whole pseudo-dataset.
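Here is a minimal R sketch of this early-rejection device (my own rendering rather than the authors' implementation, with hypothetical log_prior(), simulate_pseudo() and summary_stat() functions passed as arguments): the uniform variate is tested against the prior ratio before any pseudo-data is generated, so a rejection at that stage costs essentially nothing.

abc_mcmc_early <- function(s_obs, theta0, log_prior, simulate_pseudo, summary_stat,
                           n_iter = 1e4, eps = 1, sd_prop = 0.5) {
  theta <- numeric(n_iter)
  theta[1] <- theta0
  for (t in 2:n_iter) {
    prop <- rnorm(1, theta[t - 1], sd_prop)   # symmetric random-walk proposal
    u <- runif(1)
    ## step 1: cheap check of the prior ratio, before any simulation
    if (log(u) > log_prior(prop) - log_prior(theta[t - 1])) {
      theta[t] <- theta[t - 1]                # early rejection, no pseudo-data
      next
    }
    ## step 2: simulate pseudo-data only when step 1 passes
    s_sim <- summary_stat(simulate_pseudo(prop))
    theta[t] <- if (abs(s_sim - s_obs) < eps) prop else theta[t - 1]
  }
  theta
}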

Another thing (related to acceleration) is that the authors use a simulated subsample rather than the full simulated sample in order to gain time: this worries me somewhat, as the statistic corresponding to the observed data is based on the whole observed dataset. I thus wonder how both statistics can be compared, since they have different distributions and variabilities, even when using the same parameter value. Or is this a sort of plug-in/bootstrap principle, the true parameter being replaced with its estimator based on the whole data? Maybe this does not matter in the end (when compared with the several levels of approximation)…
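A tiny simulation (a toy normal example of my own, not from the paper) illustrates the worry: the same statistic computed on a subsample of one tenth the size has a standard deviation about √10 times larger, so its distance to the observed statistic lives on a different scale.

set.seed(1)
theta <- 2
stat_full <- replicate(1e3, mean(rnorm(1e4, theta)))   # statistic at the full sample size
stat_sub  <- replicate(1e3, mean(rnorm(1e3, theta)))   # statistic at the subsample size
c(sd_full = sd(stat_full), sd_sub = sd(stat_sub))      # differ by about sqrt(10)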

proper likelihoods for Bayesian analysis

Posted in Books, Statistics, University life on April 11, 2013 by xi'an

While in Montpellier yesterday (where I also had the opportunity of tasting an excellent local wine!), I had a look at the 1992 Biometrika paper by Monahan and Boos on "Proper likelihoods for Bayesian analysis". This is a paper I had missed and that was pointed out to me during the discussions in Padova. The main point of this short paper is to decide when a method based on an approximate likelihood function is truly (or properly) Bayes. Just the very question a bystander would ask of ABC methods, wouldn't it?! The validation proposed by Monahan and Boos is one of calibration of credible sets, just as in the recent arXiv paper of Dennis Prangle, Michael Blum, G. Popovic and Scott Sisson I reviewed three months ago. The idea is indeed to check by simulation that the true posterior coverage of an α-level credible set equals the nominal coverage α. In other words, the posterior quantile of the true parameter value under the likelihood approximation should be uniformly distributed, which leads to a goodness-of-fit test based on simulations. As in our ABC model choice paper, "Proper likelihoods for Bayesian analysis" notices that Bayesian inference drawn upon an insufficient statistic is proper and valid, simply less accurate than the Bayesian inference drawn upon the whole dataset. The paper also states a conjecture:

A[n approximate] likelihood L is a coverage proper Bayesian likelihood if and only if L has the form L(y|θ) = c(s) g(s|θ), where s = S(y) is a statistic with density g(s|θ) and c(s) is some function depending on s alone.

A conjecture that sounds incorrect, in that noisy ABC is also well-calibrated. (I am not 100% sure of this argument, though.) An interesting section covers the case of pivotal densities as substitute likelihoods, and the confusion created by the double meaning of the parameter θ. The last section is also connected with ABC, in that Monahan and Boos reflect on the use of large sample approximations, like normal distributions for estimates of θ, which are a special kind of statistic, but do not report formal results on the asymptotic validation of such approximations. All in all, a fairly interesting paper!

Reading this highly interesting paper also made me realise that the criticism I had made in my review of Prangle et al., about the difficulty for this calibration method to address the issue of summary statistics, was incorrect: when using the true likelihood function, the use of an arbitrary summary statistic is validated by this method and is thus proper.
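For illustration, here is a hedged R sketch of the coverage-calibration check in a toy normal-normal model of my own (not taken from Monahan and Boos): the posterior based on the sample mean is available in closed form, the posterior quantiles of the true parameter should be uniform, and a Kolmogorov-Smirnov test assesses that uniformity.

set.seed(42)
n_rep <- 1e3
n <- 20
coverage_q <- replicate(n_rep, {
  theta <- rnorm(1)              # prior N(0,1)
  y <- rnorm(n, theta)           # data N(theta,1)
  s <- mean(y)                   # summary statistic
  ## posterior of theta given s: N(n*s/(n+1), 1/(n+1))
  pnorm(theta, mean = n * s / (n + 1), sd = sqrt(1 / (n + 1)))
})
ks.test(coverage_q, "punif")     # goodness-of-fit test for uniformity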
