Archive for ABC-MCMC

thick disc formation scenario of the Milky Way evaluated by ABC

Posted in Statistics, University life on July 9, 2014 by xi'an

“The facts that the thick-disc episode lasted for several billion years, that a contraction is observed during the collapse phase, and that the main thick disc has a constant scale height with no flare argue against the formation of the thick disc through radial migration. The most probable scenario for the thick disc is that it formed while the Galaxy was gravitationally collapsing from well-mixed gas-rich giant clumps that were sustained by high turbulence, which prevented a thin disc from forming for a time, as proposed previously.”

Following discussions with astronomers from Besancon on the use of ABC methods to approximate posteriors, I was associated with their paper on assessing a formation scenario of the Milky Way, which was accepted a few weeks ago in Astronomy & Astrophysics. The central problem (was there a thin-then-thick disc?) somewhat escapes me, but this collaboration started when some of the astronomers leading the study contacted me about convergence issues with their MCMC algorithms and I realised they were using ABC-MCMC without any idea that it was in fact called ABC-MCMC and had been studied previously in another corner of the literature… The scale in the kernel was chosen to achieve an average acceptance rate of 5%-10%. Models are then compared by combining a log-likelihood approximation resulting from the ABC modelling with a BIC ranking of the models. (Incidentally, I was impressed at the number of papers published in Astronomy & Astrophysics: the monthly issue contains dozens of papers!)

a pseudo-marginal perspective on the ABC algorithm

Posted in Mountains, pictures, Statistics, University life on May 5, 2014 by xi'an

My friends Luke Bornn, Natesh Pillai and Dawn Woodard just arXived, along with Aaron Smith, a short note on the convergence properties of ABC when compared with acceptance-rejection and with regular MCMC. Unsurprisingly, ABC does worse in both cases. What is central to this note is that ABC can be (re)interpreted as a pseudo-marginal method where the data comparison step acts like an unbiased estimator of the ABC target (not of the original target, mind!). From there, it is mostly an application of Christophe Andrieu's and Matti Vihola's results in this setup. The authors also argue that using a single pseudo-data simulation per parameter value is the optimal strategy (as compared with using several), when considering asymptotic variance. This makes sense in terms of simulating in a larger dimensional space, but what of the cost of producing those pseudo-datasets against the cost of producing a new parameter? There are only a few (rare) cases where the pseudo-datasets are much cheaper to produce than a new parameter value.
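
To make the pseudo-marginal connection concrete, here is a minimal R sketch (mine, not taken from the note) on a toy normal-mean example: the acceptance frequency of M pseudo-datasets falling within the tolerance is a non-negative unbiased estimator of the ABC likelihood, and plugging it into a Metropolis-Hastings ratio gives back ABC-MCMC, with M = 1 as the single-simulation case discussed above.

## toy illustration of ABC-MCMC as a pseudo-marginal algorithm (my sketch,
## not the authors' code): the empirical frequency mean(|s(z)-s(y)| < eps)
## over M pseudo-datasets is unbiased for the ABC likelihood, and M = 1
## recovers the usual ABC-MCMC acceptance step
set.seed(42)
y <- rnorm(20, mean = 1)        # toy observed data
sobs <- mean(y)                 # summary statistic: sample mean
eps <- 0.1                      # ABC tolerance
M <- 1                          # pseudo-datasets per proposed value

abclik <- function(theta) {     # unbiased estimate of P(|s(z)-s(y)| < eps | theta)
  s <- replicate(M, mean(rnorm(length(y), mean = theta)))
  mean(abs(s - sobs) < eps)
}

niter <- 1e4
theta <- numeric(niter)
theta[1] <- sobs
repeat { lhat <- abclik(theta[1]); if (lhat > 0) break }
for (t in 2:niter) {
  prop <- theta[t - 1] + rnorm(1, sd = 0.5)   # symmetric random walk proposal
  lprop <- abclik(prop)
  # flat prior: the pseudo-marginal MH ratio reduces to lprop / lhat
  if (runif(1) < lprop / lhat) {
    theta[t] <- prop
    lhat <- lprop
  } else theta[t] <- theta[t - 1]
}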

¼th i-like workshop in St. Anne’s College, Oxford

Posted in pictures, Statistics, Travel, University life on March 27, 2014 by xi'an

Due to my travelling to and from Nottingham for the seminar, and to heading back home early enough to avoid the dreary evening trains from Roissy airport (no luck there: even at 8pm the RER train was not operating efficiently!, and no fast lane is planned prior to 2023…), I did not see many talks at the i-like workshop. About ¼th, roughly… I even missed the poster session (and the most attractive title of Lazy ABC by Dennis Prangle) thanks to another dreary train ride from Derby to Oxford.

As it happened, I had already heard or read parts of the talks in the Friday morning session, but this only made them easier to follow. As in Banff, Paul Fearnhead's talk on reparameterisations for pMCMC on hidden Markov models opened a wide door to possible experiments on those algorithms. The examples in the talk were mostly of the parameter duplication type, somewhat creating unidentifiability to decrease correlation, but I also wondered at the possibility of introducing frequent replicas of the hidden chain in order to fight degeneracy. Then Sumeet Singh gave a talk on the convergence properties of noisy ABC for approximate MLE. Although I had read some of the papers behind the talk, it made me realise how keeping balls around each observation in the ABC acceptance step does not lead to extinction as the number of observations increases. (Sumeet also had a good line with his ABCDE algorithm, standing for ABC done exactly!) Anthony Lee covered his joint work with Krys Łatuszyński on the ergodicity conditions of the ABC-MCMC algorithm, the only positive case being the 1-hit algorithm discussed in an earlier post. This result will hopefully get more publicity, as I frequently read that increasing the number of pseudo-samples has no clear impact on the ABC approximation. Krys Łatuszyński concluded the morning with an aggregate of the various results he and his co-authors had obtained on the fascinating Bernoulli factory, including constructive derivations.

After a few discussions on and around research topics, it was all too soon time to take advantage of the grand finale of a March shower to walk from St. Anne's College to Oxford Station, in order to start the trip back home. I was lucky enough to find a seat and could start experimenting in R with the new idea my trip to Nottingham had raised! I also chatted a wee bit with my neighbour, a delightful old lady from the New Forest travelling to Coventry, recovering from a brain seizure, who wondered about my LaTeX code syntax despite the tiny fonts, and who most suddenly popped a small screen from her bag to start playing Candy Crush!, apologizing all the same. The overall trip was just long enough for my R code to validate this idea of mine, making this week in England quite a profitable one!!!

accelerated ABC

Posted in R, Statistics, Travel, University life on October 17, 2013 by xi'an

On the flight back from Warwick, I read a fairly recently arXived paper by Umberto Picchini and Julie Forman entitled “Accelerating inference for diffusions observed with measurement error and large sample sizes using Approximate Bayesian Computation: A case study”, which relates to earlier ABC works (and the MATLAB abc-sde package) by the first author (earlier works I had missed). Among other things, the authors propose an acceleration device for ABC-MCMC: when simulating from the proposal, the Metropolis-Hastings acceptance probability can be computed and compared with a uniform rv prior to simulating pseudo-data. In case of rejection, the pseudo-data need not be simulated; in case of acceptance, it is compared with the observed data as usual. This is interesting for two reasons: first, it always speeds up the algorithm; second, it shows the strict limitations of ABC-MCMC, since the rejection takes place without incorporating the information contained in the data. (Even when the proposal incorporates this information, the comparison with the prior does not go this way.) This also relates to one of my open problems, namely how to simulate directly summary statistics without simulating the whole pseudo-dataset.
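
Here is a generic R sketch of this early-rejection device (my own rendering, not code from the abc-sde package), assuming a symmetric proposal so that only the prior ratio enters the first test; prior_dens, rproposal, simulate and dist are user-supplied placeholders.

## early-rejection ABC-MCMC sketch: the uniform draw is compared with the
## prior ratio before any pseudo-data is generated, so that rejected
## proposals cost no simulation (symmetric proposal assumed)
abcmcmc_early <- function(niter, theta0, prior_dens, rproposal, simulate, dist, eps) {
  p <- length(theta0)
  theta <- matrix(NA, niter, p)
  theta[1, ] <- theta0
  for (t in 2:niter) {
    cur <- theta[t - 1, ]
    prop <- rproposal(cur)
    # step 1: cheap test, no pseudo-data involved
    if (runif(1) > prior_dens(prop) / prior_dens(cur)) {
      theta[t, ] <- cur                  # early rejection
      next
    }
    # step 2: simulate pseudo-data and apply the usual ABC comparison
    z <- simulate(prop)
    theta[t, ] <- if (dist(z) < eps) prop else cur
  }
  theta
}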

Another thing (related to acceleration) is that the authors use a simulated subsample rather than the full simulated sample in order to gain time: this worries me somehow, as the statistics corresponding to the observed data are based on the whole observed sample. I thus wonder how both statistics can be compared, since they have different distributions and variabilities, even when using the same parameter value. Or is this a sort of plug-in/bootstrap principle, the true parameter being replaced with its estimator based on the whole data? Maybe this does not matter in the end (when compared with the several levels of approximation)…

interacting particles ABC

Posted in Statistics on August 27, 2012 by xi'an

Carlo Albert and Hans Kuensch recently posted an arXiv paper which provides a new perspective on ABC. It relates to ABC-MCMC and to ABC-SMC in different ways, but the major point is to propose a sequential schedule for decreasing the tolerance that ensures convergence. Although there exist other proofs of convergence in the literature, this one is quite novel in that it connects ABC with the cooling schedules of simulated annealing. (The fact that the sample size does not appear, as it does in Fearnhead and Prangle's non-parametric perspective, can be deemed less practical, but I think this is simply another perspective on the problem!) The corresponding ABC algorithm is a mix of MCMC and SMC in that it lets a population of N particles evolve in a quasi-independent manner, the population being only used to update the parameters of the independent (normal) proposal and those of the cooling tolerance. Each particle in the population moves according to a Metropolis-Hastings step, but this is not an ABC-MCMC scheme in that the algorithm works with a population at all times, and it is not an ABC-SMC scheme in that there is no weighting and no resampling.
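
As a caricature of such a population scheme (and only that: the actual algorithm adapts its tolerance schedule from the particles, while I use a plain geometric cooling below), here is an R sketch where each of N particles takes an ABC Metropolis-Hastings step with an independent normal proposal fitted to the current population, under a flat prior; rprior, simulate and dist are placeholders.

## population-based ABC with a cooling tolerance (caricature, not the paper's
## algorithm): N particles, independent normal proposal fitted to the current
## population, flat prior, geometric cooling of the tolerance
pop_abc <- function(N, steps, rprior, simulate, dist, cool = 0.9) {
  theta <- replicate(N, rprior())            # scalar parameter for simplicity
  d <- sapply(theta, function(th) dist(simulate(th)))
  eps <- quantile(d, 0.9)                    # initial tolerance from the population
  for (s in 1:steps) {
    m <- mean(theta)
    sdev <- 2 * sd(theta)                    # proposal parameters from the population
    for (i in 1:N) {
      prop <- rnorm(1, m, sdev)
      dprop <- dist(simulate(prop))
      # ABC-MH ratio with flat prior and independence proposal: q(current)/q(proposal)
      ratio <- dnorm(theta[i], m, sdev) / dnorm(prop, m, sdev)
      if (dprop < eps && runif(1) < ratio) {
        theta[i] <- prop
        d[i] <- dprop
      }
    }
    eps <- cool * eps                        # deterministic cooling of the tolerance
  }
  list(theta = theta, d = d, eps = eps)
}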

Maybe I can add two remarks about the conclusion: the authors do not seem aware of other works using penalties other than the 0-1 kernel, but those abound, see e.g. the discussion paper of Fearnhead and Prangle, or Ratmann et al. The other missing connection is about adaptive tolerance construction, which is also found in the literature, see e.g. Doucet et al. or Drovandi and Pettitt.

likelihood-free parallel tempering

Posted in Statistics, University life on August 20, 2011 by xi'an

Meïli Baragatti, Agnès Grimaud, and Denys Pommeret posted an ABC paper on arXiv entitled Likelihood-free parallel tempering. While most ABC methods essentially are tempering methods, in that the tolerance level acts like an energy level, this paper uses parallel chains at various tolerance levels, with an exchange mechanism derived from Geyer and Thompson (1995, JASA). As with regular ABC-MCMC, the acceptance probability is such that the likelihood need not be computed. On the mixture example of Sisson et al. (2007, PNAS) and on the tuberculosis example of Tanaka et al. (2006, Genetics), the authors report better performances than ABC-PMC, ABC-MCMC and ABC. (In a bimodal toy example, ABC-PMC does not identify a second mode, which should not be the case with a large enough initial tolerance and a small enough tempering decrease step.) The paper introduces a sequence of temperatures in addition to a sequence of tolerances, and it is only through an example that I understood the (unusual) role of the temperatures as scale factors in the random walk proposal. It seems to me that an annealing step should be added, as the chains with larger tolerances are less interesting as time goes on.
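
To fix ideas, here is an R sketch of one sweep of such a likelihood-free parallel tempering scheme (my reading of the idea, not the authors' code), with 0-1 kernels, a flat prior, decreasing tolerances eps[1] > … > eps[K] and per-chain random-walk scales standing in for the temperatures: the nice feature is that a swap between adjacent levels is accepted exactly when the hotter chain's current distance also passes the colder tolerance.

## one sweep of likelihood-free parallel tempering (sketch): K chains at
## decreasing tolerances, within-chain ABC-MCMC moves under a flat prior,
## then an exchange attempt between two adjacent tolerance levels
pt_abc_sweep <- function(theta, d, eps, scales, simulate, dist) {
  K <- length(theta)
  for (k in 1:K) {                       # within-chain moves
    prop <- theta[k] + rnorm(1, sd = scales[k])
    dprop <- dist(simulate(prop))
    if (dprop < eps[k]) {                # flat prior + symmetric proposal
      theta[k] <- prop
      d[k] <- dprop
    }
  }
  k <- sample(1:(K - 1), 1)              # exchange attempt between levels k, k+1
  if (d[k] < eps[k + 1]) {               # hotter state also fits the tighter tolerance
    theta[c(k, k + 1)] <- theta[c(k + 1, k)]
    d[c(k, k + 1)] <- d[c(k + 1, k)]
  }
  list(theta = theta, d = d)
}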

P.S. Scott Sisson just signaled on his Twitter account the publication of several papers using ABC in monkey evolution, as well as a fourth paper by Wegman et al. estimating the size of the initial population of American settlers, about 13,000 years ago, to be around 100, all using standard ABC model choice techniques. Scott also pointed out a conference to be held in Bristol next April 16-19.

Inference in epidemic models w/o likelihoods

Posted in Statistics on July 21, 2010 by xi'an

“We discuss situations in which we think simulation-based inference may be preferable to likelihood-based inference.” McKinley, Cook, and Deardon, IJB

I only became aware last week of the paper Inference in epidemic models without likelihoods by McKinley, Cook and Deardon, published in the International Journal of Biostatistics in 2009. (Anyone can access the paper by becoming a guest, i.e. providing some information.) The paper is essentially a simulation experiment comparing ABC-MCMC and ABC-SMC with regular data-augmentation MCMC. The authors experiment with the tolerance level, the choice of metric and of summary statistics, in an exponential inter-event process model. The setting is interesting, in particular because it applies to highly dangerous diseases like the Ebola fever (for which there is no known treatment and which is responsible for an 88% decline in observed chimpanzee populations since 2003!). The conclusions are overall not highly surprising, namely that repeating simulations of the data points given one simulated parameter does not seem to contribute [much] to an improved approximation of the posterior by the ABC sample, that the tolerance level does not seem to be highly influential, that the choice of the summary statistics and of the calibration factors is important, and that ABC-SMC outperforms ABC-MCMC (MCMC remaining the reference). Slightly more surprising is the conclusion that the choice of the distance/metric influences the outcome. (I failed to find in the paper strong arguments supporting this last claim, lifted from the abstract.)

“There are always doubts that the estimated posterior really does correspond to the true posterior.” McKinley, Cook, and Deardon, IJB

On the “negative” side, this paper is missing the recent literature both on the nonparametric aspects of ABC and on the more adaptive [PMC] features of ABC-SMC, as processed in our Biometrika ABC-PMC paper and in Del Moral, Doucet and Jasra. (Again, this is not a criticism, in that the paper got published in early 2009.) I think that using past simulations to build the proposal and the next tolerance and, why not!, the relevant statistics, would further the improvement brought by sequential methods. (The authors were aware of the correction of Sisson et al., and used instead the version of Toni et al. They also mention the arXived notes of Marc Beaumont, which started our prodding into ABC-PRC.) The comparison experiment is based on a single dataset, with fixed random-walk variances for the MCMC algorithms, while the prior used in the simulation seems to me to be highly peaked around the true value (gamma rates of 0.1). Some of the ABC scenarios do produce estimates that are rather far away from the references given by MCMC; take for instance CABC-MCMC when the tolerance ε is 10 and R is 100.
