## ABC+EL=no D(ata)

Posted in Books, pictures, R, Statistics, University life on May 28, 2012 by xi'an

It took us a loooong while [for various and uninteresting reasons] but we finally completed a paper on ABC using empirical likelihood (EL), which started with me listening to Brunero Liseo’s tutorial at O’Bayes-2011 in Shanghai… Brunero mentioned empirical likelihood as a semi-parametric technique without much of a Bayesian connection and this got me thinking of a possible recycling within ABC. I won’t get into the details of empirical likelihood, referring to Art Owen’s book “Empirical Likelihood” for a comprehensive entry. The core idea of empirical likelihood is to use a maximum entropy discrete distribution supported by the data and constrained by estimating equations related to the parameters of interest/of the model. As such, it is a non-parametric approach in the sense that the distribution of the data does not need to be specified, only some of its characteristics. (Econometricians have been quite busy developing this kind of approach over the years, see e.g. Gouriéroux and Monfort’s Simulation-Based Econometric Methods.) However, this empirical likelihood technique can also be seen as a convergent approximation to the likelihood and hence exploited in cases when the exact likelihood cannot be derived, for instance as a substitute for the exact likelihood in Bayes’ formula. Here is, as an illustration, a comparison of a true normal-normal posterior with a sample of 10³ points simulated using the empirical likelihood based on the moment constraint.
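As a reminder of what this looks like on paper, here is the standard definition of the empirical likelihood, in notation that is assumed for this post rather than taken from the paper: $h$ denotes the estimating function, with $\mathbb{E}_F[h(X,\theta)]=0$ under the true distribution $F$, and $p_1,\dots,p_n$ are the weights put on the observations.

$$
L_{el}(\theta\mid x_1,\dots,x_n)=\max_{p_1,\dots,p_n}\ \prod_{i=1}^n p_i
\quad\text{subject to}\quad
p_i\ge 0,\qquad \sum_{i=1}^n p_i=1,\qquad \sum_{i=1}^n p_i\,h(x_i,\theta)=0 .
$$

In the simplest case of a scalar mean constraint, $h(x,\theta)=x-\theta$, the maximising weights are

$$
p_i=\frac{1}{n\,\{1+\lambda\,(x_i-\theta)\}},
\qquad\text{where } \lambda \text{ solves}\qquad
\sum_{i=1}^n \frac{x_i-\theta}{1+\lambda\,(x_i-\theta)}=0 ,
$$

and the empirical likelihood is zero by convention when $\theta$ falls outside the range of the data, since no set of weights can then satisfy the constraint.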

The paper we wrote with Kerrie Mengersen and Pierre Pudlo thus examines the consequences of using an empirical likelihood in ABC contexts. Although we called the derived algorithm ABCel, it differs from genuine ABC algorithms in that it does not simulate pseudo-data. Hence the title of this post. (The title of the paper is “Approximate Bayesian computation via empirical likelihood”. It should be arXived by the time the post appears: “Your article is scheduled to be announced at Mon, 28 May 2012 00:00:00 GMT”.) We had indeed started looking at a simulated data version, but it performed rather poorly, and we thus opted for an importance sampling version where the parameters are simulated from an importance distribution (e.g., the prior) and then weighted by the empirical likelihood (times a regular importance factor if the importance distribution is not the prior). The above graph is an illustration in a toy example.
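The importance sampling version described above can be sketched in a few lines of Python. This is only a minimal illustration and not the code used in the paper: it assumes a scalar parameter with the mean constraint E[X − θ] = 0, it takes the prior as the importance distribution (so the weight reduces to the empirical likelihood), and the names `log_el_mean` and `abcel` as well as the N(0, 2²) prior in the demo are made up for the occasion.

```python
import math
import random


def log_el_mean(x, theta, iters=60):
    """Log empirical likelihood of theta under the constraint E[X - theta] = 0."""
    z = [xi - theta for xi in x]
    zmin, zmax = min(z), max(z)
    if not (zmin < 0.0 < zmax):
        # theta outside the range of the data: no valid weights exist
        return -math.inf
    # The Lagrange multiplier lam solves sum z_i / (1 + lam * z_i) = 0.
    # That sum is strictly decreasing in lam on (-1/zmax, -1/zmin),
    # so plain bisection is enough.
    lo, hi = -1.0 / zmax, -1.0 / zmin
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g = sum(zi / (1.0 + mid * zi) for zi in z)
        if g > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    n = len(x)
    # log prod p_i with p_i = 1 / (n * (1 + lam * z_i))
    return -sum(math.log(n * (1.0 + lam * zi)) for zi in z)


def abcel(x, prior_sampler, n_draws=1000, rng=None):
    """Draw parameters from the prior and weight them by the empirical likelihood.

    Returns the draws and their normalised importance weights.
    No pseudo-data is simulated at any point.
    """
    rng = rng or random.Random(0)
    thetas = [prior_sampler(rng) for _ in range(n_draws)]
    logw = [log_el_mean(x, t) for t in thetas]
    top = max(logw)
    if top == -math.inf:
        raise ValueError("no prior draw fell within the range of the data")
    w = [math.exp(lw - top) for lw in logw]  # stabilised exponentiation
    total = sum(w)
    return thetas, [wi / total for wi in w]


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.gauss(1.0, 1.0) for _ in range(30)]  # toy normal sample
    draws, weights = abcel(data, lambda r: r.gauss(0.0, 2.0), rng=rng)
    post_mean = sum(t * w for t, w in zip(draws, weights))
    print("sample mean:", sum(data) / len(data))
    print("ABCel posterior mean:", post_mean)
```

With an importance distribution other than the prior, each weight would simply be multiplied by the usual prior-over-proposal density ratio before normalisation.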

The difficulty with the method is in connecting the parameters (of interest/of the assumed distribution) with moments of the (iid) data. While this operates rather straightforwardly for quantile distributions, it is less clear for dynamic models like ARCH and GARCH, where we have to reconstruct the underlying iid process. (There, ABCel clearly improves upon ABC for the GARCH(1,1) model but remains less informative than a regular MCMC analysis. Incidentally, this study led to my earlier post on the unreliable garch() function in the tseries package!) And it is even harder for population genetic models, where parameters like divergence dates, effective population sizes, mutation rates, etc., cannot be expressed as moments of the distribution of the sample at a given locus. In particular, the datapoints are not iid. Pierre Pudlo then had the brilliant idea to resort instead to a composite likelihood, approximating the intra-locus likelihood by a product of pairwise likelihoods over all pairs of genes in the sample at a given locus. Indeed, in Kingman’s coalescent theory, the pairwise likelihoods can be expressed in closed form, hence we can derive the pairwise composite scores. The comparison with optimal ABC outcomes shows an improvement brought by ABCel in the approximation, at an overall computing cost that is negligible against ABC (i.e., it takes minutes to produce the ABCel outcome, compared with hours for ABC).
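To fix ideas on the composite likelihood step, here is the generic form of a pairwise composite likelihood and of its score, in notation assumed for this post: $x_k=(x_{k,1},\dots,x_{k,n})$ stands for the genes sampled at locus $k$ and $\ell_2$ for the pairwise likelihood. The closed-form expression of $\ell_2$ under the coalescent is not reproduced here.

$$
\ell_c(x_k\mid\theta)=\prod_{i<j}\ell_2(x_{k,i},x_{k,j}\mid\theta),
\qquad
h(x_k,\theta)=\nabla_\theta\log\ell_c(x_k\mid\theta)=\sum_{i<j}\nabla_\theta\log\ell_2(x_{k,i},x_{k,j}\mid\theta).
$$

The composite score $h(x_k,\theta)$ can then presumably play the role of the estimating function in the empirical likelihood constraint, $\sum_k p_k\,h(x_k,\theta)=0$, with the loci (rather than the non-iid genes within a locus) acting as the units carrying the weights $p_k$.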

We are now looking for extensions and improvements of ABCel, both at the methodological and at the genetic levels, and we would of course welcome any comment at this stage. The paper has been submitted to PNAS, as we hope it should appeal to the ABC community at large, i.e. beyond statisticians…

## Tutorial in Shanghai

Posted in Statistics, Travel, University life on June 11, 2011 by xi'an

I reached Shanghai last night after a relaxing night trip in an almost empty plane (and a painful taxi ride that seemed about as long!), ate an enjoyable dinner with Larry and Linda, but could not connect to the ’Og’s admin site, despite having a great hotel connection… I am thus posting through email the link to my O’Bayes 2011 tutorial slides for today, hoping it will come out fine! If not, apologies! (It worked!)

## O’Bayes 2011 program

Posted in Statistics, Travel, University life on May 11, 2011 by xi'an

The program of the O’Bayes 2011 meeting in Shanghai is now on-line. The first day, June 11, is a tutorial session given by the organisers, as usual, and I will give a short talk on computational methods (Monte Carlo, MCMC and ABC) for Bayesian inference. The regular talks start on Sunday, June 12, and close on Wednesday, June 15, with two afternoon breaks on June 13 and 14, including a cruise on the Huangpu river. The venue for the workshop is on Putuo Campus, ECNU. The early registration is extended till May 15 and the final deadline is May 31. Again, this is a good opportunity to meet new Bayesians from China and to explore Shanghai on the side… (I will unfortunately do little of this, being due in Paris on the 13th, and so spending only 30 hours in China, and thus missing a large chunk of the workshop.)

## Bayesian conference in 上海 [O'Bayes 2011]

Posted in Statistics, Travel, University life on April 29, 2011 by xi'an

Just to remind the ’Og’s readers that the 2011 International Workshop on Objective Bayes Methodology will take place on June 11-15, 2011, at East China Normal University (Putuo Campus), Shanghai (上海), China. The deadline for early registration has been extended till May 10. This should be a wonderful opportunity to exchange ideas about the latest in objective Bayes methodology, to meet Chinese Bayesians and to discover that part of China. (Actually, there is a possibility of an excursion to Guilin, Xi’an and Beijing after the meeting!) Here is a picture of my visa, received today.

## O’Bayes [20]11 in Shanghai

Posted in Statistics on March 16, 2011 by xi'an

Following the O’Bayes [20]09 meeting in Philly, the next O’Bayes meeting will take place in Shanghai, June 11-15, at the East China Normal University to be precise. Information can be found on the website of the conference. Here is the announcement:

The principal objectives of OBayes2011 are to facilitate the exchange of recent research developments in objective Bayes methodology, to provide opportunities for new researchers to shine, and to establish new collaborations and partnerships that will channel efforts into pending problems and open new directions for further study. O-Bayes2011 will also serve to further crystallize objective Bayes methodology as an established area for statistical research.
The workshop will consist of a tutorial session, a series of invited talks followed by discussion and a poster session dedicated to contributed work.

The speakers are already listed on the website. (I will most likely attend the first two days of the conference and give an MCMC tutorial there, but unfortunately cannot stay for the whole duration of this workshop…) Note that the registration fee is rather high (\$300 for regular participants and \$150 for students) but that it includes full catering (which may be a debatable benefit if you want to sample the local delicacies!) and that the hotel is quite cheap if I understand correctly (\$45 per night). There is also a Jim Berger travel grant for graduate students.