## ABC+EL=no D(ata)

Posted in Books, pictures, R, Statistics, University life with tags , , , , , , , , , , , , on May 28, 2012 by xi'an

It took us a loooong while [for various and uninteresting reasons] but we finally ended up completing a paper on ABC using empirical likelihood (EL) that was started by me listening to Brunero Liseo’s tutorial in O’Bayes-2011 in Shanghai… Brunero mentioned empirical likelihood as a semi-parametric technique w/o much Bayesian connections and this got me thinking of a possible recycling within ABC. I won’t get into the details of empirical likelihood, referring to Art Owen’s book “Empirical Likelihood” for a comprehensive entry, The core idea of empirical likelihood is to use a maximum entropy discrete distribution supported by the data and constrained by estimating equations related with the parameters of interest/of the model. As such, it is a non-parametric approach in the sense that the distribution of the data does not need to be specified, only some of its characteristics. Econometricians have been quite busy at developing this kind of approach over the years, see e.g. Gouriéroux and Monfort’s  Simulation-Based Econometric Methods). However, this empirical likelihood technique can also be seen as a convergent approximation to the likelihood and hence exploited in cases when the exact likelihood cannot be derived. For instance, as a substitute to the exact likelihood in Bayes’ formula. Here is for instance a comparison of a true normal-normal posterior with a sample of 10³ points simulated using the empirical likelihood based on the moment constraint.

The paper we wrote with Kerrie Mengersen and Pierre Pudlo thus examines the consequences of using an empirical likelihood in ABC contexts. Although we called the derived algorithm ABCel, it differs from genuine ABC algorithms in that it does not simulate pseudo-data. Hence the title of this post. (The title of the paper is “Approximate Bayesian computation via empirical likelihood“. It should be arXived by the time the post appears: “Your article is scheduled to be announced at Mon, 28 May 2012 00:00:00 GMT“.) We had indeed started looking at a simulated data version, but it was rather poor, and we thus opted for an importance sampling version where the parameters are simulated from an importance distribution (e.g., the prior) and then weighted by the empirical likelihood (times a regular importance factor if the importance distribution is not the prior). The above graph is an illustration in a toy example.

The difficulty with the method is in connecting the parameters (of interest/of the assumed distribution) with moments of the (iid) data. While this operates rather straightforwardly for quantile distributions, it is less clear for dynamic models like ARCH and GARCH, where we have to reconstruct the underlying iid process. (Where ABCel clearly improves upon ABC for the GARCH(1,1) model but remains less informative than a regular MCMC analysis. Incidentally, this study led to my earlier post on the unreliable garch() function in the tseries package!) And it is even harder for population genetic models, where parameters like divergence dates, effective population sizes, mutation rates, &tc., cannot be expressed as moments of the distribution of the sample at a given locus. In particular, the datapoints are not iid. Pierre Pudlo then had the brilliant idea to resort instead to a composite likelihood, approximating the intra-locus likelihood by a product of pairwise likelihoods over all pairs of genes in the sample at a given locus. Indeed, in Kingman’s coalescent theory, the pairwise likelihoods can be expressed in closed form, hence we can derive the pairwise composite scores. The comparison with optimal ABC outcomes shows an improvement brought by ABCel in the approximation, at an overall computing cost that is negligible against ABC (i.e., it takes minutes to produce the ABCel outcome, compared with hours for ABC.)

We are now looking for extensions and improvements of ABCel, both at the methodological and at the genetic levels, and we would of course welcome any comment at this stage. The paper has been submitted to PNAS, as we hope it should appeal to the ABC community at large, i.e. beyond statisticians…

## garch() uncertainty

Posted in R, Statistics, University life with tags , , , on May 17, 2012 by xi'an

As part of an on-going paper with Kerrie Mengersen and Pierre Pudlo, we are using a GARCH(1,1) model as a target. Thus, the model is of the form

$y_t=\sigma_t \epsilon_t \qquad \sigma^2_t = \alpha_0 + \alpha_1 y_{t-1}^2 + \beta_1 \sigma_{t-1}^2$

which is a somehow puzzling object: the latent (variance) part is deterministic and can be reconstructed exactly given the series and the parameters. However, estimation is not such an easy task and using the garch() function in the tseries package leads to puzzling results! Indeed, simulating data shows some high variability of the procedure against starting values:

genedata=function(para,nobs){

pata=epst=sigt=rnorm(nobs)
sigt[1]=sqrt(para[1])
pata[1]=epst[1]*sigt[1]
for (t in 2:nobs){
sigt[t]=sqrt(para[1]+para[2]*pata[t-1]^2+para[3]*sigt[t-1]^2)
pata[t]=epst[t]*sigt[t]
}
list(pata=pata,sigt=sigt,epst=epst)
}
> x = genedata(c(1, 0.3, 0.2),1000)$pata > garch(x,trace=FALSE) Call: garch(x = x, trace = FALSE) Coefficient(s): a0 a1 b1 4.362e+00 1.976e-01 6.805e-14 > garch(x,trace=FALSE,start=c(1,.3,.2)) Call: garch(x = x, trace = FALSE, start = c(1, 0.3, 0.2)) Coefficient(s): a0 a1 b1 0.8025 0.2592 0.3255 > simgarch=genedata(c(1, 0.2, 0.7),1000) Call: garch(x = simgarch$pat, trace = FALSE)

Coefficient(s):
a0         a1         b1
8.044e+00  1.826e-01  4.051e-14
> garch(simgarch$pat,trace=FALSE,star=c(1, 0.2, 0.7)) Call: garch(x = simgarch$pat, trace = FALSE, star = c(1, 0.2, 0.7))

Coefficient(s):
a0      a1      b1
1.1814  0.2079  0.6590

The above code clearly shows the huge impact of the starting value on the final estimate….