## ABC with composite score functions

Posted in Books, pictures, Statistics, University life on December 12, 2013 by xi'an

My friends Erlis Ruli, Nicola Sartori and Laura Ventura from Università degli Studi di Padova have just arXived a new paper entitled Approximate Bayesian Computation with composite score functions. While the paper provides a survey of composite likelihood methods, its core idea is to use the score function of the composite likelihood as the summary statistic,

$\dfrac{\partial\,c\ell(\theta;y)}{\partial\,\theta},$

when evaluated at the maximum composite likelihood estimator for the observed data. In the specific (but unrealistic) case of an exponential family, an ABC based on the score is asymptotically (i.e., as the tolerance ε goes to zero) exact. The choice of the composite likelihood thus induces a natural summary statistic and, as in our empirical likelihood paper, where we also use the score of a composite likelihood, the composite likelihoods available for computation are usually but a few, thus leading to an almost automated choice of a summary statistic.
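To fix ideas, here is a minimal sketch (mine, not the authors' code) of score-based ABC in a toy exponential model, where the full likelihood score stands in for a composite score; the prior, tolerance and sample sizes are all illustrative choices of my own:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y_i ~ Exponential(rate=theta); the full likelihood is
# tractable here, so its score plays the role of the composite score.
def score(theta, y):
    # derivative in theta of the log-likelihood of an iid Exp(theta) sample
    return len(y) / theta - y.sum()

y_obs = rng.exponential(scale=1 / 2.0, size=100)  # true theta = 2
theta_hat = 1 / y_obs.mean()  # MLE, playing the role of the MCLE

def abc_score(n_sim=20_000, eps=2.0):
    accepted = []
    for _ in range(n_sim):
        theta = rng.uniform(0.1, 10.0)  # prior draw
        z = rng.exponential(scale=1 / theta, size=len(y_obs))
        # summary statistic: score of the simulated data evaluated at the
        # observed-data estimate; for the observed data itself this score
        # is zero by construction, so the distance is just its magnitude
        if abs(score(theta_hat, z)) < eps:
            accepted.append(theta)
    return np.array(accepted)

post = abc_score()
```

The appeal is that the summary statistic comes for free with the composite likelihood, with no ad hoc selection step.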

An interesting (and common) feature of most examples found in this paper is that comparisons are made between ABC using the (truly) sufficient statistic and ABC based on the pairwise score function, which essentially relies on the very same statistics. So the difference, when there is a difference, pertains to the choice of a different combination of the summary statistics or, somewhat equivalently, to the choice of a different distance function. One of the examples starts from our MA(2) toy example in the 2012 survey in Statistics and Computing. The composite likelihood is then based on the consecutive triplet marginal densities. As shown by the picture below, the composite version improves to some extent upon the original ABC solution using three autocorrelations.
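For the record, a bare-bones version of the baseline MA(2) experiment, i.e. plain ABC with the first sample autocovariances as summaries, could look as follows (an illustrative sketch, not the code behind the picture; prior, sample size and tolerance are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def ma2(theta1, theta2, n, rng):
    # y_t = e_t + theta1 * e_{t-1} + theta2 * e_{t-2}
    e = rng.normal(size=n + 2)
    return e[2:] + theta1 * e[1:-1] + theta2 * e[:-2]

def autocov(x, lags=(0, 1, 2)):
    # sample autocovariances at the given lags
    x = x - x.mean()
    n = len(x)
    return np.array([(x[: n - k] * x[k:]).sum() / n for k in lags])

n = 200
y_obs = ma2(0.6, 0.2, n, rng)
s_obs = autocov(y_obs)

def abc_ma2(n_sim=20_000, eps=0.5):
    out = []
    for _ in range(n_sim):
        # uniform prior over the MA(2) invertibility triangle
        t1 = rng.uniform(-2, 2)
        t2 = rng.uniform(-1, 1)
        if not (t2 + t1 > -1 and t2 - t1 > -1):
            continue
        z = ma2(t1, t2, n, rng)
        # unscaled L1 distance between summaries, as in the paper
        if np.abs(autocov(z) - s_obs).sum() < eps:
            out.append((t1, t2))
    return np.array(out)

samples = abc_ma2()
```

The composite version would replace `autocov` by the score of the triplet-based composite likelihood, keeping everything else unchanged.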

One refinement I would suggest for the proposed method concerns the distance used in the paper, namely the sum of the absolute differences between the statistics. Indeed, this sum is not scaled at all, for either regular ABC or composite ABC, while the composite likelihood perspective provides, in addition to the score, a natural metric through the matrix A(θ) [defined on page 12]. I would thus suggest comparing the performances of the methods using this rescaling instead since, in my opinion and in contrast with a remark on page 13, it is relevant in some (many?) settings where the amount of information brought by the composite model varies widely from one parameter to the next.
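In code, this rescaling amounts to trading the unscaled L¹ distance for a Mahalanobis-type distance, with a covariance of the summaries (estimated here from pilot simulations, as a mere stand-in for the matrix A(θ) of the paper) providing the scaling:

```python
import numpy as np

rng = np.random.default_rng(2)

def mahalanobis_distance(s_sim, s_obs, cov):
    # quadratic form d' cov^{-1} d, invariant to rescaling of the summaries
    d = s_sim - s_obs
    return float(d @ np.linalg.solve(cov, d))

# toy summaries on wildly different scales, mimicking the situation where
# the information varies a lot from one component to the next
pilot = rng.normal(size=(500, 3)) * np.array([1.0, 0.1, 10.0])
cov_hat = np.cov(pilot, rowvar=False)  # estimated from pilot simulations

s_obs = pilot.mean(axis=0)
s_sim = pilot[0]
print(mahalanobis_distance(s_sim, s_obs, cov_hat))
```

With the raw L¹ distance, the third component would dominate the comparison by virtue of its scale alone; the quadratic form puts the components back on a common footing.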

## JSM2014, Boston

Posted in pictures, Statistics, Travel, University life on December 3, 2013 by xi'an

I submitted my abstract for JSM2014, just in time! Thanks to Veronika Rockova, now at The Wharton School, for organising this IMS session on Advances in Model Selection (Wednesday, 8/6/2014, 8:30)!

Title: Automated variable selection for ABC algorithms

Abstract: We discuss here recent advances made in the selection of summaries for approximate Bayesian computation (ABC). In particular, we emphasize the appeal of using machine learning tools such as random forests to build, in an automated manner, summary statistics of minimum dimension. Conditional on sufficient progress being made in this direction, we will also discuss why and how ABC methods have to be adapted when analysing large molecular datasets, and will present some progress concerning Single Nucleotide Polymorphism (SNP) data.

Key words: Bayesian computation, ABC, SNP, model selection

## Au Luxembourg

Posted in pictures, Statistics, Travel, University life on December 3, 2013 by xi'an

In a “crazy travelling week” (dixit my daughter), I gave a talk at the IYS 2013 conference organised by Stephen Senn (formerly at Glasgow) and colleagues in the city of Luxembourg, Grand-Duché de Luxembourg. I very much enjoyed the morning train trip there, as it was a misty morning, with the sun rising over the frosted-white countryside. (I cannot say much about the city of Luxembourg itself, though, as I only walked the kilometre from the station to the conference hotel and the same way back. There was a huge gap in the plateau due to a river running through the middle, which would have been a nice place to run, I presume…)

One of the few talks I attended there was about an econometric model with instrumental variables. In general, and this dates back to my student years at ENSAE, I do not get the motivation for the distinction between endogenous and exogenous variables in econometric models. Especially in non-parametric models since, if we do not want to make parametric assumptions, we have difficulties making correlation hypotheses instead… My bent would be to parametrise everything, under the suspicion that everything is correlated with everything else. The instrumental variables econometricians seem so fond of appear to me like magical beings, since we have to know they are instrumental. And they seem to allow one always to come back to a linear setting, by eliminating the non-linear parts. Sounds like a “more for less” free-lunch deal. (Any pointer would be appreciated.) The speaker there actually acknowledged (verbatim) that they are indeed magical and that they cannot be justified by mathematics or statistics. A voodoo part of econometrics, then?!

A second talk that left me perplexed was about a generalised finite mixture model. The model sounded like a mixture along time of individuals, i.e., a sort of clustering of longitudinal data. It looked like it should be easier to estimate than usual mixtures of regressions, because an individual contributes to the same regression line for all the times when it is observed. The talk was uninspiring as it missed connections to EM and to Bayesian solutions, focussing instead on a gradient method that sounded inappropriate for a multimodal likelihood. (Funnily enough, the choice of the number of regressions was made by BIC.)

## Maximum likelihood vs. likelihood-free quantum system identification in the atom maser

Posted in Books, Statistics, University life on December 2, 2013 by xi'an

This paper (arXived a few days ago) compares maximum likelihood with different ABC approximations in a quantum physics setting, for an atom maser model that essentially boils down to a hidden Markov model. (I mostly blanked out of the physics explanations so cannot say I understand the model at all.) While the authors (from the University of Nottingham, hence Robin’s statue above…) do not consider the recent corpus of work by Ajay Jasra and coauthors (some of which was discussed on the ‘Og), they get interesting findings for an equally interesting model. First, when comparing the Fisher information on the sole parameter of the model, the “Rabi angle” φ, for two different sets of statistics, one goes to zero at a certain value of the parameter, while the (fully informative) other is maximal there (Figure 6). This is quite intriguing, esp. given the shape of the information in the former case, which reminds me of (my) inverse normal distributions. Second, the authors compare different collections of summary statistics in terms of ABC distributions against the likelihood function. While most bring much more uncertainty into the analysis, the whole collection recovers the range and shape of the likelihood function, which is nice. Third, they also use a Kolmogorov-Smirnov distance to run their ABC, which is enticing, except that I cannot fathom from the paper when one would have enough of a sample (conditional on a parameter value) to rely on what is essentially an estimate of the sampling distribution. This seems to contradict the fact that they only use seven summary statistics. Or it may be that the “statistic” of waiting times happens to be a vector, in which case a Kolmogorov-Smirnov distance can indeed be adopted for the distance… The fact that the grouped seven-dimensional summary statistic provides the best ABC fit is somewhat of a surprise, considering the problem enjoys a single parameter.
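For illustration, here is what running ABC with a Kolmogorov-Smirnov distance between observed and simulated samples looks like, in a toy model with exponential waiting times (my own sketch, unrelated to the paper’s atom maser model; the prior, tolerance and sample sizes are arbitrary):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(3)

# observed "waiting times": a whole sample per dataset, so that an
# empirical distribution (hence a KS distance) is available
y_obs = rng.exponential(scale=1 / 1.5, size=500)  # true rate = 1.5

def abc_ks(n_sim=5_000, eps=0.08):
    kept = []
    for _ in range(n_sim):
        rate = rng.uniform(0.1, 5.0)  # prior draw
        z = rng.exponential(scale=1 / rate, size=len(y_obs))
        # two-sample KS statistic between observed and simulated samples
        if ks_2samp(y_obs, z).statistic < eps:
            kept.append(rate)
    return np.array(kept)

post = abc_ks()
```

The point made above is visible here: the KS distance only makes sense because each simulated “statistic” is itself a sample of 500 waiting times, not a handful of scalars.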

“However, in practice, it is often difficult to find an s(.) which is sufficient.”

A point that irks me in most ABC papers is finding quotes like the one above since, in most models, it is easy to show that there cannot be a non-trivial sufficient statistic! As soon as one leaves the exponential family cocoon, one is doomed in this respect!!!