## ABC in Roma [R lab #1]

Posted in R, Statistics, University life with tags , , , , on February 29, 2012 by xi'an

Here are the R codes of the first R lab written and ran by Serena Arima in supplement of my lectures. This is quite impressive and helpful to the students, as illustrated by the first example below (using the abc software).

### Example 1: Conjugate model (Normal-Inverse Gamma)
### y1,y2,...,yn ~N(mu,sigma2)
### mu|sigma2 ~ N(0,sigma2), sigma2 ~IG(1/2,1/2)

library(abc)

### Iris data: sepal width of Iris Setosa
data(iris3)
y=iris3[,2,1]

### We want to obtain the following quantities
### &quot;par.sim&quot; &quot;post.mu&quot; &quot;post.sigma2&quot; &quot;stat.obs&quot; &quot;stat.sim&quot;

## STAT.OBS: stat.obs are mean and variance (log scale) of the data
mean(y)
log(var(y))
stat.obs
stat.oss=c(mean(y),log(var(y)))

### PAR.SIM: par.sim simulated values from the prior distribution
n.sim=10000
gdl=1
invsigma.sim=rchisq(n.sim,df=gdl)
sigma.sim=gdl/invsigma.sim

m=3
mu.sim=c()
for(i in 1:length(sigma.sim)){
mu.sim=c(mu.sim,rnorm(1,mean=m,sd=sqrt(((sigma.sim[i])))))}
prior.sim=data.frame(mu.sim,sigma.sim)

### STAT.SIM: for mu and sigma simulated from the prior,
### generate data from the model y ~ N(mu,sigma^2)
stat.simulated=matrix(NA,nrow=length(mu.sim),ncol=2)

for(i in 1:length(mu.sim)){
y.new=rnorm(length(y),mu.sim[i],sqrt(sigma.sim[i]))
stat.simulated[i,1]=mean(y.new)
stat.simulated[i,2]=log(var(y.new))}

### Obtain posterior distribution using ABC
post.value=abc(target=stat.oss, param=prior.sim,
sumstat=data.frame(stat.simulated),tol=0.001,method=&quot;rejection&quot;)
summary(post.value)

mu.post=posterior.values[,1]
sigma.post=posterior.values[,2]

### True values, thanks to conjugancy
post.mean.mu=(length(y)/(length(y)+1))*mean(y)
post.a.sigma=length(y)/2
post.b.sigma=0.5+0.5*sum((y-mean(y))^2)

hist(mu.post,main=&quot;Posterior distribution of mu&quot;)
abline(v=post.mean.mu,col=2,lwd=2)

hist(sigma.post,main=&quot;Posterior distribution of sigma2&quot;)
abline(v=post.b.sigma/(post.a.sigma-1),col=2,lwd=2)

I am having a great time teaching this “ABC in Roma” course, in particular because of the level of interaction and exchange with the participants (after, if not during, the classes).

## A Roma [3]

Posted in pictures, Travel on February 29, 2012 by xi'an

## Elsevier’s letter to mathematicians

Posted in Books, University life on February 29, 2012 by xi'an

Just like Timothy Gower and thousands of others, I have received in the past days an email from Elsevier about their pricing, agressive marketing policy towards University libraries, and support of the Research Work Act. This email was an answer to the growing action of the mathematical community to boycott Elsevier through [not] publishing/refereeing/aediting in Elsevier’s journals, an action started by an earlier blog of Timothy Gower. He has now posted his extensive reaction to this email and, as it perfectly fits mine, I see no point in writing another post (esp. with very limited posting abilities this week in Roma…)

## winter trees (3)

Posted in pictures, Travel, University life with tags , , , , on February 28, 2012 by xi'an

## A Roma [2]

Posted in pictures, Travel on February 28, 2012 by xi'an

## dimension reduction in ABC [a review's review]

Posted in Statistics, University life with tags , , , , , , , , , , , on February 27, 2012 by xi'an

What is very apparent from this study is that there is no single `best’ method of dimension reduction for ABC.

Michael Blum, Matt Nunes, Dennis Prangle and Scott Sisson just posted on arXiv a rather long review of dimension reduction methods in ABC, along with a comparison on three specific models. Given that the choice of the vector of summary statistics is presumably the most important single step in an ABC algorithm and as selecting too large a vector is bound to fall victim of the dimension curse, this is a fairly relevant review! Therein, the authors compare regression adjustments à la Beaumont et al.  (2002), subset selection methods, as in Joyce and Marjoram (2008), and projection techniques, as in Fearnhead and Prangle (2012). They add to this impressive battery of methods the potential use of AIC and BIC. (Last year after ABC in London I reported here on the use of the alternative DIC by Francois and Laval, but the paper is not in the bibliography, I wonder why.) An argument (page 22) for using AIC/BIC is that either provides indirect information about the approximation of p(θ|y) by p(θ|s); this does not seem obvious to me.

The paper also suggests a further regularisation of Beaumont et al.  (2002) by ridge regression, although L1 penalty à la Lasso would be more appropriate in my opinion for removing extraneous summary statistics. (I must acknowledge never being a big fan of ridge regression, esp. in the ad hoc version à la Hoerl and Kennard, i.e. in a non-decision theoretic approach where the hyperparameter λ is derived from the data by X-validation, since it then sounds like a poor man’s Bayes/Stein estimate, just like BIC is a first order approximation to regular Bayes factors… Why pay for the copy when you can afford the original?!) Unsurprisingly, ridge regression does better than plain regression in the comparison experiment when there are many almost collinear summary statistics, but an alternative conclusion could be that regression analysis is not that appropriate with  many summary statistics. Indeed, summary statistics are not quantities of interest but data summarising tools towards a better approximation of the posterior at a given computational cost… (I do not get the final comment, page 36, about the relevance of summary statistics for MCMC or SMC algorithms: the criterion should be the best approximation of p(θ|y) which does not depend on the type of algorithm.)

I find it quite exciting to see the development of a new range of ABC papers like this review dedicated to a better derivation of summary statistics in ABC, each with different perspectives and desideratas, as it will help us to understand where ABC works and where it fails, and how we could get beyond ABC…

## A Roma

Posted in pictures, R, Statistics, Travel, University life with tags , , , , on February 26, 2012 by xi'an

Today, I am going to Rome for a week, teaching my PhD course on ABC I first gave in Paris. The course takes place in La Sapienza Università di Roma, from Monday till Thursday. There will be an R lab in addition to the lectures. (I have no further item of information at the moment.) The slides have been corrected from some typos and reposted on slideshare.