## ABC in Roma

Posted in pictures, Running, Statistics, Travel, University life with tags , , , , , on May 31, 2013 by xi'an

Back in Roma after my ABC week last year, for the ABC in Rome workshop! The attendance is quite in par with the sizes of the previous audiences and the program is close to my own interests—unsurprisingly since I took part to the scientific committee! Hence talks on papers that have already been discussed on the ‘Og for most of them:

(It almost sounds as if I had written the program by myself, but this is not the case, promised!) So from my own personal and egoistic perspective, the poster session was more surprising, with 18 posters ranging from theoretical extensions to applications. I actually wished it had lasted a wee bit longer as I did not have time to listen to all presenters before they vanished to the dinning room upstairs, but I appreciated very much the few exchanges I had. A fully enjoyable meeting then!!! I am definitely looking forward the next edition of ABC in [pick your capital], ABC in Sydney (2014) and ABC in Helsinki (2015) being already in the planning…

Here are my slides, just slightly updated from the previous version:

## ABC in Roma [R lab #2]

Posted in R, Statistics, University life with tags , , , , on March 2, 2012 by xi'an

Here are the R codes of the second R lab organised by Serena Arima in supplement of my lectures (now completed!). This morning I covered ABC model choice and the following example is the benchmark used in the course (and in the paper) about the impact of summary statistics. (Warning! It takes a while to run…)

```n.iter=10000
n=c(10,100,1000)
n.sims=100
prob.m1=matrix(0,nrow=n.sims,ncol=length(n))
prob.m2=prob.m1

### Simulation from Normal model
for(sims in 1:n.sims){

## True data generation from the Normal distribution and summary statistics
for(i in 1:length(n)){
y.true=rnorm(n[i],0,1)
stat.true=c(mean(y.true),median(y.true),var(y.true))

## ABC algorithm
# Sample from the prior
mu=rnorm(n.iter,0,2)
dist.m1=rep(NA,n.iter)
dist.m2=rep(NA,n.iter)

for(j in 1:n.iter){
# Data generation under both models
# Normal model
y.m1=rnorm(n[i],mu[j])
stat.m1=c(mean(y.m1),median(y.m1),var(y.m1))
# computing the distance
dist.m1[j]=sum((stat.m1-stat.true)^2)
# Laplace model
y.m2=mu[j]+sample(c(-1,1),n[i],rep=TRUE)*rexp(n[i],rate=sqrt(2))
stat.m2=c(mean(y.m2),median(y.m2),var(y.m2))
# computing the distance
dist.m2[j]=sum((stat.m2-stat.true)^2)
}

# select epsilon as 1% distance quantile
epsilon=quantile(c(dist.m1,dist.m2),prob=0.01)

# compute the posterior probability that data come from
# the model
prob.m1[sims,i]=sum(dist.m1<epsilon)/(2*n.iter*0.01)
}}

### Simulation from the Laplace model
for(sims in 1:n.sims){

## True data generation from the Laplace distribution and summary statistics
for(i in 1:length(n)){
y.true=sample(c(-1,1),n[i],rep=TRUE)*rexp(n[i],rate=sqrt(2))
stat.true=c(mean(y.true),median(y.true),var(y.true))

## ABC algorithm
# Sample from the prior
mu=rnorm(n.iter,0,2)
dist.m1=rep(NA,n.iter)
dist.m2=rep(NA,n.iter)

for(j in 1:n.iter){
# Data generation under both models
# Normal model
y.m1=rnorm(n[i],mu[j])
stat.m1=c(mean(y.m1),median(y.m1),var(y.m1))
# computing the distance
dist.m1[j]=sum((stat.m1-stat.true)^2)
# Laplace model
y.m2=mu[j]+sample(c(-1,1),n[i],rep=TRUE)*rexp(n[i],rate=sqrt(2))
stat.m2=c(mean(y.m2),median(y.m2),var(y.m2))
# computing the distance
dist.m2[j]=sum((stat.m2-stat.true)^2)
}

# select epsilon as 1% distance quantile
epsilon=quantile(c(dist.m1,dist.m2),prob=0.01)

# compute the posterior probability that data come from
# the model
prob.m2[sims,i]=sum(dist.m2<epsilon)/(2*n.iter*0.01)
}
}

# Visualize the results
y.true=sample(c(-1,1),n[i],rep=TRUE)*rexp(n[i],rate=sqrt(2))
stat.true=c(mean(y.true),median(y.true),var(y.true))

## ABC algorithm
# Sample from the prior
mu=rnorm(n.iter,0,2)
dist.m1=rep(NA,n.iter)
dist.m2=rep(NA,n.iter)

for(j in 1:n.iter){
# Data generation under both models
# Normal model
y.m1=rnorm(n[i],mu[j])
stat.m1=c(mean(y.m1),median(y.m1),var(y.m1))
# computing the distance
dist.m1[j]=sum((stat.m1-stat.true)^2)
# Laplace model
y.m2=mu[j]+sample(c(-1,1),n[i],rep=TRUE)*rexp(n[i],rate=sqrt(2))
stat.m2=c(mean(y.m2),median(y.m2),var(y.m2))
# computing the distance
dist.m2[j]=sum((stat.m2-stat.true)^2)
}

# select epsilon as 1% distance quantile
epsilon=quantile(c(dist.m1,dist.m2),prob=0.01)

# compute the posterior probability that data come from
# the model
prob.m2[sims,i]=sum(dist.m2<epsilon)/(2*n.iter*0.01)
}
}

# Visualize the results
y.true=sample(c(-1,1),n[i],rep=TRUE)*rexp(n[i],rate=sqrt(2))
stat.true=c(mean(y.true),median(y.true),var(y.true))

## ABC algorithm
# Sample from the prior
mu=rnorm(n.iter,0,2)
dist.m1=rep(NA,n.iter)
dist.m2=rep(NA,n.iter)

for(j in 1:n.iter){
# Data generation under both models
# Normal model
y.m1=rnorm(n[i],mu[j])
stat.m1=c(mean(y.m1),median(y.m1),var(y.m1))
# computing the distance
dist.m1[j]=sum((stat.m1-stat.true)^2)
# Laplace model
y.m2=mu[j]+sample(c(-1,1),n[i],rep=TRUE)*rexp(n[i],rate=sqrt(2))
stat.m2=c(mean(y.m2),median(y.m2),var(y.m2))
# computing the distance
dist.m2[j]=sum((stat.m2-stat.true)^2)
}

# select epsilon as 1% distance quantile
epsilon=quantile(c(dist.m1,dist.m2),prob=0.01)

# compute the posterior probability that data come from
# the model
prob.m2[sims,i]=sum(dist.m2<epsilon)/(2*n.iter*0.01)
}}

# Visualize the results
boxplot(data.frame(prob.m1[,1],prob.m2[,1]),
names=c("M1","M2"),main="N=10")
boxplot(data.frame(prob.m1[,2],prob.m2[,2]),
names=c("M1","M2"),main="N=100")
boxplot(data.frame(prob.m1[,3],prob.m2[,3]),
names=c("M1","M2"),main="N=1000")
```

Once again, I had a terrific time teaching this “ABC in Roma” course, and not only for the immense side benefit of enjoy Roma in a particularly pleasant weather (for late February).

## A Roma [4]

Posted in pictures, Travel, University life with tags , , , , , , , on March 1, 2012 by xi'an

The above could have been the very last picture taken with my current camera: After taking it, I let go the camera which bounced on the pavement and…disappeared inside an manhole! I then thought it was lost to the cloaca maxima, but still took a look by the manhole and the camera was sitting within reach on a pile of refuse… Even better the camera is still working and seems to have lost a dust particle that was plaguing all my pictures since X’mas… (Btw, I eventually managed to get an Ethernet cable connection! Which implies text in addition to pictures on the ‘Og in the coming days.)

Living within walking distance of La Sapienza means I can take lots of pictures in the streets around the university, enjoying the morning & evening sun… As I wrote yesterday, almost all houses look like palazzi. Except the following one, which sounds more like a remnant of the fascist era. (It was indeed built between 1937 and 1938, and is now the site of the superior council of the magistracy.)

I thus appreciate very much the double opportunity offered by teaching this ABC advanced course at La Sapienza, namely both to reach new students and to enjoy a week of Roman dolce vita…

## ABC in Roma [R lab #1]

Posted in R, Statistics, University life with tags , , , , on February 29, 2012 by xi'an

Here are the R codes of the first R lab written and ran by Serena Arima in supplement of my lectures. This is quite impressive and helpful to the students, as illustrated by the first example below (using the abc software).

```### Example 1: Conjugate model (Normal-Inverse Gamma)
### y1,y2,...,yn ~N(mu,sigma2)
### mu|sigma2 ~ N(0,sigma2), sigma2 ~IG(1/2,1/2)

library(abc)

### Iris data: sepal width of Iris Setosa
data(iris3)
y=iris3[,2,1]

### We want to obtain the following quantities
### &quot;par.sim&quot; &quot;post.mu&quot; &quot;post.sigma2&quot; &quot;stat.obs&quot; &quot;stat.sim&quot;

## STAT.OBS: stat.obs are mean and variance (log scale) of the data
mean(y)
log(var(y))
stat.obs
stat.oss=c(mean(y),log(var(y)))

### PAR.SIM: par.sim simulated values from the prior distribution
n.sim=10000
gdl=1
invsigma.sim=rchisq(n.sim,df=gdl)
sigma.sim=gdl/invsigma.sim

m=3
mu.sim=c()
for(i in 1:length(sigma.sim)){
mu.sim=c(mu.sim,rnorm(1,mean=m,sd=sqrt(((sigma.sim[i])))))}
prior.sim=data.frame(mu.sim,sigma.sim)

### STAT.SIM: for mu and sigma simulated from the prior,
### generate data from the model y ~ N(mu,sigma^2)
stat.simulated=matrix(NA,nrow=length(mu.sim),ncol=2)

for(i in 1:length(mu.sim)){
y.new=rnorm(length(y),mu.sim[i],sqrt(sigma.sim[i]))
stat.simulated[i,1]=mean(y.new)
stat.simulated[i,2]=log(var(y.new))}

### Obtain posterior distribution using ABC
post.value=abc(target=stat.oss, param=prior.sim,
sumstat=data.frame(stat.simulated),tol=0.001,method=&quot;rejection&quot;)
summary(post.value)

mu.post=posterior.values[,1]
sigma.post=posterior.values[,2]

### True values, thanks to conjugancy
post.mean.mu=(length(y)/(length(y)+1))*mean(y)
post.a.sigma=length(y)/2
post.b.sigma=0.5+0.5*sum((y-mean(y))^2)

hist(mu.post,main=&quot;Posterior distribution of mu&quot;)
abline(v=post.mean.mu,col=2,lwd=2)

hist(sigma.post,main=&quot;Posterior distribution of sigma2&quot;)
abline(v=post.b.sigma/(post.a.sigma-1),col=2,lwd=2)
```

I am having a great time teaching this “ABC in Roma” course, in particular because of the level of interaction and exchange with the participants (after, if not during, the classes).

## A Roma

Posted in pictures, R, Statistics, Travel, University life with tags , , , , on February 26, 2012 by xi'an

Today, I am going to Rome for a week, teaching my PhD course on ABC I first gave in Paris. The course takes place in La Sapienza Università di Roma, from Monday till Thursday. There will be an R lab in addition to the lectures. (I have no further item of information at the moment.) The slides have been corrected from some typos and reposted on slideshare.

## bounded normal mean

Posted in R, Statistics, University life with tags , , , , , , , , , on November 25, 2011 by xi'an

A few days ago, one of my students, Jacopo Primavera (from La Sapienza, Roma) presented his “reading the classic” paper, namely the terrific bounded normal mean paper by my friends George Casella and Bill Strawderman (1981, Annals of Statistics). Even though I knew this paper quite well, having read (and studied) it myself many times, starting in 1987 in Purdue with Mary Ellen Bock, it was a pleasure to spend another hour on it, as I came up with new perspectives and new questions. Above are my scribbled notes on the back of the [Epson] beamer documentation. One such interesting question is whether or not it is possible to devise a computer code that would [approximately] produce the support of the least favourable prior for a given bound m (in a reasonable time). Another open question is to find the limiting bounds for which a 2 point, a 3 point, &tc., support prior is the least favourable prior. This was established in Casella and Strawderman for bounds less than 1.08 and for bounds between 1.4 and 1.6, but I am not aware of other results in that direction… Here are the slides used by Jacopo: