## Bayesian sampling without tears

Posted in Books, Kids, R, Statistics on May 24, 2022 by xi'an

Following a question on Stack Overflow trying to replicate a figure from the paper written by Alan Gelfand and Adrian Smith (1990) for The American Statistician, Bayesian sampling without tears, which precedes their historical MCMC papers, I looked at the R code produced by the OP and could not spot an issue as to why their simulation did not fit the posterior produced in the paper. Which proposes acceptance-rejection and sampling-importance-resampling as two solutions to approximately simulate from the posterior. The latter is illustrated by simulating from the prior and weighting the draws by the likelihood… The illustration uses 3 observations of the sum of two Binomials with different success probabilities, θ¹ and θ², with a Uniform prior on both.

```
# importance weights: product over the 3 observations of the likelihood at theta[i,]
# (the data y, n1, n2 and the N prior draws theta assumed already defined)
l <- rep(1, N)
for (i in 1:N){
  for (k in 1:3){
    llh <- 0
    # sum over the latent count j from the first Binomial
    for (j in max(0, y[k]-n2[k]):min(y[k], n1[k]))
      llh <- llh + choose(n1[k], j)*choose(n2[k], y[k]-j)*
        theta[i,1]^j*(1-theta[i,1])^(n1[k]-j)*theta[i,2]^(y[k]-j)*
        (1-theta[i,2])^(n2[k]-y[k]+j)
    l[i] <- l[i]*llh}}
```
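
For completeness, a minimal sketch of the resampling step of sampling-importance-resampling, turning these weights into an approximate posterior sample (with l and theta as computed above; the resample size M is an arbitrary illustrative choice):

```
# resample the prior draws with probabilities proportional to their likelihood weights
M <- 1e4
post <- theta[sample(1:N, M, replace=TRUE, prob=l/sum(l)),]
```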

To double-check, I also wrote a Gibbs version:

```
# Gibbs sampler alternating the latent first-Binomial counts x1 and (theta1, theta2)
# (y, n1, n2 and the number of iterations T assumed already defined)
theta <- matrix(runif(2), nrow=T, ncol=2)  # only the first row matters as starting value
x1 <- rep(NA, 3)
for (t in 1:(T-1)){
  for (j in 1:3){
    # support of the latent count for observation j
    a <- max(0, y[j]-n2[j]):min(y[j], n1[j])
    x1[j] <- sample(a, 1,
      prob=choose(n1[j], a)*choose(n2[j], y[j]-a)*
        theta[t,1]^a*(1-theta[t,1])^(n1[j]-a)*
        theta[t,2]^(y[j]-a)*(1-theta[t,2])^(n2[j]-y[j]+a))}
  # conjugate Beta updates of the success probabilities under the Uniform priors
  theta[t+1,1] <- rbeta(1, sum(x1)+1, sum(n1)-sum(x1)+1)
  theta[t+1,2] <- rbeta(1, sum(y)-sum(x1)+1, sum(n2)-sum(y)+sum(x1)+1)}
```

which did not show any difference with the above. Nor with the likelihood surface.
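
For reference, a rough sketch of how the likelihood surface can be evaluated over a (θ¹, θ²) grid for such a comparison (same y, n1, n2 as above; the grid resolution is an arbitrary choice):

```
# likelihood of the 3 observations at (t1,t2), summing out the latent counts
lik <- function(t1, t2) prod(sapply(1:3, function(k){
  j <- max(0, y[k]-n2[k]):min(y[k], n1[k])
  sum(choose(n1[k], j)*choose(n2[k], y[k]-j)*
    t1^j*(1-t1)^(n1[k]-j)*t2^(y[k]-j)*(1-t2)^(n2[k]-y[k]+j))}))
grid <- seq(.01, .99, length.out=99)
contour(grid, grid, outer(grid, grid, Vectorize(lik)), xlab="theta1", ylab="theta2")
```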

## divide & reconquer

Posted in Books, Statistics, University life on February 5, 2018 by xi'an

Qi Liu, Anindya Bhadra, and William Cleveland from Purdue have arXived a paper entitled Divide and Recombine for Large and Complex Data: Model Likelihood Functions using MCMC. Which is a variation on the earlier divide & … papers attempting to handle large datasets. The beginning is quite similar to these earlier papers in that the likelihood is split into sub-likelihoods, approximated from MCMC samples and recombined into an approximate full likelihood. As in, for instance, Scott et al., one approximation used for the subsamples is to replace the likelihood with a Normal approximation, or a skew-Normal generalisation, which remains a limited choice for heavy-tailed likelihoods. Producing a Normal and skew-Normal approximation for the whole [data] likelihood, respectively. If I understand correctly, these approximations are missing a normalising constant to bring them to scale with the true likelihood, which I do not completely understand as the likelihood only needs to be defined up to a constant for most purposes, including Bayesian ones. The method of estimation of this constant proposed therein is called the contour probability algorithm and it consists in using a highest density region to compare a likelihood and its approximation. (Nothing to do with our adaptation of Gelfand and Dey (1994) based on HPDs, with Darren Wraith. Nor with nested sampling.) Returning a form of qq-plot. This is rather exploratory, while hardly addressing the issue of the precision of such approximations, the resolution of conflicting proposals, or the comparison with all these other recent proposals for splitting likelihoods into manageable bits (proposals that are mentioned in the final section, including our recentering scheme with my student Changye Wu).
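
To fix ideas on the recombination step, here is a minimal sketch (not the paper's implementation) of the Normal version: each sub-likelihood is replaced by a Gaussian fitted to the MCMC draws for its subsample, and the K Gaussian factors are multiplied back into a single Gaussian approximation of the full likelihood. All names and arguments below are illustrative:

```
# product of K Gaussian factors N(mu_k, Sigma_k): the result is Gaussian with
# precision the sum of the precisions and mean their precision-weighted average,
# e.g. with mu[k,]=colMeans(draws_k) and Sigma[[k]]=cov(draws_k) for subsample k
combine <- function(mu, Sigma){  # mu: K x d matrix of means, Sigma: list of K covariances
  P <- 0*Sigma[[1]]; b <- rep(0, ncol(mu))
  for (k in 1:nrow(mu)){
    Pk <- solve(Sigma[[k]])   # precision of the k-th Gaussian factor
    P <- P + Pk
    b <- b + Pk %*% mu[k,]}
  V <- solve(P)
  list(mean=drop(V %*% b), var=V)}
```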