## useR! 2022 [all virtual]

Posted in R, Statistics, University life on May 27, 2022 by xi'an

## Bayesian sampling without tears

Posted in Books, Kids, R, Statistics on May 24, 2022 by xi'an

Following a question on Stack Overflow trying to replicate a figure from the paper written by Alan Gelfand and Adrian Smith (1990) for The American Statistician, Bayesian sampling without tears, which precedes their historical MCMC papers, I looked at the R code produced by the OP and could not spot an issue explaining why their simulation did not fit the posterior produced in the paper. The paper proposes acceptance-rejection and sampling-importance-resampling as two solutions to approximately simulate from the posterior, the latter being illustrated by simulations from the prior weighted by the likelihood… The illustration is made of 3 observations from the sum of two Binomials with different success probabilities, θ₁ and θ₂, with a Uniform prior on both.

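    # theta: N draws from the Uniform prior; l: likelihood weights
    # (N, n1, n2 and y are taken as already defined)
    theta <- matrix(runif(2*N), ncol=2)
    l <- rep(1, N)
    for (i in 1:N)
      for (k in 1:3){
        llh <- 0
        # j successes from the first Binomial, y[k]-j from the second
        for (j in max(0,y[k]-n2[k]):min(y[k],n1[k]))
          llh <- llh+choose(n1[k],j)*choose(n2[k],y[k]-j)*
            theta[i,1]^j*(1-theta[i,1])^(n1[k]-j)*theta[i,2]^(y[k]-j)*
            (1-theta[i,2])^(n2[k]-y[k]+j)
        l[i] <- l[i]*llh}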
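The loop above only produces unnormalised importance weights; the resampling half of sampling-importance-resampling can be sketched as follows. This is a minimal, vectorised sketch rather than the code from the Stack Overflow question: the data values `n1`, `n2`, `y`, the helper name `lik_one`, and the sample sizes `N` and `M` are all assumptions made for illustration.

```r
set.seed(1)
# Illustrative data (assumed for this sketch): Binomial sizes and observed sums
n1 <- c(5, 6, 4); n2 <- c(5, 4, 6); y <- c(7, 5, 6)

# Likelihood of one observed sum yk = X1 + X2, vectorised over prior draws
lik_one <- function(th1, th2, n1k, n2k, yk) {
  out <- numeric(length(th1))
  # j = feasible number of successes coming from the first Binomial
  for (j in max(0, yk - n2k):min(yk, n1k))
    out <- out + dbinom(j, n1k, th1) * dbinom(yk - j, n2k, th2)
  out
}

N <- 1e4
theta <- matrix(runif(2 * N), ncol = 2)   # draws from the Uniform prior
w <- rep(1, N)
for (k in seq_along(y))                   # product of the three likelihood terms
  w <- w * lik_one(theta[, 1], theta[, 2], n1[k], n2[k], y[k])

# Resampling step: M indices drawn with probability proportional to the weights
M <- 1e3
idx <- sample.int(N, M, replace = TRUE, prob = w)
post <- theta[idx, , drop = FALSE]        # approximate sample from the posterior
colMeans(post)
```

Resampling with replacement keeps the sketch short; the resulting `post` is only an approximate posterior sample, and the approximation improves as `N` grows relative to `M`.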


To double-check, I also wrote a Gibbs version:

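    # Gibbs sampler with data augmentation (T, n1, n2 and y taken as already defined)
    theta=matrix(runif(2),nrow=T,ncol=2)
    x1=rep(NA,3)  # latent successes from the first Binomial
    for(t in 1:(T-1)){
      for(j in 1:3){
        a<-max(0,y[j]-n2[j]):min(y[j],n1[j])
        # draw an index, since sample(a,1) misbehaves when a has length one
        x1[j]=a[sample.int(length(a),1,
          prob=choose(n1[j],a)*choose(n2[j],y[j]-a)*
            theta[t,1]^a*(1-theta[t,1])^(n1[j]-a)*
            theta[t,2]^(y[j]-a)*(1-theta[t,2])^(n2[j]-y[j]+a)
        )]}
      # conjugate Beta updates given the completed data
      theta[t+1,1]=rbeta(1,sum(x1)+1,sum(n1)-sum(x1)+1)
      theta[t+1,2]=rbeta(1,sum(y)-sum(x1)+1,sum(n2)-sum(y)+sum(x1)+1)}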


which did not show any difference from the above, nor from the likelihood surface.
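Since the prior is Uniform on both probabilities, the posterior is proportional to the likelihood, so the surface used for this comparison can be evaluated directly on a grid. The following is a hedged sketch, not the code behind the original figure: the data values, the function name `lik` and the grid size are assumptions.

```r
# Illustrative data (assumed for this sketch)
n1 <- c(5, 6, 4); n2 <- c(5, 4, 6); y <- c(7, 5, 6)

# Likelihood of the three observed sums, vectorised in (th1, th2)
lik <- function(th1, th2) {
  out <- rep(1, length(th1))
  for (k in seq_along(y)) {
    lk <- numeric(length(th1))
    for (j in max(0, y[k] - n2[k]):min(y[k], n1[k]))
      lk <- lk + dbinom(j, n1[k], th1) * dbinom(y[k] - j, n2[k], th2)
    out <- out * lk
  }
  out
}

g <- seq(0.005, 0.995, length.out = 100)
surf <- outer(g, g, lik)   # surf[i, j] = likelihood at theta1 = g[i], theta2 = g[j]
surf <- surf / sum(surf)   # normalised over the grid
# contour(g, g, surf, xlab = expression(theta[1]), ylab = expression(theta[2]))
# points(theta, pch = ".") # overlay simulated draws, when a matrix theta is available
```

The two plotting calls are left commented out so that the sketch runs without a graphics device.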

## 34th Panhellenic Statistics Conference

Posted in Statistics, Travel, University life on May 22, 2022 by xi'an

## Another harmonic mean

Posted in Books, Statistics, University life on May 21, 2022 by xi'an

Yet another paper that addresses the approximation of the marginal likelihood by a truncated harmonic mean, a popular theme of mine: a 2020 paper by Johannes Reich, entitled Estimating marginal likelihoods from the posterior draws through a geometric identity and published in Monte Carlo Methods and Applications.

The geometric identity it aims at exploiting is that

$m(x) = \frac{\int_A \,\text{d}\theta}{\int_A \frac{\pi(\theta|x)}{\pi(\theta)f(x|\theta)}\,\text{d}\theta}$

for any (positive volume) compact set $A$. This is exactly the same identity as in an earlier and uncited 2017 paper by Anna Pajor, with the also quite similar (!) title Estimating the Marginal Likelihood Using the Arithmetic Mean Identity, which I discussed on the ‘Og, linked with another 2012 paper by Lenk. Also discussed here. This geometric or arithmetic identity is again related to the harmonic mean correction based on an HPD region A that Darren Wraith and I proposed at MaxEnt 2009, and that Jean-Michel and I presented at Frontiers of statistical decision making and Bayesian analysis in 2010.
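The identity is a one-line consequence of Bayes' formula, $\pi(\theta|x)=\pi(\theta)f(x|\theta)/m(x)$. Writing $|A|$ for the volume of $A$ (a notation introduced here, not taken from the paper), the denominator simplifies as

$\int_A \frac{\pi(\theta|x)}{\pi(\theta)f(x|\theta)}\,\text{d}\theta = \int_A \frac{1}{m(x)}\,\text{d}\theta = \frac{|A|}{m(x)}$

and, since this integral is the posterior expectation of $\mathbb{I}_A(\theta)/\pi(\theta)f(x|\theta)$, posterior draws $\theta_1,\ldots,\theta_T$ suggest the estimate

$\hat m(x) = |A| \Big/ \frac{1}{T}\sum_{t=1}^T \frac{\mathbb{I}_A(\theta_t)}{\pi(\theta_t)f(x|\theta_t)}$

which is the generic Monte Carlo version of the identity, not necessarily the exact implementation in either paper.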

In this avatar, the set A is chosen close to an HPD region, once more, with a structure that allows for an exact computation of its volume. Namely an ellipsoid that contains roughly 50% of the simulations from the posterior (rather than our non-intersecting union of balls centered at the 50% HPD points), which assumes a Euclidean structure of the parameter space (or, in other words, depends on the parameterisation). In the mixture illustration, the author surprisingly omits Chib’s solution, despite symmetrised versions avoiding the label (un)switching issues. What I do not get is how this solution gets around the label switching challenge, in that the set A remains an ellipsoid for multimodal posteriors, which means it either corresponds to a single mode [but then how can a simulation be restricted to a “single permutation of the indicator labels”?] or it covers all modes but also the unlikely valleys in-between.
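To see the truncated identity at work, here is a toy check in a conjugate model where the evidence is available in closed form. Everything in it is an assumption chosen for illustration and unrelated to the mixture example of the paper: a single observation x from N(θ,1), a N(0,1) prior on θ, exact posterior draws, and an interval A that stands in for the ellipsoid in dimension one.

```r
set.seed(42)
x <- 1.5                                 # single observation (illustrative value)
nsim <- 1e5
# Exact posterior for x ~ N(theta, 1) and theta ~ N(0, 1): N(x/2, 1/2)
th <- rnorm(nsim, mean = x / 2, sd = sqrt(1 / 2))

# A: central interval holding about 50% of the posterior draws
A <- quantile(th, c(0.25, 0.75))
inA <- th >= A[1] & th <= A[2]
vol <- unname(diff(A))                   # volume (length) of A

# Volume of A over the posterior average of 1_A / (prior x likelihood)
m_hat <- vol / mean(inA / (dnorm(th) * dnorm(x, mean = th)))
m_true <- dnorm(x, mean = 0, sd = sqrt(2))   # closed-form evidence, x ~ N(0, 2)
c(estimate = m_hat, truth = m_true)
```

Restricting the average to A keeps the ratio bounded, which is what separates this version from the raw harmonic mean estimator and its potentially infinite variance.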

## evidence estimation in finite and infinite mixture models

Posted in Books, Statistics, University life on May 20, 2022 by xi'an

Adrien Hairault (PhD student at Dauphine), Judith and I just arXived a new paper on evidence estimation for mixtures. This may sound like a well-trodden path that I have repeatedly explored in the past, but methinks that estimating the model evidence doth remain a notoriously difficult task for large sample or many component finite mixtures and even more for “infinite” mixture models corresponding to a Dirichlet process. The different Monte Carlo techniques advocated in the past, like Chib’s (1995) method, SMC, or bridge sampling, exhibit a range of performances in terms of computing time… One novel (?) approach in the paper is to write Chib’s (1995) identity for partitions rather than parameters, as it bypasses the label switching issue (as we already noted in Hurn et al., 2000), another one is to exploit Geyer’s (1991-1994) reverse logistic regression technique in the more challenging Dirichlet mixture setting, and yet another one is a sequential importance sampling solution à la Kong et al. (1994), as also noticed by Carvalho et al. (2010). [We did not cover nested sampling as it quickly becomes onerous.]
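As a reminder, and in the notation of the harmonic mean post above, Chib’s (1995) identity in its standard parameter form is Bayes’ formula rearranged at an arbitrary value $\theta^*$ with positive posterior density:

$m(x) = \frac{f(x|\theta^*)\,\pi(\theta^*)}{\pi(\theta^*|x)}$

where the posterior density at $\theta^*$ must itself be approximated from the simulation output. The partition version mentioned above replaces $\theta^*$ with a fixed partition of the sample; its exact form is given in the paper and is not reproduced here.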

Applications are numerous. In particular, testing for the number of components in a finite mixture model or against the fit of a finite mixture model for a given dataset has long been and still is an issue of much interest and diverging opinions, albeit still missing a fully satisfactory resolution. Using a Bayes factor to find the right number of components K in a finite mixture model is known to provide a consistent procedure. We furthermore establish there the consistency of the Bayes factor when comparing a parametric family of finite mixtures against the nonparametric ‘strongly identifiable’ Dirichlet Process Mixture (DPM) model.