## likelihood-free inference in high-dimensional models

“…for a general linear model (GLM), a single linear function is a sufficient statistic for each associated parameter…”

The recently arXived paper “Likelihood-free inference in high-dimensional models“, by Kousathanas et al. (July 2015), proposes an ABC resolution of the dimensionality curse [when the dimension of the parameter and of the corresponding summary statistics] by turning Gibbs-like and by using a component-by-component ABC-MCMC update that allows for low dimensional statistics. In the (rare) event there exists a conditional sufficient statistic for each component of the parameter vector, the approach is just as justified as when using a generic ABC-Gibbs method based on the whole data. Otherwise, that is, when using a non-sufficient estimator of the corresponding component (as, e.g., in a generalised [not general!] linear model), the approach is less coherent as there is no joint target associated with the Gibbs moves. One may therefore wonder at the convergence properties of the resulting algorithm. The only safe case [in dimension 2] is when one of the restricted conditionals does not depend on the other parameter. Note also that each Gibbs step a priori requires the simulation of a new pseudo-dataset, which may be a major imposition on computing time. And that setting the tolerance for each parameter is a delicate calibration issue because in principle the tolerance should depend on the other component values. I ran a comparative experiment on a toy normal target, using either empirical mean and variance (blue), or empirical median and mad (brown), with little difference in the (above) output. Especially when considering that I set the tolerance somewhat arbitrarily. This could be due to the fact that the pairs are quite similar in terms of their estimation properties. However, I then realised that the empirical variance is not sufficient for the variance conditional on the mean parameter. I looked at the limiting case (with zero tolerance), which amounts to simulating σ first and then μ given σ, and ran a (Gibbs and iid) simulation. The difference, as displayed below (red standing for the exact ABC case), is not enormous, even though it produces a fatter tail in μ. Note the interesting feature that I cannot produce the posterior distribution of the parameters given the median and mad statistics. Which is a great introduction to ABC!

N=10 data=rnorm(N,mean=3,sd=.5) #ABC with insufficient statistics medata=median(data) madata=mad(data) varamad=rep(0,100) for (i in 1:100) varamad[i]=mad(sample(data,N,rep=TRUE)) tol=c(.01*mad(data),.05*mad(varamad)) T=1e6 mu=rep(median(data),T) sig=rep(mad(data),T) for (t in 2:T){ mu[t]=rnorm(1) psudo=rnorm(N,mean=mu[t],sd=sig[t-1]) if (abs(medata-median(psudo))>tol[1]) mu[t]=mu[t-1] sig[t]=1/rexp(1) psudo=rnorm(N,mean=mu[t],sd=sig[t]) if (abs(madata-mad(psudo))>tol[2]) sig[t]=sig[t-1] } #ABC with more sufficient statistics meaata=mean(data) sddata=sd(data) varamad=rep(0,100) for (i in 1:100) varamad[i]=sd(sample(data,N,rep=TRUE)) tol=c(.1*sd(data),.1*sd(varamad)) for (t in 2:T){ mu[t]=rnorm(1) psudo=rnorm(N,mean=mu[t],sd=sig[t-1]) if (abs(meaata-mean(psudo))>tol[1]) mu[t]=mu[t-1] sig[t]=1/rexp(1) psudo=rnorm(N,mean=mu[t],sd=sig[t]) if (abs(sddata-sd(psudo))>tol[2]) sig[t]=sig[t-1] } #MCMC with false sufficient sig=1/sqrt(rgamma(T,shape=.5*N,rate=1+.5*var(data))) mu=rnorm(T,mean(data)/(1+sig^2/N),sd=1/sqrt(1+N/sig^2)))

September 2, 2015 at 4:40 pm

Unfortunately, Facebook groups always require a log-in, even if they are public (the ABC Facebook group is now public, by the way).

September 4, 2015 at 10:20 am

Thanks for trying, and no worries!

September 1, 2015 at 5:48 pm

[…] likelihood-free inference in high-dimensional models […]

September 1, 2015 at 2:29 am

I commented on this paper a few weeks ago on the `Approximate Bayesian Computation’ facebook page (yes there is one!):

“So this idea (using low dimensional *conditional* summary stats to do marginal updates in MCMC/Gibbs samplers) is fine if you have exact conditional sufficient stats, as then you get the true posterior [for a zero tolerance]. However, if you do this with insufficient statistics for any of the marginal updates, you run the risk of defining a joint distribution by potentially inconsistent marginal distributions. Which means that your implied joint distribution may not be unique, or may not even exist! As we are doing an ABC approximation by accepting samples within an epsilon-ball at each stage, even this perturbation might be enough to produce inconsistent marginals in some (or even general) settings …”

September 1, 2015 at 5:49 am

Thanks, Scott: could you provide us with the Facebook link?

September 2, 2015 at 1:16 am

I think this is the URL https://www.facebook.com/groups/921246634574876/ but you may need to log in first.

The group is organised by j-ISBA. If you want to join, just request to be a member and a j-ISBA admin will sort you out.

September 2, 2015 at 8:55 am

Thanks, Scott; as I closed my Facebook account more than a year ago, I cannot indeed get access to this group.

September 2, 2015 at 11:12 am

It should be possible to change it to an open group that non-members of facebook can read. I’ll ask j-ISBA if there’s a particular reason they set it up as a closed group.

September 2, 2015 at 1:05 pm

Thanks, Dennis!

September 2, 2015 at 5:50 pm

Sorry it turns out that doesn’t work for facebook groups, only pages. It’ll just have to remain mysterious to non-facebook users! (It’s mainly links to recent papers, but there’s also a few other resources and short discussions.)