Bayesian computation with empirical likelihood and no A
We just resubmitted our paper to PNAS about using empirical likelihood for conducting Bayesian computation. Although this is an approximation as well, we removed the A (for approximation) from the title and from the name of the method, BCel, to comply with a referee’s request and also account for several comments during our seminars that this was not ABC! We can see the point in those comments, namely that ABC is understood as a corpus of methods that rely on the simulation of pseudo-datasets to compensate for the missing likelihood, while empirical likelihood stands as another route bypassing this difficulty… I keep my fingers crossed that this ultimate revision is convincing enough for the PNAS board!
Coincidentally, Jean-Pierre Florens came to give a (Malinvaud) seminar at CREST today about semi-parametric Bayesian modelling, mixing Gaussian process priors with generalised moment conditions. This was a fairly involved talk with a lot of technical details about RKHS spaces and a mix of asymptotics and conjugate priors (somewhat empirical Bayesianish in spirit!) In a sense, it was puzzling because the unknown distribution was modelled conditional on an unknown parameter, θ, which itself was a function of this distribution. It was however quite interesting in that it managed to mix Gaussian process priors with some sort of empirical likelihood (or GMM). Furthermore, in a sort of antithesis to our approach with empirical likelihood, Florens and Simoni had a plethora of moment restrictions they called over-identification and used this feature to improve the estimation of the underlying density. There were also connections with Fukumizu et al. kernel Bayes’ rule perspective, even though I am not clear about the later. I also got lost here by the representation of the data as a point in an Hilbert space, thanks to a convolution step. (The examples involved orthogonal polynomials like Lagrange’s or Hermitte’s, which made sense as the data was back to a finite dimension!) Once again, the most puzzling thing is certainly over-identification: in an empirical likelihood version, it would degrade the quality of the approximation by peaking more and more the approximation. It does not appear to cause such worries in Florens’ and Simoni’s perspective.