Archive for kernel Bayes rule

kernel approximate Bayesian computation for population genetic inferences

Posted in Statistics, University life with tags , , , , on May 22, 2012 by xi'an

A new posting about ABC on arXiv by Shigeki Nakagome, Kenji Fukumizu, and Shuhei Mano entitled kernel approximate Bayesian computation for population genetic inferences argues about an improvement brought by the use of reproducing kernel Hilbert space (RKHS) perspective in ABC methodology, when compared with more standard ABC relying on a rather arbitrary choice of summary statistics and metric. However, I feel that the paper does not substantially defend this point, only using a simulation experiment to compare mean square errors. In particular, the claim of consistency is unsubstantiated, as is the counterpoint that “conventional ABC did not have consistency” (page 14) [and several papers, including the just published Read Paper by Fearnhead and Prangle, claim the opposite]. Furthermore, a considerable amount of space is taken in the paper by the description of the existing ABC algorithms, while the complete version of the new kernel ABC-RKHS algorithm is missing. In particular, the coverage of kernel Bayes is too sketchy to be comprehensible [at least to me] without additional study. Actually, I do not get the notion of kernel Bayes’ rule, which seems defined only in terms of expectations

\mathbb{E}[f(\theta)|s]=\sum_i w_i f(\theta_i),

where the weights are the ridge-like matrix

w_i=\sum_j (\mathbf{G}_S + n\epsilon_n \mathbf{I}_n)^{-1}_{ij}k(s_i,s_j)

where the parameter is generated from the prior, the data s is generated from the sampling distribution, and the matrix GS is made of the k(si,sj)‘s. The surrounding Hilbert space presentation does not seem particularly relevant, esp. in population genetics… I am also under the impression that the choice of the kernel function k(.,.) is as important as the choice of the metric in regular ABC, although this is not discussed in the paper, since it implies [among other things] the choice of a metric. The implementation uses a Gaussian kernel and an Euclidean metric, which involves assumptions on the homogeneous nature of the components of the summary statistics or of the data. Similarly, the “regularization” parameter εn needs to be calibrated and the paper is unclear about this, apparently picking the parameter that “showed the smallest MSEs” (page 10), which cannot be called a calibration. (There is a rather unimportant proposition about concentration of information on page 6 which proof relies on two densities being ordered, see top of page 7.)