inference in Kingman’s coalescent with pMCMC
As I was checking the recent stat postings on arXiv, I noticed the paper by Chen and Xie entitled inference in Kingman’s coalescent with pMCMC. (And surprisingly deposited in the machine learning subdomain.) The authors compare a pMCMC implementation for Kingman’s coalescent with importance sampling (à la Stephens & Donnelly), regular MCMC and SMC. The specific feature of their pMCMC algorithm is that they simulate the coalescent times conditional on the tree structure and the tree structure conditional on the coalescent times (via SMC). The results reported in the paper consider up to five loci and agree with earlier experiments showing poor performance of MCMC algorithms (based on the LAMARC software and apparently using independent proposals). They show similar performance for importance sampling and pMCMC. While I find this application of pMCMC interesting, I wonder about the generality of the approach: when I was introduced to ABC techniques, the motivation was that importance sampling deteriorates very quickly with the number of parameters. Here it seems the authors only considered one parameter θ. I wonder what happens when the number of parameters increases. And how pMCMC would then compare with ABC.
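To make the worry about importance sampling and dimension concrete, here is a minimal sketch on a toy problem that has nothing to do with the coalescent or with the Chen and Xie implementation. It assumes a standard Gaussian target in dimension `dim` and a slightly wider Gaussian proposal (the function name, the `scale` of 1.5 and the sample size are all arbitrary choices made for illustration), and it reports the effective sample size of the self-normalised importance weights.

```python
import math
import random


def effective_sample_size(dim, n_samples=2000, scale=1.5, seed=0):
    """ESS of importance weights for target N(0, I) and proposal N(0, scale^2 I).

    Toy illustration only: the target, proposal and defaults are assumptions,
    not anything taken from the paper discussed above.
    """
    rng = random.Random(seed)
    log_w = []
    for _ in range(n_samples):
        # Squared norm of one draw from the proposal.
        sq_norm = sum(rng.gauss(0.0, scale) ** 2 for _ in range(dim))
        # log target - log proposal; the (2*pi)^(-dim/2) constants cancel.
        log_w.append(dim * math.log(scale)
                     - 0.5 * (1.0 - 1.0 / scale ** 2) * sq_norm)
    # Subtract the maximum before exponentiating, for numerical stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    return sum(w) ** 2 / sum(x * x for x in w)


ess_by_dim = {d: effective_sample_size(d) for d in (1, 5, 20)}
for d, ess in ess_by_dim.items():
    print(f"dim={d:2d}  ESS={ess:8.1f} of 2000")
```

In this toy setting the second moment of the weights grows geometrically with the dimension, so the effective sample size shrinks at the same rate even though the proposal is a reasonable match in each coordinate. Whether the same degradation hits the coalescent setting with several parameters is exactly the open question.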
May 22, 2013 at 1:25 am
Surely you can make a PMCMC scheme that doesn’t degrade. Take an infinite dimensional MCMC scheme coupled with a similar SMC scheme. (This should be possible if there is structure in the parameter posterior, which there must be eventually if the model is identifiable.) Similarly with exact-approximate methods – you can find (unrealistic) asymptotic regimes for which the dimension of the parameter space for essentially infinite dimensional spatial problems doesn’t affect the performance of the scheme. (I think… The trick should be to combine a dimension-independent MCMC scheme with a dimension-independent approximation to the likelihood. It’s certainly possible, but only in really dumb situations, like outfill asymptotics for spatial problems….)
But (theoretical) correctness, asymptotic exactness and dimension independence aren’t necessarily useful things… I would love it if someone could prove that, given a random sequence, you can construct a pseudo-marginal MCMC scheme that is exact for an arbitrary distribution that is coupled with the random sequence for an arbitrary time T (in expectation, maybe)… It would be one hell of a companion to the paper that Geoff Nicholls, Colin Fox and Alexis Muir-Watt wrote!