
minibatch acceptance for Metropolis-Hastings

Posted in Books, Statistics on January 12, 2018 by xi'an

An arXival that appeared last July, by Seita, Pan, Chen, and Canny, which relates to my current interest in speeding up MCMC, and to the 2014 papers by Korattikara et al. and Bardenet et al. It has since been published in Uncertainty in AI. The authors claim that their method requires less data per iteration than these earlier ones…

“Our test is applicable when the variance (over data samples) of the log probability ratio between the proposal and the current state is less than one.”

By test, the authors mean a minibatch formulation of the Metropolis-Hastings acceptance ratio in the (special) setting of iid data. First, they use Barker's version of the acceptance probability instead of Metropolis'. Second, they use a Gaussian approximation to the distribution of the logarithm of the Metropolis ratio for the minibatch, while the Barker acceptance step corresponds to comparing a logistic perturbation of the logarithm of the Metropolis ratio against zero. Which amounts to comparing the logarithm of the Metropolis ratio for the minibatch, perturbed by a logistic minus Normal variate, against zero. (The cancellation of the Normal in eqn (13) is a form of fiducial fallacy, where the Normal variate has two different meanings. In other words, the difference of two Normal variates is not equal to zero.) However, the next step escapes me, as the authors seek to optimise the distribution of this logistic minus Normal variate. Which I thought was uniquely defined as such a difference. Another constraint is that the estimated variance of the log-likelihood ratio falls below one. (Why one?) The argument is that the average of the individual log-likelihoods is approximately Normal by virtue of the Central Limit Theorem. Even when randomised. While the illustrations on a Gaussian mixture and on a logistic regression demonstrate huge gains in computational time, it is unclear to me to what extent one can trust the approximation for a given model and sample size…
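For the record, here is my schematic reading of the construction, with my own notation and obviously only a sketch of the authors' argument. Writing

\[ \Delta = \log \frac{\pi(\theta')\,q(\theta\mid\theta')}{\pi(\theta)\,q(\theta'\mid\theta)}\,, \]

Barker's step accepts the move whenever \Delta + V > 0 for V \sim \mathrm{Logistic}(0,1), since \mathbb{P}(\Delta + V > 0) = e^{\Delta}/(1+e^{\Delta}). The minibatch estimate \Delta^* is treated as \Delta + \mathcal{N}(0,\sigma^2) via the CLT, and the proposed test accepts whenever \Delta^* + X_{\mathrm{corr}} > 0, with X_{\mathrm{corr}} drawn from a correction distribution C_\sigma built so that \mathcal{N}(0,\sigma^2) + C_\sigma is (approximately) back to a Logistic(0,1) variate, which is where the constraint on \sigma^2 enters.

And a minimal Python sketch of what one such iteration could look like, assuming iid data, a symmetric proposal, and per-observation log-likelihoods; the correction draw is a crude Normal moment-matching stand-in, not the deconvolution tabulated in the paper, and all helper names are mine:

import numpy as np

def minibatch_barker_step(theta, theta_prop, data, log_lik, log_prior,
                          batch_size=100, rng=None):
    # One iteration of a minibatch Barker-type test -- a schematic sketch only.
    # log_lik(theta, x) returns per-observation log-likelihoods for the array x;
    # log_prior(theta) returns the log-prior. The correction draw below is a crude
    # Normal moment-matching stand-in, not the deconvolution C_sigma of the paper.
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    idx = rng.choice(n, size=batch_size, replace=False)
    lam = log_lik(theta_prop, data[idx]) - log_lik(theta, data[idx])
    # minibatch estimate of the log Metropolis ratio (symmetric proposal assumed)
    delta_star = n * lam.mean() + log_prior(theta_prop) - log_prior(theta)
    # CLT-based estimate of its variance over minibatch subsampling
    sigma2 = n ** 2 * lam.var(ddof=1) / batch_size
    if sigma2 >= 1.0:
        return None  # test not applicable; the paper enlarges the batch instead
    # stand-in correction: Normal carrying the variance the Logistic(0,1) is missing
    x_corr = rng.normal(0.0, np.sqrt(np.pi ** 2 / 3.0 - sigma2))
    return theta_prop if delta_star + x_corr > 0.0 else theta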