## Archive for scaling

## probability comparisons

Posted in Books, Kids, pictures, Statistics with tags calibration, comics, frequentism, scaling, xkcd on November 6, 2020 by xi'an

## my likelihood is dominating my prior [not!]

Posted in Kids, Statistics with tags Bayesian inference, cross validated, likelihood function, Likelihood Principle, magnitude, scaling on August 29, 2019 by xi'an

**A**n interesting misconception read on X validated today, with a confusion between the absolute value of the likelihood function and its variability. Which I have trouble explaining except possibly by an extrapolation from the discrete case and a confusion between the probability density of the data [scaled as a probability] and the likelihood function [scale-less]. I also had trouble convincing the originator of the question of the irrelevance of the scale of the likelihood *per se*, even when demonstrating that an arbitrary multiplicative constant could vanish from the posterior with no consequence whatsoever. It is only when I thought of the case when the likelihood is constant in ξ that I managed to make my case.
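To see why the scale cannot matter, here is a minimal numerical check (my own illustration, not part of the original post): multiplying the likelihood by an arbitrary constant leaves the normalised posterior on a grid strictly unchanged, since the constant cancels in the normalisation.

```python
# A minimal sketch (not from the post): on a Gaussian toy model, the
# normalised posterior is invariant to any rescaling of the likelihood.
import numpy as np

theta = np.linspace(-5, 5, 1001)            # grid over the parameter theta
prior = np.exp(-theta**2 / 2)               # unnormalised N(0,1) prior
x = 1.3                                     # a single observation
lik = np.exp(-(x - theta)**2 / 2)           # N(theta,1) likelihood on the grid

def posterior(scale):
    unnorm = prior * (scale * lik)          # scale multiplies the likelihood
    return unnorm / np.trapz(unnorm, theta) # the constant cancels here

# identical posteriors whatever the magnitude of the likelihood
assert np.allclose(posterior(1.0), posterior(1e-12))
assert np.allclose(posterior(1.0), posterior(1e+12))
```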

## revised empirical HMC

Posted in Statistics, University life with tags eHMC, github, Hamiltonian Monte Carlo, leapfrog integrator, NUTS, Rao-Blackwellisation, revision, scaling, STAN on March 12, 2019 by xi'an

**F**ollowing the informed and helpful comments from Matt Graham and Bob Carpenter on our eHMC paper [arXival] last month, we produced a revised and re-arXived version of the paper, based on new experiments run by Changye Wu and Julien Stoehr. Here are some quick replies to these comments, reproduced for convenience. *(Warning: this is a loooong post, much longer than usual.)*

## scalable Metropolis-Hastings

Posted in Books, Statistics, Travel with tags delayed acceptance, Fukui-Todo procedure, Hamiltonian Monte Carlo, Langevin MCMC algorithm, PDMP, scalable MCMC, scaling, Taylor expansion, thinning, University of Oxford on February 12, 2019 by xi'an

**A**mong the flurry of arXived papers of last week (414!), including a fair chunk of papers submitted to ICML 2019, I spotted one entry by Cornish et al. on scalable Metropolis-Hastings, which Arnaud Doucet had mentioned to me yesterday when in Oxford. The paper builds on the delayed acceptance paper we wrote with Marco Banterle, Clara Grazian and Anthony Lee, itself relying on a factorisation decomposition of the likelihood, combined with control variate accelerating techniques. The factorisation of both the target and the proposal allows for a (less efficient) Metropolis-Hastings acceptance ratio that is the product

$$\prod_{i=1}^{m} \min\left\{1,\ \frac{\pi_i(\theta')\,q_i(\theta',\theta)}{\pi_i(\theta)\,q_i(\theta,\theta')}\right\}$$

of individual Metropolis-Hastings acceptance ratios, but which allows for quicker rejection if one of the probabilities in the product is small, because the corresponding Bernoulli draw is zero with high probability. One advance made in Michel et al. (2017) [which I doubly missed] is that subsampling is achievable by thinning (as in PDMPs, where these authors have been quite active) through an algorithm of Shanthikumar (1985) [described in Devroye's bible], provided each Metropolis-Hastings probability can be lower bounded, say as

$$\min\left\{1,\ \frac{\pi_i(\theta')\,q_i(\theta',\theta)}{\pi_i(\theta)\,q_i(\theta,\theta')}\right\} \;\geq\; \exp\{-\varphi_i\,\psi(\theta,\theta')\},$$

by a term where the transition *ψ* does not depend on the index *i* in the product. The computing cost of the thinning process thus depends on the efficiency of the subsampling, namely whether or not the (Poisson) number of terms is much smaller than m, the number of terms in the product. A neat trick in the current paper that extends the Fukui-Todo procedure is to switch to the original Metropolis-Hastings when the overall lower bound is too small, recovering the geometric ergodicity of the original algorithm if it holds (**Theorem 2.1**). Another neat remark is that when using the naïve factorisation as the product of the n individual likelihoods, the resulting algorithm is sort of doomed as n grows, even with an optimal scaling of the proposals. To achieve scalability, the authors introduce a Taylor (i.e., Gaussian) approximation to each local target in the product and start the acceptance decomposition by using the resulting overall Gaussian approximation. Meaning that the remaining product is now made of ratios of targets over their local Taylor approximations, hence most likely close to one. And potentially lower-bounded by the remainder term in the Taylor expansion. Leading to the conclusion that, when everything goes well, meaning that the Taylor expansions can be conducted and the bounds derived for the appropriate expansion, the order of the Poisson scale is O(1/√n)..! The proposal for the Metropolis-Hastings move is actually tuned to the Gaussian approximation, appearing as a variant of the Langevin move or more exactly a discretization of a Hamiltonian move. Obviously, I cannot judge the complexity of implementing this new scheme from just reading the paper, but this development on the split target is definitely an exciting prospect for handling huge datasets and their friends!
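To make the factorised acceptance step concrete, here is a schematic sketch (mine, not the authors' code, and leaving out both the thinning device and the Taylor control variates) of a random-walk Metropolis-Hastings on a toy Gaussian model, where each factor contributes its own Bernoulli draw and the move is abandoned at the first failed draw:

```python
# A schematic sketch (not the authors' implementation): factorised
# Metropolis-Hastings where acceptance is a product of per-factor
# Bernoulli draws, allowing early rejection. Toy N(theta,1) model with
# a flat prior; the naive factorisation degrades as n grows, as noted above.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(1.0, 1.0, size=10)          # keep n small on purpose

def log_factor(i, theta):
    """log pi_i(theta), the i-th likelihood factor."""
    return -0.5 * (data[i] - theta) ** 2

def factorised_accept(theta, prop):
    """One Bernoulli draw per factor, with quick rejection."""
    for i in range(len(data)):
        log_ratio = log_factor(i, prop) - log_factor(i, theta)
        # accept factor i with probability min(1, rho_i)
        if np.log(rng.random()) >= log_ratio:
            return False                       # reject at the first failure
    return True

def sampler(n_iter=10_000, step=0.3):
    theta, chain = 0.0, []
    for _ in range(n_iter):
        prop = theta + step * rng.normal()     # symmetric random-walk proposal
        if factorised_accept(theta, prop):
            theta = prop
        chain.append(theta)
    return np.array(chain)

chain = sampler()
# both should approach the flat-prior posterior mean of theta
print(chain[1000:].mean(), data.mean())
```

Since the product of the min(1,·) terms is at most the usual Metropolis-Hastings acceptance probability, this factorised kernel accepts less often, which is the "less efficient" trade-off mentioned above; the gain is that a rejection can be detected after evaluating only a handful of the m factors.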