Archive for the Statistics Category

minibatch acceptance for Metropolis-Hastings

Posted in Books, Statistics with tags , , , , , on January 12, 2018 by xi'an

An arXival that appeared last July by Seita, Pan, Chen, and Canny, and that relates to my current interest in speeding up MCMC. And to 2014 papers by  Korattikara et al., and Bardenet et al. Published in Uncertainty in AI by now. The authors claim that their method requires less data per iteration than earlier ones…

“Our test is applicable when the variance (over data samples) of the log probability ratio between the proposal and the current state is less than one.”

By test, the authors mean a mini-batch formulation of the Metropolis-Hastings acceptance ratio in the (special) setting of iid data. First they use Barker’s version of the acceptance probability instead of Metropolis’. Second, they use a Gaussian approximation to the distribution of the logarithm of the Metropolis ratio for the minibatch, while the Barker acceptance step corresponds to comparing a logistic perturbation of the logarithm of the Metropolis ratio against zero. Which amounts to compare the logarithm of the Metropolis ratio for the minibatch, perturbed by a logistic minus Normal variate. (The cancellation of the Normal in eqn (13) is a form of fiducial fallacy, where the Normal variate has two different meanings. In other words, the difference of two Normal variates is not equal to zero.) However, the next step escapes me as the authors seek to optimise the distribution of this logistic minus Normal variate. Which I thought was uniquely defined as such a difference. Another constraint is that the estimated variance of the log-likelihood ratio gets below one. (Why one?) The argument is that the average of the individual log-likelihoods is approximately Normal by virtue of the Central Limit Theorem. Even when randomised. While the illustrations on a Gaussian mixture and on a logistic regression demonstrate huge gains in computational time, it is unclear to me to which amount one can trust the approximation for a given model and sample size…

the curse of large dimension [teaser]

Posted in Books, pictures, Statistics, University life with tags , , , , , , on January 11, 2018 by xi'an

the shockwave rider [book review]

Posted in Statistics with tags , , , , , , on January 11, 2018 by xi'an

I ordered this book from John Brunner when I found this was the precursor to Neuromancer and the subsequent cyberpunk literature. And after reading it during the Xmas break I am surprised it is not more well-known. Indeed, the plot, the style, the dystopian society in The Shockwave Rider all are highly original, and more “intellectual” than successors like Neuromancer or Snow Crash. Reading this 1975 book forty years later also reveals its premonitory features, from inventing the concept of computer worm (along with a pretty accurate description), to forecasting (or being aware of plans for) cell-phones, the Net, the move to electric cars, and Wikipedia, with the consequence of being always visible for whoever controls the network. The characters are flawed in that they are too charicaturesque, but this is somewhat secondary since the main appeal of the book is to discuss the features of an all-connected world. And the way to recover power to the people against a government controlling the network and the associated data. The time being the 1970’s the resolution via a hippie commune in Northern California (like Eureka!) is a bit outdated and definitely “rosy”, and does not foresee the issue of “digital democracy” being threatened by a strong polarisation into estranged communities, but I still enjoyed the book tremendously. (As a bonus, I got the first edition of the book at a ridiculous price! With this somewhat outdated cover.)

on confidence distributions

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , on January 10, 2018 by xi'an

As Regina Liu gave her talk at ISI this morning on fusion learning and confidence distributions, this led me to think anew about this strange notion of confidence distributions, building a distribution on the parameter space without a prior to go with it, implicitly or explicitly, and vaguely differing from fiducial inference. (As an aside, the Wikipedia page on confidence distributions is rather heavily supporting the concept and was primarily written by someone from Rutgers, where the modern version was developed. [And as an aside inside the aside, Schweder and Hjort’s book is sitting in my office, waiting for me!])

Recall that a confidence distribution is a sample dependent distribution on the parameter space, which is uniform U(0,1) [in the sample] at the “true” value of the parameter. Used thereafter as a posterior distribution. (Again, almost always without a prior to go with it. Which is an incoherence from a probabilistic perspective. not mentioning the issue of operating without a pre-defined dominating measure. This measure issue is truly bothering me!) This seems to include fiducial distributions based on a pivot, unless I am confused. As noted in the review by Nadarajah et al. Moreover, the concept of creating a pseudo-posterior out of an existing (frequentist) confidence interval procedure to create a new (frequentist) procedure does not carry an additional validation per se, as it clearly depends on the choice of the initialising procedure. (Not even mentioning the lack of invariance and the intricacy of multidimensional extensions.)

ABC forecasts

Posted in Statistics, Books, pictures with tags , , , , , , , , on January 9, 2018 by xi'an

My friends and co-authors David Frazier, Gael Martin, Brendan McCabe, and Worapree Maneesoonthorn arXived a paper on ABC forecasting at the turn of the year. ABC prediction is a natural extension of ABC inference in that, provided the full conditional of a future observation given past data and parameters is available but the posterior is not, ABC simulations of the parameters induce an approximation of the predictive. The paper thus considers the impact of this extension on the precision of the predictions. And argues that it is possible that this approximation is preferable to running MCMC in some settings. A first interesting result is that using ABC and hence conditioning on an insufficient summary statistic has no asymptotic impact on the resulting prediction, provided Bayesian concentration of the corresponding posterior takes place as in our convergence paper under revision.

“…conditioning inference about θ on η(y) rather than y makes no difference to the probabilistic statements made about [future observations]”

The above result holds both in terms of convergence in total variation and for proper scoring rules. Even though there is always a loss in accuracy in using ABC. Now, one may think this is a direct consequence of our (and others) earlier convergence results, but numerical experiments on standard time series show the distinct feature that, while the [MCMC] posterior and ABC posterior distributions on the parameters clearly differ, the predictives are more or less identical! With a potential speed gain in using ABC, although comparing parallel ABC versus non-parallel MCMC is rather delicate. For instance, a preliminary parallel ABC could be run as a burnin’ step for parallel MCMC, since all chains would then be roughly in the stationary regime. Another interesting outcome of these experiments is a case when the summary statistics produces a non-consistent ABC posterior, but still leads to a very similar predictive, as shown on this graph.This unexpected accuracy in prediction may further be exploited in state space models, towards producing particle algorithms that are greatly accelerated. Of course, an easy objection to this acceleration is that the impact of the approximation is unknown and un-assessed. However, such an acceleration leaves room for multiple implementations, possibly with different sets of summaries, to check for consistency over replicates.

the definitive summer school poster

Posted in pictures, Statistics, University life with tags , , , , , , , , , , , , on January 8, 2018 by xi'an

morning devotions at Dakshineswar [jatp]

Posted in Statistics with tags , , , , , , , , on January 5, 2018 by xi'an