Archive for David Draper

asynchronous distributed Gibbs sampling

Posted in Books, Statistics, Travel, University life with tags , , , , , , , on October 13, 2015 by xi'an

Alexander Terenin, Dan Simpson, and David Draper just arXived a paper on an alternative brand of Gibbs sampler, which they think can revolutionise the sampler and overcome its well-known bottlenecks. David had sent me the paper in advance and thus I had time to read it in the plane to Calgary. (This is also the very first paper I see acknowledging a pair of trousers..! With no connection whatsoever with bottlenecks!)

“Note that not all updates that are sent will be received by all other workers: because of network traffic congestion and other types of failures, a significant portion of the updates will be lost along the way.”

The approach is inherently parallel in that several “workers” (processors or graphical units) run Gibbs samplers in parallel, using their current knowledge of the system. This means they update a component of the model parameter, based on the information they have last received, and then send back this new value to the system. For physical reasons, there is not instantaneity in this transmission and thus all workers do not condition on the same “current” value, necessarily. Perceiving this algorithm as a “garden of forking paths” where each full conditional uses values picked at random from a collection of subchains (one for each worker), I can see why the algorithm should remain valid.

“Thus, the quality of this [ABC] method rises and falls with the ingenuity of the user in identifying nearly-sufficient statistics.”

It is also clear that this approach allows for any degree of parallelisation. However, it is less clear to me why this should constitute an improvement. With respect to the bottlenecks mentioned at the beginning of the paper, I do not truly see how the large data problem is bypassed. Except in cases where conditionals only depend on small parts of the data. Or why large dimensions can be more easily managed when compared with a plain Gibbs sampler or, better, parallel plain Gibbs samplers that would run on the same number of processors. (I do not think the paper runs the comparison in that manner, using instead a one-processor Gibbs sampler as its benchmark. Or less processors in the third example.) Since the forking paths repeatedly merge back at aperiodic stages, there is no multiplication or clear increase of the exploratory abilities of the sampler. Except for having competing proposed values [or even proposals] selected randomly. So maybe reaching a wee bit farther from time to time.

Conditional love [guest post]

Posted in Books, Kids, Statistics, University life with tags , , , , , , , , , , , , , , , , , , , , on August 4, 2015 by xi'an

[When Dan Simpson told me he was reading Terenin’s and Draper’s latest arXival in a nice Bath pub—and not a nice bath tub!—, I asked him for a blog entry and he agreed. Here is his piece, read at your own risk! If you remember to skip the part about Céline Dion, you should enjoy it very much!!!]

Probability has traditionally been described, as per Kolmogorov and his ardent follower Katy Perry, unconditionally. This is, of course, excellent for those of us who really like measure theory, as the maths is identical. Unfortunately mathematical convenience is not necessarily enough and a large part of the applied statistical community is working with Bayesian methods. These are unavoidably conditional and, as such, it is natural to ask if there is a fundamentally conditional basis for probability.

Bruno de Finetti—and later Richard Cox and Edwin Jaynes—considered conditional bases for Bayesian probability that are, unfortunately, incomplete. The critical problem is that they mainly consider finite state spaces and construct finitely additive systems of conditional probability. For a variety of reasons, neither of these restrictions hold much truck in the modern world of statistics.

In a recently arXiv’d paper, Alexander Terenin and David Draper devise a set of axioms that make the Cox-Jaynes system of conditional probability rigorous. Furthermore, they show that the complete set of Kolmogorov axioms (including countable additivity) can be derived as theorems from their axioms by conditioning on the entire sample space.

This is a deep and fundamental paper, which unfortunately means that I most probably do not grasp it’s complexities (especially as, for some reason, I keep reading it in pubs!). However I’m going to have a shot at having some thoughts on it, because I feel like it’s the sort of paper one should have thoughts on. Continue reading