Archive for Wright-Fisher model

Barker at the Bernoulli factory

Posted in Books, Statistics on October 5, 2017 by xi'an

Yesterday, Flavio Gonçalves, Krzysztof Łatuszyński, and Gareth Roberts (Warwick) arXived a paper on Barker’s algorithm for Bayesian inference with intractable likelihoods.

“…roughly speaking Barker’s method is at worst half as good as Metropolis-Hastings.”

Barker’s acceptance probability (1965) is a smooth, if less efficient, version of Metropolis-Hastings. (Barker wrote his thesis in Adelaide, in the Mathematical Physics department. Most likely, he never interacted with Ronald Fisher, who died there in 1962.) This smoothness is exploited by devising a Bernoulli factory consisting of a 2-coin algorithm that manages to simulate the Bernoulli variable associated with the Barker probability, from a coin that can simulate Bernoullis with probabilities proportional to [bounded] π(θ). For instance, using a bounded unbiased estimator of the target. And another coin that simulates another Bernoulli on a remainder term. Assuming the bound on the estimate of π(θ) is known [or part of the remainder term]. This is a neat result in that it expands the range of pseudo-marginal methods (and resuscitates Barker’s formula from oblivion!). The paper includes an illustration in the case of the far-from-toyish Wright-Fisher diffusion. [Making Fisher and Barker meet, in the end!]
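To make the two-coin idea concrete, here is a minimal Python sketch, written under my own assumptions rather than taken from the paper. It assumes a symmetric proposal, so that Barker's probability is π(y)/(π(x)+π(y)), and it assumes each target value factorises as π = c·p, with c a known bound and p the unknown success probability of a coin we can flip. All names (`barker_accept`, `mh_accept`, `two_coin`, `make_coin`, `demo`) and the numerical values are hypothetical.

```python
import random


def barker_accept(pi_x, pi_y):
    """Barker's acceptance probability for a symmetric proposal (assumed)."""
    return pi_y / (pi_x + pi_y)


def mh_accept(pi_x, pi_y):
    """Metropolis-Hastings acceptance probability for a symmetric proposal."""
    return min(1.0, pi_y / pi_x)


def two_coin(coin_1, coin_2, c1, c2, rng):
    """Return 1 with probability c1*p1 / (c1*p1 + c2*p2).

    coin_1 and coin_2 are callables returning True with unknown
    probabilities p1 and p2; only the constants c1 and c2 are known.
    """
    while True:
        # Pick which coin to flip, using only the known constants.
        if rng.random() < c1 / (c1 + c2):
            if coin_1(rng):
                return 1  # success of coin 1: accept
        else:
            if coin_2(rng):
                return 0  # success of coin 2: reject
        # Otherwise the chosen coin failed: start again.


def make_coin(p):
    """Toy coin with success probability p (in practice p is unknown)."""
    return lambda rng: rng.random() < p


def demo(n=20000, seed=1):
    """Empirical frequency of acceptance for a toy pair of coins."""
    rng = random.Random(seed)
    # pi(y) = c1 * p1 = 2 * 0.3 and pi(x) = c2 * p2 = 1 * 0.6,
    # so the Barker probability is 0.6 / (0.6 + 0.6) = 0.5.
    coin_y, coin_x = make_coin(0.3), make_coin(0.6)
    hits = sum(two_coin(coin_y, coin_x, 2.0, 1.0, rng) for _ in range(n))
    return hits / n


if __name__ == "__main__":
    print(demo())  # should be close to 0.5
```

Each pass through the loop stops at 1 with probability c1·p1/(c1+c2) and at 0 with probability c2·p2/(c1+c2), so the output is 1 with probability c1·p1/(c1·p1+c2·p2), which is the Barker probability. Also, since r/(1+r) ≥ min(1,r)/2 for any ratio r > 0, `barker_accept` is never less than half of `mh_accept`, which is one way to read the quote above.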

advanced computational methods for complex models in Biology [talk]

Posted in Books, pictures, Statistics, Travel, University life on September 29, 2016 by xi'an

St Pancras. London, Jan. 26, 2012

Here are the slides of the presentation I gave at the EPSRC Advanced Computational methods for complex models in Biology at University College London, last week. Introducing random forests as proper summaries for both model choice and parameter estimation (with considerable overlap with earlier slides, obviously!). The other talks of that highly interesting day on computational Biology were mostly about ancestral graphs, using Wright-Fisher diffusions for coalescents, plus a comparison of expectation-propagation and ABC on a genealogy model by Mark Beaumont and the decision theoretic approach to HMM order estimation by Chris Holmes. In addition, it gave me the opportunity to come back to the Department of Statistics at UCL more than twenty years after my previous visit, at a time when my friend Costas Goutis was still there. And to realise it had moved from its historical premises years ago. (I wonder what happened to the two staircases built to reduce frictions between Fisher and Pearson if I remember correctly…)

ABC à l’X [back]

Posted in Statistics, University life on February 9, 2011 by xi'an

“It is difficult to ensure that such sophisticated battles against the ‘ε-dilemma’ that arise in the simulation based inferential approaches of ABC and ALC do not confound the true posterior.” Sainudiin et al., 2010

It was a very interesting talk that took place at Polytechnique yesterday afternoon. (Although one could argue that the title was misleading in that ABC was never truly used, except as a scarecrow!) Indeed, Raazesh Sainudiin gave a lively one-hour lecture where he showed that, for the standard Wright-Fisher coalescent model, the exact likelihood of the site-frequency-spectrum statistic (SFS) can be computed. It would have required much more time to cover the dense material contained in the paper he coauthored on this topic (and to appear in the Bulletin of Mathematical Biology, Algebraic Biology Special Edition), but his message was quite clear, namely that a graph analysis of the distribution of the SFS statistic permits a closed-form representation of the likelihood. And similarly for some linear combinations of the SFS that allow for [great!] Kemeny and Snell’s (1968) lumpability criterion to apply, namely for an impoverished Markov chain defined on an aggregated state space to remain Markov. I do not see the direct consequence on the quality of ABC (except when conducting Monte Carlo experiments to compare both outputs) and I have not thought long enough to spot the impact on our current research, but I very much appreciated the intuition given in the talk. The idea behind the method seems to be that improving the support of the importance sampling distribution, in order to remove parameter values that could not lead to the real data (SFS), provides a clear efficiency improvement.
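As an aside on the lumpability criterion mentioned above, here is a small Python sketch of the usual (strong) lumpability check: the aggregated chain stays Markov when, for every pair of blocks, all states in the first block put the same total probability on the second. The function name `is_lumpable`, the transition matrix, and the partitions are all made up for illustration and are not taken from the talk or the paper.

```python
def is_lumpable(P, partition, tol=1e-12):
    """Check strong lumpability of transition matrix P for a partition.

    P is a list of rows (each summing to one); partition is a list of
    blocks, each block being a list of state indices.
    """
    for block_from in partition:
        for block_to in partition:
            # Probability of jumping into block_to from each state of block_from.
            masses = [sum(P[i][j] for j in block_to) for i in block_from]
            if max(masses) - min(masses) > tol:
                return False  # two states of the same block disagree
    return True


# Toy three-state chain (hypothetical numbers).
P = [
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.2, 0.3, 0.5],
]

print(is_lumpable(P, [[0], [1, 2]]))  # states 1 and 2 can be merged
print(is_lumpable(P, [[0, 1], [2]]))  # states 0 and 1 cannot
```

In the toy chain, states 1 and 2 both move to state 0 with probability 0.2, so merging them keeps the Markov property, while states 0 and 1 put different mass on state 2 (0.25 versus 0.4) and hence cannot be aggregated.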