Archive for high dimensions
Gabriel’s talk at Warwick on optimal transport
Posted in Statistics with tags Computational Optimal Transport, CRiSM, high dimensions, machine learning, optimal transport, seminar, University of Warwick on March 4, 2020 by xi'an

ISBA 2020 program
Posted in Kids, Statistics, Travel, University life with tags approximate Bayesian inference, Bayesian computing, Bayesian statistics, China, conference, coronavirus epidemics, high dimensions, ISBA 2020, Kunming, 2019-nCoV, program, variational Bayes methods, Yunnan on January 29, 2020 by xi'an

The scheduled program for ISBA 2020 is now online. And full of exciting sessions, many with a computational focus. With dear hopes that the 2019-nCoV epidemic will have abated by then (and not solely for the sake of the conference, most obviously!). While early registration ends on 15 April, the deadline for junior travel support ends this month. And so does the deadline for contributions.
A precursor of ABC-Gibbs
Posted in Books, R, Statistics with tags ABC, ABC-Gibbs, compatible conditional distributions, Genetics, Gibbs sampler, high dimensions, incoherent inference, incompatible conditionals, insufficiency, likelihood-free methods, sufficient statistics on June 7, 2019 by xi'an

Following our arXival of ABC-Gibbs, Dennis Prangle pointed out to us a 2016 paper by Athanasios Kousathanas, Christoph Leuenberger, Jonas Helfer, Mathieu Quinodoz, Matthieu Foll, and Daniel Wegmann, Likelihood-Free Inference in High-Dimensional Models, published in Genetics, Vol. 203, 893–904, in June 2016. This paper contains a version of ABC-Gibbs where parameters are sequentially simulated from conditionals that depend on the data only through small-dimension conditionally sufficient statistics. I had actually blogged about this paper in 2015 but had since completely forgotten about it. (The comments I made at the time still hold, already pertaining to the coherence, or lack thereof, of the sampler. I had also forgotten I had run an experiment with an exact Gibbs sampler with incoherent conditionals, which then seemed to converge to something, if not to the exact posterior.)
All ABC algorithms, including ABC-PaSS introduced here, require that statistics are sufficient for estimating the parameters of a given model. As mentioned above, parameter-wise sufficient statistics as required by ABC-PaSS are trivial to find for distributions of the exponential family. Since many population genetics models do not follow such distributions, sufficient statistics are known for the most simple models only. For more realistic models involving multiple populations or population size changes, only approximately-sufficient statistics can be found.
While Gibbs sampling is not mentioned in the paper, this is indeed a form of ABC-Gibbs, with the advantage of not facing convergence issues, thanks to the sufficiency. The drawback is that this setting is restricted to exponential families and hence difficult to extrapolate to non-exponential distributions, as using almost-sufficient (or not) summary statistics leads to incompatible conditionals and thus jeopardises the convergence of the sampler. When thinking a wee bit more about the case treated by Kousathanas et al., I am actually uncertain about the validation of the sampler. When the tolerance is equal to zero, this is not an issue, as it reproduces the regular Gibbs sampler. Otherwise, each conditional ABC step amounts to introducing an auxiliary variable represented by the simulated summary statistic. Since the distribution of this summary statistic depends, in general, on more than the parameter for which it is sufficient, it should also appear in the conditional distribution of the other parameters. At least from this Gibbs perspective, the sampler thus relies on incompatible conditionals, which makes the conditions proposed in our own paper all the more relevant.
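To make the alternating-conditionals idea concrete, here is a minimal sketch of an ABC-within-Gibbs sweep on a hypothetical toy model (a normal sample with unknown mean and scale), where each parameter is updated by matching only its own low-dimensional summary. All names (`abc_gibbs`, `abc_conditional`), priors, and tolerances are made up for illustration and are not taken from Kousathanas et al.:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: n observations from N(mu, sigma^2); the sample mean and the
# sample s.d. serve as the low-dimensional conditional summaries
n, mu_true, sigma_true = 50, 2.0, 1.5
y = rng.normal(mu_true, sigma_true, n)
s_mu, s_sigma = y.mean(), y.std()

def abc_conditional(simulate, summary, target, eps, max_tries=10_000):
    """One ABC conditional step: draw candidates until the simulated
    summary falls within eps of the observed one."""
    for _ in range(max_tries):
        cand = simulate()
        if abs(summary(cand) - target) < eps:
            return cand
    return None  # give up and keep the current value

def abc_gibbs(n_iter=50, eps=0.2):
    mu, sigma = 0.0, 1.0
    trace = np.empty((n_iter, 2))
    for t in range(n_iter):
        # update mu given sigma, matching only the sample mean
        new = abc_conditional(
            lambda: rng.normal(0.0, 5.0),                     # prior on mu
            lambda m: rng.normal(m, sigma, n).mean(), s_mu, eps)
        mu = mu if new is None else new
        # update sigma given mu, matching only the sample s.d.
        new = abc_conditional(
            lambda: rng.uniform(0.1, 5.0),                    # prior on sigma
            lambda s: rng.normal(mu, s, n).std(), s_sigma, eps)
        sigma = sigma if new is None else new
        trace[t] = mu, sigma
    return trace

trace = abc_gibbs()
```

With a positive tolerance, each accepted candidate carries the simulated summary as an implicit auxiliary variable, which is where the incompatibility discussed above creeps in.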
congrats, Prof Rousseau!
Posted in Statistics with tags Advanced Grant, asymptotic Bayesian methods, department of statistics, ERC, EU, European Union, high dimensions, Judith Rousseau, University of Oxford on April 4, 2019 by xi'an

distributed posteriors
Posted in Books, Statistics, Travel, University life with tags CDT, high dimensions, minimaxity, OxWaSP, parallel MCMC, scalable MCMC, statistical theory, University of Warwick on February 27, 2019 by xi'an

Another presentation by our OxWaSP students introduced me to the notion of distributed posteriors, following a 2018 paper by Botond Szabó and Harry van Zanten. Which corresponds to the construction of posteriors when conducting a divide & conquer strategy. The authors show that an adaptation of the prior to the division of the sample is necessary to recover the (minimax) convergence rate obtained in the non-distributed case. This is somewhat annoying, except that the adaptation amounts to taking the original prior to the power 1/m, where m is the number of divisions. They further show that when the regularity (parameter) of the model is unknown, the optimal rate cannot be recovered unless stronger assumptions are made on the non-zero parameters of the model.
“First of all, we show that depending on the communication budget, it might be advantageous to group local machines and let different groups work on different aspects of the high-dimensional object of interest. Secondly, we show that it is possible to have adaptation in communication restricted distributed settings, i.e. to have data-driven tuning that automatically achieves the correct bias-variance tradeoff.”
I find the paper of considerable interest for scalable MCMC methods, even though the setting may sound overly formal, because the study incorporates parallel computing constraints. (Although I did not investigate the more theoretical aspects of the paper.)
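The prior-rescaling recipe can be checked on a conjugate Gaussian toy example: raising a mean-zero Gaussian prior to the power 1/m simply multiplies its variance by m, and multiplying the m subposterior densities then recovers the full-data posterior exactly. This is a made-up illustration with hypothetical names, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(1)

# data from N(theta, sigma^2) with a conjugate N(0, tau^2) prior on theta
n, theta, sigma, tau = 120, 1.0, 2.0, 3.0
y = rng.normal(theta, sigma, n)

def posterior(data, prior_var):
    """Conjugate Gaussian update: posterior mean and variance of theta."""
    prec = 1.0 / prior_var + len(data) / sigma**2
    return (data.sum() / sigma**2) / prec, 1.0 / prec

full_mean, full_var = posterior(y, tau**2)       # posterior on all the data

# divide & conquer over m machines: raising a mean-zero Gaussian prior to
# the power 1/m amounts to multiplying its variance by m
m = 4
sub = [posterior(shard, m * tau**2) for shard in np.array_split(y, m)]

# recombine by multiplying the m Gaussian subposterior densities
precs = np.array([1.0 / v for _, v in sub])
comb_var = 1.0 / precs.sum()
comb_mean = comb_var * sum(p * mu for (mu, _), p in zip(sub, precs))
```

In this linear Gaussian setting the recombination is exact; the interest of the paper is of course in the nonparametric regimes where it is not.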
IMS workshop [day 3]
Posted in pictures, R, Statistics, Travel, University life with tags Bayesian computation, Birch, delayed simulation, high dimensions, hypocoercivity, IMS, Institute for Mathematical Sciences, Lapland, MCqMC 2018, National University Singapore, non-reversible diffusion, NUS, ODE, partly deterministic processes, probabilistic programming, Rao-Blackwellisation, Rennes, Singapore, Wang-Landau algorithm, workshop on August 30, 2018 by xi'an

I made the “capital” mistake of walking across the entire NUS campus this morning, which is quite green and pretty, but which almost enjoys an additional dimension brought by such intense humidity that one feels one has to push through it!, a feature I had managed to completely erase from my memory of my previous visit there. Anyway, nothing of any relevance. One talk in the morning was by Markus Eisenbach on tools used by physicists to speed up Monte Carlo methods, like the Wang-Landau flat histogram, towards computing the partition function, or the distribution of the energy levels, definitely addressing issues close to my interests, but somewhat beyond my reach, as it relies on a different language and emphasis, as often in physics talks I attend. An idea that came out clearly to me was to bypass a (flat) histogram target and aim directly at a constant-slope cdf for the energy levels. (But I got scared away by the Fourier transforms!)
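For readers unfamiliar with the flat-histogram idea, here is a minimal Wang-Landau sketch on a made-up toy system (binary spins with energy equal to the number of “up” spins, so the exact density of states is a binomial coefficient); this is a generic illustration of the scheme, not anything from Eisenbach's talk:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(2)

# toy system: L binary spins, energy E = number of "up" spins,
# so the exact density of states g(E) is the binomial coefficient C(L, E)
L = 8
x = rng.integers(0, 2, L)
E = int(x.sum())
log_g = np.zeros(L + 1)      # running estimate of ln g(E)
hist = np.zeros(L + 1)       # visit histogram, checked for flatness
f = 1.0                      # modification factor, halved when flat

for _ in range(100):         # bounded number of flattening phases
    if f < 5e-3:
        break
    for _ in range(2000):
        i = rng.integers(L)
        E_new = E + (1 - 2 * int(x[i]))          # effect of flipping spin i
        # accept with prob min(1, g(E)/g(E_new)), which flattens the histogram
        if np.log(rng.random()) < log_g[E] - log_g[E_new]:
            x[i] ^= 1
            E = E_new
        log_g[E] += f
        hist[E] += 1
    if hist.min() > 0.8 * hist.mean():
        f /= 2.0             # flat enough: refine the modification factor
        hist[:] = 0.0

truth = np.log([comb(L, k) for k in range(L + 1)])
err = np.abs((log_g - log_g[0]) - truth).max()
```

Up to the arbitrary additive constant, `log_g` approximates the log density of states, from which the partition function follows by summation.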
Lawrence Murray then discussed some features of the Birch probabilistic programming language he is currently developing, especially a fairly fascinating concept of delayed sampling, which connects with locally-optimal proposals and Rao-Blackwellisation. Which I plan to get back to later [and hopefully sooner rather than later!].
In the afternoon, Maria de Iorio gave a talk about the construction of nonparametric priors that create dependence between a sequence of functions, a notion I had not thought of before, with an array of possibilities when using the stick-breaking construction of Dirichlet processes.
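The stick-breaking construction mentioned above can be sketched in a few lines; this is only the basic (truncated) Dirichlet process representation, not the dependent-functions construction of the talk, and the function name is made up:

```python
import numpy as np

rng = np.random.default_rng(3)

def stick_breaking(alpha, base_draw, k=1000):
    """Truncated stick-breaking representation of DP(alpha, G0): weights
    w_j = v_j * prod_{l<j} (1 - v_l) with v_j ~ Beta(1, alpha), and atoms
    drawn i.i.d. from the base measure G0."""
    v = rng.beta(1.0, alpha, k)
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    atoms = base_draw(k)
    return w, atoms

# example: one (truncated) draw from a DP with standard normal base measure
w, atoms = stick_breaking(alpha=2.0, base_draw=lambda k: rng.normal(size=k))
```

Dependence between a sequence of such random measures can then be induced by sharing or perturbing the sticks or the atoms across the sequence.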
And Christophe Andrieu gave a very smooth and helpful entry to partly deterministic Markov processes (PDMPs), in preparation for the talks he is giving next week for the continuation of the workshop at IMS. Starting with the guided random walk of Gustafson (1998), which extended a bit later into the non-reversible paper of Diaconis, Holmes, and Neal (2000). Although I had a vague idea of the contents of these papers, the role of the velocity ν became much clearer. And premonitory of the advances made by the more recent PDMP proposals. There is obviously a continuity with the equally pedagogical talk Christophe gave at MCqMC in Rennes two months [and half the globe] ago, but the focus being somewhat different, it really felt like a new talk [my short-term memory may also play some role in this feeling!, as I now remember the discussion of Hildebrand (2002) for non-reversible processes]. An introduction to the topic I would recommend to anyone interested in this new branch of Monte Carlo simulation! To be followed by the most recently arXived hypocoercivity paper by Christophe and coauthors.
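The role of the velocity ν is already visible in Gustafson's guided walk, which can be sketched as follows on a standard normal target; a minimal illustration with made-up names, not Christophe's slides:

```python
import numpy as np

rng = np.random.default_rng(4)

def guided_walk(logpi, x0, step=0.5, n_iter=50_000):
    """Gustafson (1998) guided random walk: a velocity nu in {-1, +1} keeps
    proposing in the same direction and is flipped only upon rejection,
    producing a non-reversible chain that still targets pi."""
    x, nu = x0, 1.0
    out = np.empty(n_iter)
    for t in range(n_iter):
        y = x + nu * step * rng.random()         # directed uniform proposal
        if np.log(rng.random()) < logpi(y) - logpi(x):
            x = y                                # accept and keep direction
        else:
            nu = -nu                             # reject and reverse direction
        out[t] = x
    return out

# toy target: standard normal
chain = guided_walk(lambda z: -0.5 * z * z, 0.0)
```

The persistent direction suppresses the diffusive back-and-forth of a reversible random walk, the same mechanism that PDMPs push to the continuous-time limit.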
ABC-Day [arXivals]
Posted in Books, Statistics, University life with tags ABC, Approximate Bayesian computation, arXiv, Handbook of Approximate Bayesian computation, high dimensions, likelihood-free methods, Scott Sisson on March 2, 2018 by xi'an

A bunch of ABC papers on arXiv yesterday, most of them linked to the incoming Handbook of ABC:
- Overview of Approximate Bayesian Computation, by S. A. Sisson, Y. Fan, and M. A. Beaumont
- Kernel Recursive ABC: Point Estimation with Intractable Likelihood, by Takafumi Kajihara, Keisuke Yamazaki, Motonobu Kanagawa, and Kenji Fukumizu
- High-dimensional ABC, by D. J. Nott, V. M.-H. Ong, Y. Fan, and S. A. Sisson
- ABC Samplers, by Y. Fan and S. A. Sisson