**T**his week, the CEREMADE coffee room puzzle was about finding a joint distribution for (X,Y) such that (marginally) X and Y are both U(0,1), while X+Y is U(½,1+½). Beyond the peculiarity of the question, there is a larger scale problem, as to how many (if any) compatible marginals h¹(X,Y), h²(X,Y), h³(X,Y), …, one needs to impose on the distribution in order to reconstruct the joint. And I wonder whether any Gibbs-like scheme is available to simulate the joint.
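One (degenerate) solution to the puzzle can be checked by simulation. The sketch below is not taken from the post: it assumes the singular coupling Y = (X + ½) mod 1, for which X + Y equals 2X + ½ when X < ½ and 2X − ½ otherwise, both pieces being uniform on (½, 1+½). The helper names `sample_pair` and `kolmogorov_gap` are made up for the illustration.

```python
import random

def sample_pair(rng):
    """Draw (X, Y) with X ~ U(0,1) and Y = (X + 1/2) mod 1 (assumed coupling)."""
    x = rng.random()
    y = (x + 0.5) % 1.0
    return x, y

def kolmogorov_gap(values, lo, hi):
    """Largest gap between the empirical CDF of `values` and the U(lo, hi) CDF."""
    xs = sorted(values)
    n = len(xs)
    gap = 0.0
    for i, v in enumerate(xs):
        f = min(max((v - lo) / (hi - lo), 0.0), 1.0)  # uniform CDF at v
        gap = max(gap, abs(f - i / n), abs(f - (i + 1) / n))
    return gap

rng = random.Random(2020)
pairs = [sample_pair(rng) for _ in range(20000)]

gap_x = kolmogorov_gap([x for x, _ in pairs], 0.0, 1.0)
gap_y = kolmogorov_gap([y for _, y in pairs], 0.0, 1.0)
gap_s = kolmogorov_gap([x + y for x, y in pairs], 0.5, 1.5)
print(gap_x, gap_y, gap_s)  # all three should be small if the coupling works
```

This only exhibits one joint with the required marginals; it says nothing about how many further marginals would pin the joint down, which is the larger question raised above.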

## Archive for compatible conditional distributions

## too many marginals

Posted in Kids, Statistics with tags blackboard, CEREMADE, compatible conditional distributions, Gibbs sampling, joint distribution, marginal density, mathematical puzzle, Université Paris Dauphine on February 3, 2020 by xi'an

## A precursor of ABC-Gibbs

Posted in Books, R, Statistics with tags ABC, ABC-Gibbs, compatible conditional distributions, Genetics, Gibbs sampler, high dimensions, incoherent inference, incompatible conditionals, insufficiency, likelihood-free methods, sufficient statistics on June 7, 2019 by xi'an

**F**ollowing our arXival of ABC-Gibbs, Dennis Prangle pointed out to us a 2016 paper by Athanasios Kousathanas, Christoph Leuenberger, Jonas Helfer, Mathieu Quinodoz, Matthieu Foll, and Daniel Wegmann, Likelihood-Free Inference in High-Dimensional Models, published in Genetics, Vol. 203, 893–904 in June 2016. This paper contains a version of ABC-Gibbs where parameters are sequentially simulated from conditionals that depend on the data only through low-dimensional, conditionally sufficient statistics. I had actually blogged about this paper in 2015 but had since then completely forgotten about it. (The comments I had made at the time still hold, already pertaining to the coherence or lack thereof of the sampler. I had also forgotten I had run an experiment of an exact Gibbs sampler with incoherent conditionals, which then seemed to converge to something, if not the exact posterior.)

“All ABC algorithms, including ABC-PaSS introduced here, require that statistics are sufficient for estimating the parameters of a given model. As mentioned above, parameter-wise sufficient statistics as required by ABC-PaSS are trivial to find for distributions of the exponential family. Since many population genetics models do not follow such distributions, sufficient statistics are known for the most simple models only. For more realistic models involving multiple populations or population size changes, only approximately-sufficient statistics can be found.”

While Gibbs sampling is not mentioned in the paper, this is indeed a form of ABC-Gibbs, with the advantage of not facing convergence issues thanks to the sufficiency. The drawback is that this setting is restricted to exponential families and hence difficult to extrapolate to non-exponential distributions, as using almost-sufficient (or not) summary statistics leads to incompatible conditionals and thus jeopardises the convergence of the sampler. When thinking a wee bit more about the case treated by Kousathanas et al., I am actually uncertain about the validation of the sampler. When the tolerance is equal to zero, this is not an issue as it reproduces the regular Gibbs sampler. Otherwise, each conditional ABC step amounts to introducing an auxiliary variable represented by the simulated summary statistic. Since the distribution of this summary statistic depends, in general, on more than the parameter for which it is sufficient, it should also appear in the conditional distributions of the other parameters. At least from this Gibbs perspective, the sampler thus relies on incompatible conditionals, which makes the conditions proposed in our own paper all the more relevant.
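To make the auxiliary variable argument concrete, here is one way of writing the ABC conditional targeted by the step for component θ_j. The notation is mine and not taken from the paper: s_j is the simulated summary statistic, f_j its sampling density, d a distance, and ε_j the tolerance attached to that component.

$$
\pi_{\varepsilon_j}\!\left(\theta_j \mid \theta_{-j}, s_j^{\text{obs}}\right) \;\propto\; \pi(\theta_j \mid \theta_{-j}) \int f_j(s_j \mid \theta_j, \theta_{-j})\, \mathbb{I}\!\left\{ d\!\left(s_j, s_j^{\text{obs}}\right) \le \varepsilon_j \right\} \mathrm{d}s_j
$$

With ε_j = 0 and s_j sufficient for θ_j given θ_{-j}, this is the exact full conditional. With ε_j > 0, the integrand involves f_j(s_j | θ_j, θ_{-j}), which depends on θ_{-j}, so a coherent Gibbs sampler on the augmented vector (θ, s_1, s_2, …) would need s_j to enter the conditionals of the other components as well.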

## ABC with Gibbs steps

Posted in Statistics with tags ABC, ABC-Gibbs, Approximate Bayesian computation, Bayesian inference, bois de Boulogne, compatible conditional distributions, contraction, convergence, ergodicity, France, Gibbs sampler, hierarchical Bayesian modelling, incompatible conditionals, La Défense, Paris, stationarity, tolerance, Université Paris Dauphine on June 3, 2019 by xi'an

**W**ith Grégoire Clarté, Robin Ryder and Julien Stoehr, all from Paris-Dauphine, we have just arXived a paper on the specifics of ABC-Gibbs, which is a version of ABC where the generic ABC accept-reject step is replaced by a sequence of n conditional ABC accept-reject steps, each aiming at an ABC version of a conditional distribution extracted from the joint and intractable target. Hence an ABC version of the standard Gibbs sampler. What makes it so special is that each conditional can (and should) be conditioning on a different statistic in order to decrease the dimension of this statistic, ideally down to the dimension of the corresponding component of the parameter. This successfully bypasses the curse of dimensionality but immediately meets with two difficulties. The first one is that the resulting sequence of conditionals is not coherent, since it is not a Gibbs sampler on the ABC target. The conditionals are thus incompatible and therefore convergence of the associated Markov chain becomes an issue. We produce sufficient conditions for the Gibbs sampler to converge to a stationary distribution using incompatible conditionals. The second problem is then that, provided it exists, the limiting and also intractable distribution does not enjoy a Bayesian interpretation, hence may fail to be justified from an inferential viewpoint. We however succeed in producing a version of ABC-Gibbs in a hierarchical model where the limiting distribution can be made explicit and, even better, can be weighted towards recovering the original target. (At least with limiting zero tolerance.)
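The sequence of conditional ABC accept-reject steps described above can be sketched on a toy problem. Everything specific here is an assumption made for the illustration and not the code of the paper: a hierarchical Normal model with α ~ U(−4,4), μ_j | α ~ N(α,1), and n observations per group from N(μ_j,1); the statistic for α is the mean of the μ_j's, the statistic for μ_j is the mean of group j, and the function name `abc_gibbs` and both tolerances are arbitrary.

```python
import random

def abc_gibbs(obs_means, n_per_group, n_iter, eps_alpha, eps_mu, seed=0):
    """Toy ABC-Gibbs: each component has its own low-dimensional statistic."""
    rng = random.Random(seed)
    J = len(obs_means)
    mu = list(obs_means)            # start the mu_j's at the observed group means
    alpha = sum(mu) / J
    alpha_chain, mu_chain = [], []
    for _ in range(n_iter):
        # Conditional ABC step for alpha given mu: statistic = mean of the mu_j's
        target = sum(mu) / J
        while True:
            cand = rng.uniform(-4.0, 4.0)                    # draw from the prior
            pseudo = sum(rng.gauss(cand, 1.0) for _ in range(J)) / J
            if abs(pseudo - target) < eps_alpha:             # accept-reject
                alpha = cand
                break
        # Conditional ABC step for each mu_j given alpha: statistic = group mean
        for j in range(J):
            while True:
                cand = rng.gauss(alpha, 1.0)                 # draw from mu_j | alpha
                # the mean of n N(cand, 1) draws is N(cand, 1/n): simulate it directly
                pseudo = rng.gauss(cand, (1.0 / n_per_group) ** 0.5)
                if abs(pseudo - obs_means[j]) < eps_mu:
                    mu[j] = cand
                    break
        alpha_chain.append(alpha)
        mu_chain.append(list(mu))
    return alpha_chain, mu_chain

obs_means = [-0.5, 0.2, 0.9, 1.4, 0.5]    # made-up observed group means
alpha_chain, mu_chain = abc_gibbs(obs_means, n_per_group=10, n_iter=300,
                                  eps_alpha=0.3, eps_mu=0.3, seed=2019)
print(sum(alpha_chain) / len(alpha_chain))
```

Each step only compares a one-dimensional statistic, which is the point of the method, but the chain is not a Gibbs sampler on any ABC joint target when the tolerances are positive, so its limiting distribution (when it exists) has to be studied separately.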

## “more Bayesian” GANs

Posted in Books, Statistics with tags Bayesian GANs, compatible conditional distributions, GANs, MCMC convergence, pseudo-likelihood on December 21, 2018 by xi'an

**O**n X validated, I got pointed to this recent paper by He, Wang, Lee and Tian, that proposes a new form of Bayesian GAN. Although I do not see it as really Bayesian, as explained below.

“[The] existing Bayesian method (Saatchi & Wilson, 2017) may lead to incompatible conditionals, which suggest that the underlying joint distribution actually does not exist.”

## Bayesian GANs [#2]

Posted in Books, pictures, R, Statistics with tags ABC in Edinburgh, Bayesian GANs, compatible conditional distributions, Edinburgh, GANs, generative adversarial networks, ISBA 2018, joint posterior, MCMC convergence, Metropolis-within-Gibbs algorithm, Monte Carlo Statistical Methods, normal model, University of Edinburgh on June 27, 2018 by xi'an

**A**s an illustration of the lack of convergence of the Gibbs sampler applied to the two “conditionals” defined in the Bayesian GANs paper discussed yesterday, I took the simplest possible example of a Normal mean generative model (one parameter) with a logistic discriminator (one parameter) and implemented the scheme (during an ISBA 2018 session). With flat priors on both parameters. And a Normal random walk as Metropolis-Hastings proposal. As expected, since there is no stationary distribution associated with the Markov chain, simulated chains do not exhibit a stationary pattern,

And they eventually reach an overflow error or a trapping state as the log-likelihood gets approximately to zero (red curve).
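For readers wanting to see what a Gibbs chain without a stationary distribution looks like, here is a minimal sketch on a textbook example, which is not the Bayesian GAN experiment above: the pair of conditionals X | Y=y ~ Exp(y) and Y | X=x ~ Exp(x). Both are proportional to exp(−xy), which is not integrable over the positive quadrant, so there is no proper joint behind them. Since X′ = X·E₁/E₂ with E₁, E₂ independent Exp(1) variates, log X performs a symmetric random walk. The function name `improper_gibbs` is made up for the illustration.

```python
import math
import random

def improper_gibbs(n_iter, seed=1):
    """Alternate X | Y ~ Exp(rate Y) and Y | X ~ Exp(rate X); return log X trace."""
    rng = random.Random(seed)
    x, y = 1.0, 1.0
    log_x = []
    for _ in range(n_iter):
        x = rng.expovariate(y)   # X | Y = y  ~ Exp(rate y)
        y = rng.expovariate(x)   # Y | X = x  ~ Exp(rate x)
        log_x.append(math.log(x))
    return log_x

trace = improper_gibbs(2000)
spread = max(trace) - min(trace)
print(spread)  # the range of log X keeps growing with the number of iterations
```

Plotting `trace` against the iteration index shows the same qualitative behaviour as described above: long excursions and no stabilisation, whatever the starting point.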

Too bad I missed the talk by Shakir Mohamed yesterday, being stuck on the Edinburgh by-pass at rush hour!, as I would have loved to hear his views about this rather essential issue…

## likelihood-free inference in high-dimensional models

Posted in Books, R, Statistics, University life with tags ABC, ABC-Gibbs, compatible conditional distributions, convergence of Gibbs samplers, curse of dimensionality, exact ABC, Gibbs sampling, median, median absolute deviation, R on September 1, 2015 by xi'an

“…for a general linear model (GLM), a single linear function is a sufficient statistic for each associated parameter…”

The recently arXived paper “Likelihood-free inference in high-dimensional models”, by Kousathanas et al. (July 2015), proposes an ABC resolution of the dimensionality curse [when the dimensions of the parameter and of the corresponding summary statistics are large] by turning Gibbs-like and by using a component-by-component ABC-MCMC update that allows for low dimensional statistics. In the (rare) event there exists a conditional sufficient statistic for each component of the parameter vector, the approach is just as justified as when using a generic ABC-Gibbs method based on the whole data. Otherwise, that is, when using a non-sufficient estimator of the corresponding component (as, e.g., in a generalised [not general!] linear model), the approach is less coherent as there is no joint target associated with the Gibbs moves. One may therefore wonder at the convergence properties of the resulting algorithm. The only safe case [in dimension 2] is when one of the restricted conditionals does not depend on the other parameter. Note also that each Gibbs step a priori requires the simulation of a new pseudo-dataset, which may be a major imposition on computing time. And that setting the tolerance for each parameter is a delicate calibration issue because in principle the tolerance should depend on the other component values.