Archive for Gibbs sampling

too many marginals

Posted in Kids, Statistics with tags , , , , , , , on February 3, 2020 by xi'an

This week, the CEREMADE coffee room puzzle was about finding a joint distribution for (X,Y) such that (marginally) X and Y are both U(0,1), while X+Y is U(½,1+½). Beyond the peculiarity of the question, there is a larger scale problem, as to how many (if any) compatible marginals h¹(X,Y), h²(X,Y), h³(X,Y), …, need one constrains the distribution to reconstruct the joint. And wondering if any Gibbs-like scheme is available to simulate the joint.

MCMC, with common misunderstandings

Posted in Books, pictures, R, Statistics, University life with tags , , , , , , , , , , , , on January 27, 2020 by xi'an

As I was asked to write a chapter on MCMC methods for an incoming Handbook of Computational Statistics and Data Science, published by Wiley, rather than cautiously declining!, I decided to recycle the answers I wrote on X validated to what I considered to be the most characteristic misunderstandings about MCMC and other computing methods, using as background the introduction produced by Wu Changye in his PhD thesis. Waiting for the opinion of the editors of the Handbook on this Q&A style. The outcome is certainly lighter than other recent surveys like the one we wrote with Peter Green, Krys Latuszinski, and Marcelo Pereyra, for Statistics and Computing, or the one with Victor Elvira, Nick Tawn, and Changye Wu.

Scott Sisson’s ABC seminar in Paris [All about that Bayes]

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , , , on January 20, 2020 by xi'an

On the “All about that Bayes” seminar tomorrow (Tuesday 21 at 3p.m., room 42, AgroParisTech, 16 rue Claude Bernard, Paris 5ième), Scott Sisson, School of Mathematics and Statistics at UNSW, and visiting Paris-Dauphine this month, will give a talk on

Approximate posteriors and data for Bayesian inference

Abstract
For various reasons, including large datasets and complex models, approximate inference is becoming increasingly common. In this talk I will provide three vignettes of recent work. These cover a) approximate Bayesian computation for Gaussian process density estimation, b) likelihood-free Gibbs sampling, and c) MCMC for approximate (rounded) data.

stochastic magnetic bits, simulated annealing and Gibbs sampling

Posted in Statistics with tags , , , , , , , , , on October 17, 2019 by xi'an

A paper by Borders et al. in the 19 September issue of Nature offers an interesting mix of computing and electronics and optimisation. With two preparatory tribunes! One [rather overdone] on Feynman’s quest. As a possible alternative to quantum computers for creating probabilistic bits. And making machine learning (as an optimisation program) more efficient. And another one explaining more clearly what is in the paper. As well as the practical advantage of the approach over quantum computing. As for the paper itself, the part I understood about factorising an integer F via minimising the squared difference between a product of two integers and F and using simulated annealing sounded rather easy, while the part I did not about constructing a semi-conductor implementing this stochastic search sounded too technical (especially in the métro during rush hour). Even after checking the on-line supplementary material. Interestingly, the paper claims for higher efficiency thanks to asynchronicity than a regular Gibbs simulation of Boltzman machines, quoting Roberts and Sahu (1997) without further explanation and possibly out of context (as the latter is not concerned with optimisation).

ABC in Clermont-Ferrand

Posted in Mountains, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , on September 20, 2019 by xi'an

Today I am taking part in a one-day workshop at the Université of Clermont Auvergne on ABC. With applications to cosmostatistics, along with Martin Kilbinger [with whom I worked on PMC schemes], Florent Leclerc and Grégoire Aufort. This should prove a most exciting day! (With not enough time to run up Puy de Dôme in the morning, though.)

the three i’s of poverty

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , on September 15, 2019 by xi'an

Today I made a “quick” (10h door to door!) round trip visit to Marseille (by train) to take part in the PhD thesis defense (committee) of Edwin Fourrier-Nicolaï, which title was Poverty, inequality and redistribution: an econometric approach. While this was mainly a thesis in economics, meaning defending some theory on inequalities based on East German data, there were Bayesian components in the thesis that justified (to some extent!) my presence in the jury. Especially around mixture estimation by Gibbs sampling. (On which I started working almost exactly 30 years ago, when I joined Paris 6 and met  Gilles Celeux and Jean Diebolt.) One intriguing [for me] question stemmed from this defense, namely the notion of a Bayesian estimation of a three i’s of poverty (TIP) curve. The three i’s stand for incidence, intensity, and inequality, as, introduced in Jenkins and Lambert (1997), this curve measure the average income loss from the poverty level for the 100p% lower incomes, when p varies between 0 and 1. It thus depends on the distribution F of the incomes and when using a mixture distribution its computation requires a numerical cdf inversion to determine the income p-th quantile. A related question is thus on how to define a Bayesian estimate of the TIP curve. Using an average over the values of an MCMC sample does not sound absolutely satisfactory since the upper bound in the integral varies for each realisation of the parameter. The use of another estimate would however require a specific loss function, an issue not discussed in the thesis.

efficient MCMC sampling

Posted in Statistics with tags , , , on June 24, 2019 by xi'an

Maxime Vono, Daniel Paulin and Arnaud Doucet recently arXived a paper about a regularisation technique that allows for efficient sampling from a complex posterior which potential function factorises as a large sum of transforms of linear projections of the parameter θ

U(\theta)=\sum_i U_i(A_i\theta)

The central idea in the paper [which was new to me] is to introduce auxiliary variates for the different terms in the sum, replacing the projections in the transforms, with an additional regularisation forcing these auxiliary variates to be as close as possible from the corresponding projection

U(\theta,\mathbf z)=\sum_i U_i(z_i)+\varrho^{-1}||z_i-A_i\theta||^2

This is only an approximation to the true target but it enjoys the possibility to run a massive Gibbs sampler in quite a reduced dimension. As the variance ρ of the regularisation term goes to zero the marginal posterior on the parameter θ converges to the true posterior. The authors manage to achieve precise convergence rates both in total variation and in Wasserstein distance.

From a practical point of view, only judging from the logistic example, it is hard to fathom how much this approach improves upon other approaches (provided they still apply) as the impact of the value of ρ should be assessed on top of the convergence of the high-dimensional Gibbs sampler. Or is there an annealing version in the pipe-line? While parallelisation is a major argument, it also seems that the Gibbs sampler need a central monitoring for each new simulation of θ. Unless some asynchronous version can be implemented.