Archive for Gaussian copula

adaptive copulas for ABC

Posted in Statistics on March 20, 2019 by xi'an

A paper on ABC I read on my way back from Cambodia: Yanzhi Chen and Michael Gutmann arXived [in Edinburgh] an ABC paper on learning the target via Gaussian copulas, to be presented at AISTATS this year (in Okinawa!), which links post-processing (regression) ABC with sequential ABC. The drawback of the regression approach is that the correction often relies on a homogeneity assumption on the distribution of the noise or residual, since it only applies a drift to the original simulated sample. Their method is based on two stages: a coarse-grained one, where the posterior is approximated by ordinary linear-regression ABC, and a fine-grained one, which uses the above coarse Gaussian version as a proposal and returns a Gaussian copula estimate of the posterior. This proposal is somewhat similar to the neural network approach of Papamakarios and Murray (2016), and to the Gaussian copula version of Li et al. (2017), the major difference being the presence of two stages.

The new method is compared with other ABC proposals at a fixed simulation cost, which does not account for the construction costs, although these should be relatively negligible. To compare these ABC avatars, the authors use a symmetrised Kullback-Leibler divergence I had not met previously, requiring a massive numerical integration (although this is not an issue for the practical implementation of the method, which only calls for the construction of the neural network(s)). Note also that sequential ABC is only run for two iterations, and that none of the importance sampling ABC versions of Fearnhead and Prangle (2012) and of Li and Fearnhead (2018) are considered, all versions relying on the same vector of summary statistics, with a dimension much larger than the dimension of the parameter, except in our MA(2) example, where regression does as well. I wonder at the impact of the dimension of the summary statistic on the performances of the neural network, i.e., whether or not it is able to manage the curse of dimensionality by ignoring all but the essential summary statistics in the optimisation.
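
For concreteness, here is a minimal sketch of the two ingredients discussed above, a Beaumont-style linear-regression adjustment of accepted ABC draws followed by a Gaussian-copula density estimate built from the adjusted sample, in the spirit of Li et al. (2017). This is not the authors' implementation; the function names and interfaces are mine and purely illustrative.

import numpy as np
from scipy import stats

def regression_adjust(theta, summaries, s_obs):
    """Coarse stage (sketch): linear-regression adjustment, shifting each
    accepted draw by the fitted drift beta'(s - s_obs)."""
    X = summaries - s_obs                          # centred summary statistics, shape (n, q)
    X1 = np.column_stack([np.ones(len(X)), X])     # add an intercept column
    beta, *_ = np.linalg.lstsq(X1, theta, rcond=None)
    return theta - X @ beta[1:]                    # adjusted parameter draws

def gaussian_copula_logpdf(theta_new, theta_samples):
    """Fine stage (sketch): Gaussian-copula density estimate of the posterior,
    with Gaussian KDE marginals and a Gaussian copula for the dependence."""
    n, d = theta_samples.shape
    kdes = [stats.gaussian_kde(theta_samples[:, j]) for j in range(d)]
    # marginal CDF values at the query point, pushed to normal scores
    u = np.array([kdes[j].integrate_box_1d(-np.inf, theta_new[j]) for j in range(d)])
    z = stats.norm.ppf(np.clip(u, 1e-10, 1 - 1e-10))
    # copula correlation matrix estimated from the normal scores of the sample
    ranks = np.apply_along_axis(stats.rankdata, 0, theta_samples) / (n + 1)
    R = np.corrcoef(stats.norm.ppf(ranks), rowvar=False)
    log_marginals = sum(np.log(kdes[j](theta_new[j])[0]) for j in range(d))
    log_copula = stats.multivariate_normal.logpdf(z, cov=R) - stats.norm.logpdf(z).sum()
    return log_copula + log_marginals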

computer strategies for complex Bayesian models

Posted in Books, Kids, Statistics, University life on July 18, 2016 by xi'an

This is the cover page of Marco Banterle‘s thesis, which he will defend on Thursday [July 21, 13:00], at a rather quiet time for French universities, which is one reason for advertising it here. The thesis is built around several of Marco’s papers, like delayed acceptance, dimension expansion, and Gaussian copula for graphical models. The defence is open to everyone, so feel free to join if near Paris-Dauphine!

exact, unbiased, what else?!

Posted in Books, Statistics, University life on April 13, 2016 by xi'an

Last week, Matias Quiroz, Mattias Villani, and Robert Kohn arXived a paper on exact subsampling MCMC, a paper that contributes to the current literature on approximating MCMC samplers for large datasets, in connection with an earlier paper of Quiroz et al. discussed here last week.

The “exact” in the title is to be understood in the Russian roulette sense: by using the Rhee and Glynn debiasing device, the authors achieve an unbiased estimator of the likelihood, as in Bardenet et al. (2015). The central tool for the derivation of an unbiased and positive estimator is to find a control variate for each component of the log-likelihood that is good enough for the difference between the component and the control to be bounded from below, by the constant a in the screen capture above, where the individual terms d in the product are iid unbiased estimates of the log-likelihood difference and q is the sum of the control variates, or maybe more accurately of the cheap substitutes to the exact log-likelihood components. The estimator is thus still of complexity O(n), which makes the application to tall data more difficult to contemplate.
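
To fix ideas, here is a minimal sketch (mine, not the authors') of the control-variate difference estimator underlying this construction: the full log-likelihood is written as the sum q of cheap proxies plus a correction estimated from a subsample, and evaluating every proxy is what keeps the cost at O(n). The callables loglik_term and control are hypothetical, user-supplied functions.

import numpy as np

def subsampled_loglik(theta, data, loglik_term, control, m, rng):
    """Difference estimator of the full log-likelihood with control variates:
    q(theta) = sum_i q_i(theta) collects the cheap proxies, and the residuals
    l_i(theta) - q_i(theta) are estimated from a random subsample of size m.
    Unbiased for sum_i l_i(theta), but computing all q_i remains O(n)."""
    n = len(data)
    q_all = np.array([control(theta, x) for x in data])   # cheap proxies, O(n)
    idx = rng.integers(0, n, size=m)                       # subsample with replacement
    resid = np.array([loglik_term(theta, data[i]) - q_all[i] for i in idx])
    return q_all.sum() + n * resid.mean()

# example call, with hypothetical per-observation functions:
# est = subsampled_loglik(theta, data, loglik_term, control, m=100,
#                         rng=np.random.default_rng(0))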

The $64 question is obviously how to produce cheap and efficient control variates that kill the curse of the tall data. (It still irks to resort to this term, control variate, really!) Section 3.2 in the paper suggests clustering the data and building an approximation for each cluster, which seems to imply manipulating the whole dataset at this early stage, at a cost of O(Knd). Furthermore, because finding a correct lower bound a is close to impossible in practice, the authors use a “soft lower bound”, meaning that it is only an approximation and thus that (3.4) above can get negative from time to time, which cancels the validation of the method as a pseudo-marginal approach. The resolution of this difficulty is to resort to the same proxy as in the Russian roulette paper, replacing the unbiased estimator with its absolute value, an answer I already discussed for the Russian roulette paper. An additional step is proposed by Quiroz et al., namely correlating the random numbers between numerator and denominator in their final importance sampling estimator, via a Gaussian copula as in Deligiannidis et al.
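
For illustration, a minimal sketch of that Gaussian-copula correlation trick, under the assumption that the likelihood estimator is driven by a block of uniform random numbers (my own sketch, not the exact implementation of the paper): the uniforms are sent to normal scores, moved by a correlated Gaussian step with coefficient rho, and mapped back, so that successive likelihood estimates share most of their randomness.

import numpy as np
from scipy import stats

def correlated_uniforms(u, rho, rng):
    """Propose new auxiliary uniforms correlated with the current ones through
    a Gaussian copula: transform to normal scores, apply a correlated
    (AR(1)-type) Gaussian move, transform back to (0,1)."""
    z = stats.norm.ppf(u)                                        # normal scores of current uniforms
    z_new = rho * z + np.sqrt(1.0 - rho**2) * rng.standard_normal(np.shape(z))
    return stats.norm.cdf(z_new)                                 # marginally still U(0,1)

# example: u_new = correlated_uniforms(u, rho=0.99, rng=np.random.default_rng())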

This paper made me wonder (idly wonder, mind!) anew how to get rid of the vexing unbiasedness requirement. From a statistical, and especially from a Bayesian, perspective, unbiasedness is a second-order property that cannot be achieved for most transforms of the parameter θ and that is not preserved under reparameterisation. It is thus vexing and perplexing that unbiasedness is so central to the validation of our Monte Carlo techniques and that any divergence from this canon leaves us wandering blindly, with no guarantee of ever reaching the target of the simulation experiment…
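
In equation form, the non-preservation under reparameterisation is just Jensen's inequality: an estimator unbiased for θ is generally biased for a nonlinear transform g(θ), for instance

\mathbb{E}\big[\hat\theta\big] = \theta
\quad\not\Longrightarrow\quad
\mathbb{E}\big[g(\hat\theta)\big] = g(\theta),
\qquad\text{e.g.}\quad
\mathbb{E}\big[e^{\hat\theta}\big] \;\ge\; e^{\mathbb{E}[\hat\theta]} = e^{\theta},

with strict inequality unless \hat\theta is almost surely constant.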