Archive for copulas

impressions from EcoSta2017 [guest post]

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , , , on July 6, 2017 by xi'an

[This is a guest post on the recent EcoSta2017 (Econometrics and Statistics) conference in Hong Kong, contributed by Chris Drovandi from QUT, Brisbane.]

There were (at least) two sessions on Bayesian Computation at the recent EcoSta (Econometrics and Statistics) 2017 conference in Hong Kong. Below is my review of them. My overall impression of the conference is that there were lots of interesting talks, albeit a lot in financial time series, not my area. Even so I managed to pick up a few ideas/concepts that could be useful in my research. One criticism I had was that there were too many sessions in parallel, which made choosing quite difficult and some sessions very poorly attended. Another criticism of many participants I spoke to was that the location of the conference was relatively far from the city area.

In the first session (chaired by Robert Kohn), Minh-Ngoc Tran spoke about this paper on Bayesian estimation of high-dimensional Copula models with mixed discrete/continuous margins. Copula models with all continuous margins are relatively easy to deal with, but when the margins are discrete or mixed there are issues with computing the likelihood. The main idea of the paper is to re-write the intractable likelihood as an integral over a hypercube of ≤J dimensions (where J is the number of variables), which can then be estimated unbiasedly (with variance reduction by using randomised quasi-MC numbers). The paper develops advanced (correlated) pseudo-marginal and variational Bayes methods for inference.

In the following talk, Chris Carter spoke about different types of pseudo-marginal methods, particle marginal Metropolis-Hastings and particle Gibbs for state space models. Chris suggests that a combination of these methods into a single algorithm can further improve mixing. Continue reading

postprocessing for ABC

Posted in Books, Statistics with tags , , , , on June 1, 2017 by xi'an

Two weeks ago, G.S. Rodrigues, Dennis Prangle and Scott Sisson have recently arXived a paper on recalibrating ABC output to make it correctly calibrated (in the frequentist sense). As in earlier papers, it takes advantage of the fact that the tail posterior probability should be uniformly distributed at the true value of the [simulated] parameter behind the [simulated] data. And as in Prangle et al. (2014), relies on a copula representation. The main notion is that marginals posteriors can be reasonably approximated by non-parametric kernel estimators, which means that an F⁰oF⁻¹ transform can be applied to an ABC reference table in a fully non-parametric extension of Beaumont et al.  (2002). Besides the issue that F is an approximation, I wonder about the computing cost of this approach, given that computing the post-processing transforms comes at a cost of O(pT²) when p is the dimension of the parameter and T the size of the ABC learning set… One question that came to me while discussing the paper with Jean-Michel Marin is why one would use F⁻¹(θ¹|s) instead of directly a uniform U(0,1) since in theory this should be a uniform U(0,1).

MCqMC 2016 [#2]

Posted in pictures, Running, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , , on August 17, 2016 by xi'an

In her plenary talk this morning, Christine Lemieux discussed connections between quasi-Monte Carlo and copulas, covering a question I have been considering for a while. Namely, when provided with a (multivariate) joint cdf F, is there a generic way to invert a vector of uniforms [or quasi-uniforms] into a simulation from F? For Archimedian copulas (as we always can get back to copulas), there is a resolution by the Marshall-Olkin representation,  but this puts a restriction on the distributions F that can be considered. The session on synthetic likelihoods [as introduced by Simon Wood in 2010] put together by Scott Sisson was completely focussed on using normal approximations for the distribution of the vector of summary statistics, rather than the standard ABC non-parametric approximation. While there is a clear (?) advantage in using a normal pseudo-likelihood, since it stabilises with much less simulations than a non-parametric version, I find it difficult to compare both approaches, as they lead to different posterior distributions. In particular, I wonder at the impact of the dimension of the summary statistics on the approximation, in the sense that it is less and less likely that the joint is normal as this dimension increases. Whether this is damaging for the resulting inference is another issue, possibly handled by a supplementary ABC step that would take the first-step estimate as summary statistic. (As a side remark, I am intrigued at everyone being so concerned with unbiasedness of methods that are approximations with no assessment of the amount of approximation!) The last session of the day was about multimodality and MCMC solutions, with talks by Hyungsuk Tak, Pierre Jacob and Babak Shababa, plus mine. Hunsuk presented the RAM algorithm I discussed earlier under the title of “love-hate” algorithm, which was a kind reference to my post! (I remain puzzled by the ability of the algorithm to jump to another mode, given that the intermediary step aims at a low or even zero probability region with an infinite mass target.) And Pierre talked about using SMC for Wang-Landau algorithms, with a twist to the classical stochastic optimisation schedule that preserves convergence. And a terrific illustration on a distribution inspired from the Golden Gate Bridge that reminded me of my recent crossing! The discussion around my folded Markov chain talk focussed on the extension of the partition to more than two sets, the difficulty being in generating automated projections, with comments about connections with computer graphic tools. (Too bad that the parallel session saw talks by Mark Huber and Rémi Bardenet that I missed! Enjoying a terrific Burmese dinner with Rémi, Pierre and other friends also meant I could not post this entry on time for the customary 00:16. Not that it matters in the least…)

correlation matrices on copulas

Posted in R, Statistics, University life with tags , , , , on July 4, 2016 by xi'an

Following my post of yesterday about the missing condition in Lynch’s R code, Gérard Letac sent me a paper he recently wrote with Luc Devroye on correlation matrices and copulas. Paper written for the memorial volume in honour of Marc Yor. It considers the neat problem of the existence of a copula (on [0,1]x…x[0,1]) associated with a given correlation matrix R. And establishes this existence up to dimension n=9. The proof is based on the consideration of the extreme points of the set of correlation matrices. The authors conjecture the existence of (10,10) correlation matrices that cannot be a correlation matrix for a copula. The paper also contains a result that answers my (idle) puzzling of many years, namely on how to set the correlation matrix of a Gaussian copula to achieve a given correlation matrix R for the copula. More precisely, the paper links the [correlation] matrix R of X~N(0,R) with the [correlation] matrix R⁰ of Φ(X) by

r^0_{ij}=\frac{6}{\pi}\arcsin\{r_{ij}/2\}

A side consequence of this result is that there exist correlation matrices of copulas that cannot be associated with Gaussian copulas. Like

R=\left[\begin{matrix} 1 &-1/2 &-1/2\\-1/2 &1 &-1/2\\-1/2 &-1/2 &1 \end{matrix}\right]

Statistics month in Marseilles (CIRM)

Posted in Books, Kids, Mountains, pictures, Running, Statistics, Travel, University life, Wines with tags , , , , , , , , , , , , , , on June 24, 2015 by xi'an

Calanque de Morgiou, Marseille, July 7, 2010Next February, the fabulous Centre International de Recherche en Mathématiques (CIRM) in Marseilles, France, will hold a Statistics month, with the following programme over five weeks

Each week will see minicourses of a few hours (2-3) and advanced talks, leaving time for interactions and collaborations. (I will give one of those minicourses on Bayesian foundations.) The scientific organisers of the B’ week are Gilles Celeux and Nicolas Chopin.

The CIRM is a wonderful meeting place, in the mountains between Marseilles and Cassis, with many trails to walk and run, and hundreds of fantastic climbing routes in the Calanques at all levels. (In February, the sea is too cold to contemplate swimming. The good side is that it is not too warm to climb and the risk of bush fire is very low!) We stayed there with Jean-Michel Marin a few years ago when preparing Bayesian Essentials. The maths and stats library is well-provided, with permanent access for quiet working sessions. This is the French version of the equally fantastic German Mathematik Forschungsinstitut Oberwolfach. There will be financial support available from the supporting societies and research bodies, at least for young participants and the costs if any are low, for excellent food and excellent lodging. Definitely not a scam conference!

simulating correlated random variables [cont’ed]

Posted in Books, Kids, Statistics with tags , , , , on May 28, 2015 by xi'an

zerocorFollowing a recent post on the topic, and comments ‘Og’s readers kindly provided on that post, the picture is not as clear as I wished it was… Indeed, on the one hand, non-parametric measures of correlation based on ranks are, as pointed out by Clara Grazian and others, invariant under monotonic transforms and hence producing a Gaussian pair or a Uniform pair with the intended rank correlation is sufficient to return a correlated sample for any pair of marginal distributions by the (monotonic) inverse cdf transform.  On the other hand, if correlation is understood as Pearson linear correlation, (a) it is not always defined and (b) there does not seem to be a generic approach to simulate from an arbitrary triplet (F,G,ρ) [assuming the three entries are compatible]. When Kees pointed out Pascal van Kooten‘s solution by permutation, I thought this was a terrific resolution, but after thinking about it a wee bit more, I am afraid it is only an approximation, i.e., a way to return a bivariate sample with a given empirical correlation. Not the theoretical correlation. Obviously, when the sample is very large, this comes as a good approximation. But when facing a request to simulate a single pair (X,Y), this gets inefficient [and still approximate].

Now, if we aim at exact simulation from a bivariate distribution with the arbitrary triplet (F,G,ρ), why can’t we find a generic method?! I think one fundamental if obvious reason is that the question is just ill-posed. Indeed, there are many ways of defining a joint distribution with marginals F and G and with (linear) correlation ρ. One for each copula. The joint could thus be associated with a Gaussian copula, i.e., (X,Y)=(F⁻¹(Φ(A)),G⁻¹(Φ(B))) when (A,B) is a standardised bivariate normal with the proper correlation ρ’. Or it can be associated with the Archimedian copula

C(u; v) = (u + v − 1)-1/θ,

with θ>0 defined by a (linear) correlation of ρ. Or yet with any other copula… Were the joint distribution perfectly well-defined, it would then mean that ρ’ or θ (or whatever natural parameter is used for that copula) do perfectly parametrise this distribution instead of the correlation coefficient ρ. All that remains then is to simulate directly from the copula, maybe a theme for a future post…

extending ABC to high dimensions via Gaussian copula

Posted in Books, pictures, Statistics, Travel, Uncategorized, University life with tags , , , on April 28, 2015 by xi'an

plane2Li, Nott, Fan, and Sisson arXived last week a new paper on ABC methodology that I read on my way to Warwick this morning. The central idea in the paper is (i) to estimate marginal posterior densities for the components of the model parameter by non-parametric means; and (ii) to consider all pairs of components to deduce the correlation matrix R of the Gaussian (pdf) transform of the pairwise rank statistic. From those two low-dimensional estimates, the authors derive a joint Gaussian-copula distribution by using inverse  pdf transforms and the correlation matrix R, to end up with a meta-Gaussian representation

f(\theta)=\dfrac{1}{|R|^{1/2}}\exp\{\eta^\prime(I-R^{-1})\eta/2\}\prod_{i=1}^p g_i(\theta_i)

where the η’s are the Gaussian transforms of the inverse-cdf transforms of the θ’s,that is,

\eta_i=\Phi^{-1}(G_i(\theta_i))

Or rather

\eta_i=\Phi^{-1}(\hat{G}_i(\theta_i))

given that the g’s are estimated.

This is obviously an approximation of the joint in that, even in the most favourable case when the g’s are perfectly estimated, and thus the components perfectly Gaussian, the joint is not necessarily Gaussian… But it sounds quite interesting, provided the cost of running all those transforms is not overwhelming. For instance, if the g’s are kernel density estimators, they involve sums of possibly a large number of terms.

One thing that bothers me in the approach, albeit mostly at a conceptual level for I realise the practical appeal is the use of different summary statistics for approximating different uni- and bi-dimensional marginals. This makes for an incoherent joint distribution, again at a conceptual level as I do not see immediate practical consequences… Those local summaries also have to be identified, component by component, which adds another level of computational cost to the approach, even when using a semi-automatic approach as in Fernhead and Prangle (2012). Although the whole algorithm relies on a single reference table.

The examples in the paper are (i) the banana shaped “Gaussian” distribution of Haario et al. (1999) that we used in our PMC papers, with a twist; and (ii) a g-and-k quantile distribution. The twist in the banana (!) is that the banana distribution is the prior associated with the mean of a Gaussian observation. In that case, the meta-Gaussian representation seems to hold almost perfectly, even in p=50 dimensions. (If I remember correctly, the hard part in analysing the banana distribution was reaching the tails, which are extremely elongated in at least one direction.) For the g-and-k quantile distribution, the same holds, even for a regular ABC. What seems to be of further interest would be to exhibit examples where the meta-Gaussian is clearly an approximation. If such cases exist.