## simulating from the joint cdf

Posted in Books, Kids, pictures, R, Statistics, University life on July 13, 2022 by xi'an

An X validated question (what else?!) brought back (to me) the problem of handling a bivariate cdf for simulation purposes, in the specific case of a copula whose marginals are (well-)known…. and led me down an erroneous chain of thought, fortunately rescued by Robin Ryder! When the marginal distributions are set, the simulation setup is indeed equivalent to a joint Uniform simulation from a copula

$\mathbb P[U_1\leq u_1,U_2\leq u_2,\dots,U_d\leq u_d]=C(u_1,u_2,\dots,u_d)$

In specific cases, as for instance the obvious example of Gaussian copulas, there exist customised simulation algorithms. Looking for more generic solutions, I turned to the Bible, where Chapter XI[an] has two entire sections, XI.3.2 and XI.3.3, on the topic (even though Luc Devroye does not use the term copula there, despite copulas having been introduced in 1959 by A. Sklar, in response to a query of M. Fréchet). In addition to a study of copulas, both sections contain many specific solutions (as for instance in the [unnumbered] Table on page 585), but I found no generic simulation method. My [non-selected] answer to the question was thus to propose standard solutions such as conditional simulation, exploiting the fact that the marginals are Uniform, which depends on the tractability of the derivatives of C(·,·).
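As a concrete illustration of this conditional route, here is a minimal R sketch for a bivariate Clayton copula, one of the lucky cases where the conditional cdf ∂C(u,v)/∂u can be inverted in closed form; the function name rclayton and the parameter value are purely illustrative.

```r
# minimal sketch of conditional simulation from a bivariate Clayton copula,
# C(u,v) = (u^-theta + v^-theta - 1)^(-1/theta); rclayton is an illustrative name
rclayton <- function(n, theta) {
  u <- runif(n)                   # first coordinate: Uniform marginal
  w <- runif(n)                   # uniform draw pushed through the inverse conditional cdf
  # closed-form inverse of C_{2|1}(v|u) = dC(u,v)/du for the Clayton family
  v <- (u^(-theta) * (w^(-theta / (theta + 1)) - 1) + 1)^(-1 / theta)
  cbind(u = u, v = v)
}

uv <- rclayton(2e3, theta = 2)
cor(uv[, 1], uv[, 2], method = "kendall")  # should be close to theta/(theta+2) = 0.5
```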

However, being dissatisfied with this bland answer, I thought further about the problem and came up with a fallacious scheme, namely to first simulate the value p of C(U,V) by drawing a Uniform, and second simulate (U,V) conditional on C(U,V)=p. I went as far as running R code on a simple copula, as shown above. Fallacious reasoning since (as I knew already!!!) C(U,V) is not uniformly distributed, but has instead a case-dependent distribution… As a (connected) aside, I wonder if the generator attached to Archimedean copulas has any magical feature that helps with the generation of the associated copula.
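To see the fallacy empirically, one can push the simulated pairs back through C: for an Archimedean copula with generator φ, the cdf of C(U,V) is the Kendall function K(t) = t − φ(t)/φ′(t), which is not the identity. A quick R check, reusing the rclayton() sketch above (so again purely illustrative):

```r
# empirical check that C(U,V) is not Uniform(0,1), reusing the rclayton() sketch above
theta <- 2
uv <- rclayton(1e5, theta)
Cuv <- (uv[, 1]^(-theta) + uv[, 2]^(-theta) - 1)^(-1 / theta)  # Clayton C(U,V)
ks.test(Cuv, "punif")                                          # clear rejection of uniformity
# its actual cdf is the Kendall function K(t) = t - phi(t)/phi'(t) = t + (t - t^(theta+1))/theta
plot(ecdf(Cuv), main = "cdf of C(U,V) vs Kendall function")
curve(x + (x - x^(theta + 1)) / theta, 0, 1, add = TRUE, col = 2)
```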

## asymptotically exact inference in likelihood-free models [a reply from the authors]

Posted in R, Statistics on December 1, 2016 by xi'an

[Following my post of last Tuesday, Matt Graham commented on the paper in great detail. Here are those comments. A nicer HTML version of the Markdown reply below is also available on GitHub.]

Thanks for the comments on the paper!

A few additional replies to augment what Amos wrote:

> This however sounds somewhat intense in that it involves a quasi-Newton resolution at each step.

The method is definitely computationally expensive. If the constraint function maps an M-dimensional space to an N-dimensional space, with $M\geq N$, then for large N the dominant costs at each timestep are usually the evaluation of the constraint Jacobian $\partial c/\partial u$ (with reverse-mode automatic differentiation this can be evaluated at a cost of O(N) generator / constraint evaluations) and the Cholesky decomposition of the Jacobian product $(\partial c/\partial u)(\partial c/\partial u)^{\mathsf T}$ with O(N³) cost (though in many cases, e.g. i.i.d. or Markovian simulated data, structure in the generator Jacobian can be exploited to give a significantly reduced cost). Each inner quasi-Newton update involves a pair of triangular solve operations with O(N²) cost, two matrix-vector multiplications with O(MN) cost, and a single constraint / generator function evaluation; the number of quasi-Newton updates required for convergence in the numerical experiments tended to be much less than N, hence the quasi-Newton iteration tended not to be the main cost.
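To make this per-update cost breakdown concrete, here is a minimal R sketch of a quasi-Newton projection of this flavour onto the manifold {u : c(u) = 0}, with the constraint Jacobian frozen at a reference point so that its Gram matrix is Cholesky-factorised only once; the function names and the toy spherical constraint are illustrative stand-ins, not the paper's code.

```r
# minimal sketch of a quasi-Newton projection onto {u : c(u) = 0};
# names and toy constraint are illustrative, not taken from the paper's implementation
project_onto_manifold <- function(u, c_fun, jac_fun, tol = 1e-10, max_iter = 100) {
  J <- jac_fun(u)                      # N x M constraint Jacobian, held fixed over the iteration
  L <- chol(J %*% t(J))                # Cholesky factor of the N x N Gram matrix, O(N^3) once
  for (i in seq_len(max_iter)) {
    cu <- c_fun(u)                     # one constraint / generator evaluation per update
    if (max(abs(cu)) < tol) return(u)
    # two O(N^2) triangular solves, then an O(MN) matrix-vector product
    lambda <- backsolve(L, forwardsolve(t(L), cu))
    u <- u - drop(t(J) %*% lambda)
  }
  warning("projection did not converge")
  u
}

# toy example: project a point in R^3 onto the unit sphere (M = 3 inputs, N = 1 constraint)
c_sphere <- function(u) sum(u^2) - 1
jac_sphere <- function(u) matrix(2 * u, nrow = 1)
project_onto_manifold(c(1, 2, 2), c_sphere, jac_sphere)
```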

The high computational cost per update is however traded off against often being able to make much larger proposed moves in high-dimensional state spaces with a high chance of acceptance, compared to ABC MCMC approaches. Even in the relatively small Lotka-Volterra example we provide, which has an input dimension of 104 (four inputs which map to ‘parameters’, and 100 inputs which map to ‘noise’ variables), the ABC MCMC chains using the coarse ABC kernel radius ϵ=100, with comparably very cheap updates, were significantly less efficient in terms of effective sample size / computation time than the proposed constrained HMC approach. This was in large part due to the elliptical slice sampling updates in the ABC MCMC chains generally collapsing down to very small moves even for this relatively coarse ϵ. Performance was even worse using non-adaptive ABC MCMC methods and for smaller ϵ, and for higher input dimensions (e.g. using a longer sequence with correspondingly more random inputs) the comparison becomes even more favourable for the constrained HMC approach.