Archive for inverse cdf

simulating signed mixtures

Posted in Books, pictures, R, Statistics, University life on February 2, 2024 by xi'an

While simulating from a mixture of standard densities is relatively straightforward when the component densities are easily simulated, to the point that many simulation methods exploit an intermediary mixture construction to speed up the production of pseudo-random samples from more challenging distributions (see Devroye, 1986), things get surprisingly more complicated when the mixture weights can take negative values. For instance, the naïve solution consisting in first simulating from the associated mixture of positive weight components and then using an accept-reject step may prove highly inefficient, since the overall probability of acceptance

1\Big/\sum_{k=1}^{P} \omega_k^+

is the inverse of the sum of the positive weights and hence can be arbitrarily close to zero. The intuition for such inefficiency is that simulating from the positive weight components need not produce values within regions of high probability for the actual distribution

m = \sum_{k=1}^P \omega_k^+ f_k - \sum_{k=1}^N \omega_k^- g_k

since its negative weight components may remove most of the mass under the positive weight components. In other words, the negative weight components do not have a natural latent variable interpretation and the resulting mixture can be anything, as the above graph testifies.
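
To make the inefficiency concrete, here is a minimal R sketch of the naïve accept-reject scheme for a toy signed mixture of my own choosing (two Gaussian components with weights 2 and −1, so that the acceptance rate is exactly 1/Σω⁺ = ½):

```r
# naïve accept-reject for the signed mixture m(x) = 2 N(0,1) - N(0, 0.7^2),
# a proper density since 2 dnorm(x) >= dnorm(x, sd = 0.7) for all x
rsignmix <- function(n) {
  out <- numeric(0)
  while (length(out) < n) {
    x <- rnorm(n)                          # simulate from the positive component
    m <- 2 * dnorm(x) - dnorm(x, sd = 0.7) # target density m(x)
    env <- 2 * dnorm(x)                    # positive-part envelope
    out <- c(out, x[runif(n) * env < m])   # accept with probability m(x)/env(x)
  }
  out[1:n]
}
x <- rsignmix(1e4)  # roughly half the proposals are rejected here
```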

Julien Stoehr (Paris Dauphine) and I started investigating this interesting challenge when the Master students who had been exposed to said challenge could not dent it in any meaningful way. We have now arXived a specific algorithm that proves superior not only to the naïve accept-reject algorithm but also to numerical cdf inversion (which happens to be available in this setting). Compared with the naïve version, we construct an alternative accept-reject scheme based on pairing positive and negative components as well as possible, partitioning the real line, and finding tighter upper and lower bounds on the positive and negative components, respectively, towards yielding a higher acceptance rate on average. Designing a random generator of signed mixtures with enough variability and representativity proved a challenge in itself!

combining normalizing flows and QMC

Posted in Books, Kids, Statistics on January 23, 2024 by xi'an

My PhD student Charly Andral [presented at the mostly Monte Carlo seminar and] arXived a new preprint yesterday, on training a normalizing flow network as an importance sampler (as in Gabrié et al.) or an independent Metropolis proposal, and exploiting its invertibility to call quasi-Monte Carlo low discrepancy sequences to boost its efficiency. (Training the flow is not covered by the paper.) This extends the recent study of He et al. (which was presented at MCM 2023 in Paris) to the normalising flow setting. In the current experiments, the randomized QMC samples are computed using the SciPy package (Roy et al. 2023), where the Sobol' sequence is based on Joe and Kuo (2008) and on Matoušek (1998) for the scrambling, and where the Halton sequence is based on Owen (2017). (No pure QMC was harmed in the process!) The flows are constructed using the package FlowMC. As expected, the QMC version brings a significant improvement in the quality of the Monte Carlo approximations, for equivalent computing times, with however a rapid decrease in efficiency as the dimension of the targeted distribution increases. On the other hand, the architecture of the flow appears to matter little. And the type of RQMC sequence makes a difference, the advantage apparently going to a scrambled Sobol' sequence.
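
For intuition, here is a minimal R sketch of the pipeline, with a toy affine map standing in for a trained flow and randtoolbox's scrambled Sobol' generator standing in for SciPy's (target, transform, and package choice are all illustrative assumptions of mine, not taken from the paper):

```r
library(randtoolbox)                   # scrambled Sobol' sequences in R

d <- 2; n <- 2^12
u <- sobol(n, dim = d, scrambling = 1) # randomized (Owen-scrambled) Sobol' points
z <- qnorm(u)                          # push to the Gaussian base distribution

# toy invertible transform T(z) = mu + L z, standing in for a normalizing flow
mu <- c(1, -1); L <- diag(c(2, 0.5))
x <- sweep(z %*% t(L), 2, mu, `+`)

# importance sampling for a banana-shaped target (unnormalized log-density)
ltarget <- function(x) -x[, 1]^2 / 8 - (x[, 2] - x[, 1]^2 / 4)^2 / 2
lprop <- rowSums(dnorm(z, log = TRUE)) - log(prod(diag(L)))  # proposal log-density
w <- exp(ltarget(x) - lprop)           # importance weights
sum(w * x[, 1]) / sum(w)               # self-normalized estimate of E[X1]
```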

a versatile alternative to ABC

Posted in Books, Statistics on July 25, 2023 by xi'an

“We introduce the Fixed Landscape Inference MethOd, a new likelihood-free inference method for continuous state-space stochastic models. It applies deterministic gradient-based optimization algorithms to obtain a point estimate of the parameters, minimizing the difference between the data and some simulations according to some prescribed summary statistics. In this sense, it is analogous to Approximate Bayesian Computation (ABC). Like ABC, it can also provide an approximation of the distribution of the parameters.”

I quickly read this arXival by Monard et al., which is presented as an alternative to ABC, albeit outside a Bayesian setup. The central concept is that a deterministic gradient descent provides an optimal parameter value when replacing the likelihood with a distance between the observed data and synthetic data simulated under the current value of the parameter (in the descent). In order to operate the descent, the synthetic data are assumed to be available as a deterministic transform of the parameter value and of a vector of basic random objects, e.g., Uniforms. In order to make the target function differentiable, the above Uniform vector is kept fixed for the entire gradient descent. A puzzling aspect of the paper is that it seems to compare the (empirical) distribution of the resulting estimator with a posterior distribution, unless the comparison is with the (empirical) distribution of the Bayes estimators. The variability due to the choice of the fixed vector of basic random objects does not seem to be taken into account either. Furthermore, the method is presented as able to handle several models at once, which I find difficult to fathom as (a) the random vectors behind each model necessarily vary and (b) there is no apparent penalisation for complexity.
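
As I read it, the scheme boils down to fixing the basic random objects once and optimising the resulting deterministic loss; here is a minimal R sketch on a toy Gaussian location-scale model (model, summaries, and distance are my own illustrative choices, not the authors'):

```r
set.seed(1)
obs <- rnorm(100, mean = 2, sd = 1.5) # observed data
u <- rnorm(100)                       # basic random objects, fixed once and for all

simu <- function(theta, u) theta[1] + exp(theta[2]) * u  # deterministic transform
summ <- function(x) c(mean(x), sd(x))                    # prescribed summaries
loss <- function(theta) sum((summ(obs) - summ(simu(theta, u)))^2)

# deterministic gradient-based descent over the now-fixed landscape
fit <- optim(c(0, 0), loss, method = "BFGS")
c(mu = fit$par[1], sigma = exp(fit$par[2]))              # point estimate
```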

simulating from the joint cdf

Posted in Books, Kids, pictures, R, Statistics, University life on July 13, 2022 by xi'an

An X validated question (what else?!) brought back (to me) the question of handling a bivariate cdf for simulation purposes, in the specific case of a copula whose marginals were (well-)known… And led me to an erroneous chain of thought, fortunately rescued by Robin Ryder! When the marginal distributions are set, the simulation setup is indeed equivalent to a joint Uniform simulation from a copula

\mathbb P[U_1\leq u_1,U_2\leq u_2,\dots,U_d\leq u_d]=C(u_1,u_2,\dots,u_d)

In specific cases, as for instance the obvious example of Gaussian copulas, there exist customised simulation algorithms. Looking for more generic solutions, I turned to the Bible, where Chapter XI[an] has two entire sections, XI.3.2 and XI.3.3, on the topic (even though Luc Devroye does not use the term copula there, despite copulas having been introduced in 1959 by A. Sklar, in response to a query of M. Fréchet). In addition to a study of copulas, both sections contain many specific solutions (as for instance in the [unnumbered] Table on page 585), but I found no generic simulation method. My [non-selected] answer to the question was thus to propose standard solutions such as simulating one conditional at a time, since the marginals are Uniform. Which depends on the tractability of the partial derivatives of C(·,·).
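
For a case where this conditional route is fully explicit, take a Clayton copula, whose partial derivative ∂C/∂u inverts in closed form (a standard construction, sketched here in R):

```r
# conditional-cdf simulation from a Clayton copula, theta > 0: draw U ~ U(0,1),
# then invert the conditional cdf dC/du at a second, independent Uniform
rclayton <- function(n, theta) {
  u <- runif(n)
  p <- runif(n)
  v <- (u^(-theta) * (p^(-theta / (1 + theta)) - 1) + 1)^(-1 / theta)
  cbind(u, v)
}
uv <- rclayton(1e4, theta = 2)
cor(uv[, 1], uv[, 2], method = "kendall") # close to theta/(theta + 2) = 1/2
```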

However, being dissatisfied with this bland answer, I thought further about the problem and came up with a fallacious scheme, namely to first simulate the value p of C(U,V) by drawing a Uniform, and second simulate (U,V) conditional on C(U,V)=p, going as far as running an R code on a simple copula, as shown above. Fallacious reasoning since (as I knew already!!!) C(U,V) is not uniformly distributed, but has instead a case-dependent distribution… As a (connected) aside, I wonder if the generator attached to Archimedean copulas has any magical feature that helps with the generation of the associated copula.
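
A two-line R check makes the fallacy plain in the independence case C(u,v)=uv, where the product UV has density −log t on (0,1) rather than a Uniform one:

```r
u <- runif(1e5); v <- runif(1e5)
hist(u * v, breaks = 100, freq = FALSE) # far from flat
curve(-log(x), add = TRUE, col = "red") # actual density of C(U,V) = UV
```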

analogy between mean and median [X validated]

Posted in Books, Kids, Statistics on February 6, 2022 by xi'an

A recent question on X validated was looking rather naïvely for the median being equal to the mean, as earlier questions there had, but it made me realise a certain symmetry between the two notions, namely that the median ς is the solution of the balance condition

\int_{-\infty}^\varsigma f(x)\,\text{d}x = \int_\varsigma^{+\infty} f(x)\,\text{d}x

namely that the two areas under the density are equal, while the mean μ is the solution of a parallel balance condition

\int_{-\infty}^\mu F(x)\,\text{d}x = \int_\mu^{+\infty} (1-F)(x)\,\text{d}x

which is an equality of the same kind, i.e., of areas under the cdf or its complement… Courtesy of the general representation

\mathbb E[X] = \int_0^{\infty} (1-F)(x)\,\text{d}x - \int_{-\infty}^0 F(x)\,\text{d}x

(as applied to the shifted variable X−μ, whose mean is zero exactly when the two areas above coincide).

The general problem of having the mean and median equal is not particularly interesting as it is a matter of parameterisation. For instance, using the cdf transform turns the random variate X into a Uniform F(X), with mean and median both equal to ½.
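
Both balance conditions are easily checked numerically, e.g. for a (skewed) Exp(1) variate with median log 2 and mean 1 (the support starting at zero explains the lower integration bounds):

```r
med <- qexp(0.5)                                  # median = log(2)
integrate(dexp, 0, med)$value                     # area under f left of the median...
integrate(dexp, med, Inf)$value                   # ...equals the area right of it (1/2)
mu <- 1                                           # mean of Exp(1)
integrate(pexp, 0, mu)$value                      # area under F left of mu...
integrate(function(x) 1 - pexp(x), mu, Inf)$value # ...equals area under 1-F right of mu
```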