composition versus inversion

While trying to convey to an OP on X validated why the inversion method was not always the panacea in pseudo-random generation, I took the example of a mixture of K exponential distributions when K is very large, in order to impress (?) upon said OP that solving F(x)=u for such a closed-form cdf F was very costly even when using a state-of-the-art (?) inversion algorithm, like uniroot, since each step involves adding the K terms in the cdf. Selecting the component from the cumulative distribution function on the component proves to be quite fast since using the rather crude

x=rexp(1,lambda[1+sum(runif(1)>wes)])

brings a 100-fold improvement over

Q = function(u) uniroot((function(x) F(x) - u), lower = 0, 
    upper = qexp(.999,rate=min(la)))[1] #numerical tail quantile
x=Q(runif(1))

when K=10⁵, as shown by a benchmark call

         test elapsed
1       compo   0.057
2      Newton  45.736
3     uniroot   5.814

where Newton denotes a simple-minded Newton inversion. I wonder if there is a faster way to select the component in the mixture. Using a while loop starting from the most likely components proves to be much slower. And accept-reject solutions are invariably slow or fail to work with such a large number of components. Devroye’s Bible has a section (XIV.7.5) on simulating sums of variates from an infinite mixture distribution, but, for once,  nothing really helpful. And another section (IV.5) on series methods, where again I could not find a direct connection.

5 Responses to “composition versus inversion”

  1. That makes sense! I imagine that one can contrive something of this form via scale mixtures, e.g. one can write the Laplace distribution as a Gaussian with a randomised variance, but it would usually still be faster to just sample the Laplace RV directly by inversion.

    • In the Laplace cum Gauss case, t would take roughly the same time, presumably. I was rather thinking of an unsuspected series resolution, when one is unaware of the series being a computable and invertible function…

  2. […] article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) […]

  3. I am a bit curious about the question about selecting the component in the mixture – is the question about whether one can do this efficiently via a “pure inversion” approach? As you suggest, if you know the weights of the mixture, then the composition method should be suitable.

    • Yes, it is hard to fathom a counterexample where inverting the cdf is less costly than drawing first the component index… Unless the inverse cdf F⁻¹ of an infinite mixture is available in closed form while the mixture amounts to a series decomposition of the original F, possibly with negative coefficients to make it harder?!

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.