composition versus inversion
While trying to convey to an OP on X validated why the inversion method was not always the panacea in pseudo-random generation, I took the example of a mixture of K exponential distributions when K is very large, in order to impress (?) upon said OP that solving F(x)=u for such a closed-form cdf F was very costly even when using a state-of-the-art (?) inversion algorithm, like uniroot, since each step involves adding the K terms in the cdf. Selecting the component from the cumulative distribution function on the component proves to be quite fast since using the rather crude
x=rexp(1,lambda[1+sum(runif(1)>wes)])
brings a 100-fold improvement over
Q = function(u) uniroot((function(x) F(x) - u), lower = 0, upper = qexp(.999,rate=min(la)))[1] #numerical tail quantile x=Q(runif(1))
when K=10⁵, as shown by a benchmark call
test elapsed 1 compo 0.057 2 Newton 45.736 3 uniroot 5.814
where Newton denotes a simple-minded Newton inversion. I wonder if there is a faster way to select the component in the mixture. Using a while loop starting from the most likely components proves to be much slower. And accept-reject solutions are invariably slow or fail to work with such a large number of components. Devroye’s Bible has a section (XIV.7.5) on simulating sums of variates from an infinite mixture distribution, but, for once, nothing really helpful. And another section (IV.5) on series methods, where again I could not find a direct connection.
April 1, 2021 at 9:57 am
That makes sense! I imagine that one can contrive something of this form via scale mixtures, e.g. one can write the Laplace distribution as a Gaussian with a randomised variance, but it would usually still be faster to just sample the Laplace RV directly by inversion.
April 1, 2021 at 2:36 pm
In the Laplace cum Gauss case, t would take roughly the same time, presumably. I was rather thinking of an unsuspected series resolution, when one is unaware of the series being a computable and invertible function…
March 31, 2021 at 10:38 pm
[…] article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) […]
March 31, 2021 at 11:13 am
I am a bit curious about the question about selecting the component in the mixture – is the question about whether one can do this efficiently via a “pure inversion” approach? As you suggest, if you know the weights of the mixture, then the composition method should be suitable.
March 31, 2021 at 2:35 pm
Yes, it is hard to fathom a counterexample where inverting the cdf is less costly than drawing first the component index… Unless the inverse cdf F⁻¹ of an infinite mixture is available in closed form while the mixture amounts to a series decomposition of the original F, possibly with negative coefficients to make it harder?!