scale acceleration
Kate Lee pointed me to a rather surprising inefficiency in Matlab, exploited in Sylvia Frühwirth-Schnatter’s bayesf package: running a gamma simulation by rgamma(n,a,b) takes longer, and sometimes much longer, than rgamma(n,a,1)/b, the latter taking advantage of the scale nature of b. I wanted to check on my own whether or not R faced the same difficulty, so I ran an experiment [while stuck in a Thalys train at Brussels, between Amsterdam and Paris…] using different values for a [click on the graph] and a range of values of b, to no visible difference between the two implementations, at least when using system.time for checking:
a=seq(.1,4,le=25)
for (t in 1:25)
  a[t]=system.time(rgamma(10^7,.3,a[t]))[3]
a=a/system.time(rgamma(10^7,.3,1))[3]
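The trick itself rests on the scale family nature of the gamma distribution: if X is Ga(a,1), then X/b is Ga(a,b) in R’s rate parameterisation, so dividing standard gamma draws by b targets the right distribution. A quick sanity check along those lines, with arbitrary shape and rate values:

set.seed(101)            # arbitrary seed
n=10^5; a0=.3; b0=2.5    # arbitrary shape and rate values
x=rgamma(n,a0,b0)        # third positional argument of rgamma is the rate
y=rgamma(n,a0,1)/b0      # standard draws rescaled by the rate
ks.test(x,y)             # two-sample Kolmogorov-Smirnov test, large p-value expected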
Once home, I wondered about the relevance of the above comparison, since rgamma(10^7,.3,1) forces R to use 1 as a scale, which may differ from using rgamma(10^7,.3), where 1 is known to be the scale [does this sentence make sense?!]. So I reran an even bigger experiment as
a=seq(.1,4,le=25)
for (t in 1:25)
  a[t]=system.time(rgamma(10^8,.3,a[t]))[3]
a=a/system.time(rgamma(10^8,.3))[3]
and got the graph below, which is much more interesting because it shows that some values of a lead to a loss of efficiency of 50%. Indeed. (The most extreme cases correspond to a=0.3, 1.1, 5.8. No clear pattern emerging.)
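A possible way to produce such a graph, keeping the rate values in a separate vector rather than overwriting them with the timings (only a sketch, with the same settings as above):

b=seq(.1,4,le=25)       # grid of rate values
ratio=rep(NA,25)        # relative elapsed times
for (t in 1:25)
  ratio[t]=system.time(rgamma(10^8,.3,b[t]))[3]   # 10^8 draws per point, slow!
ratio=ratio/system.time(rgamma(10^8,.3))[3]
plot(b,ratio,type="b",xlab="rate",ylab="relative elapsed time")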
Update

As pointed out by Martyn Plummer in his comment, the C function behind the R rgamma Gamma generator does take into account the scale nature of the second parameter, so the above time differences are not due to this function but rather to whatever else my computer was running at the same time…! Apologies to anyone I scared with this unfounded warning!
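A quick way to convince oneself, with arbitrary shape and rate values: if the scale is indeed applied on exit, both calls return the same values under a common seed, and averaging elapsed times over a few replications washes out the apparent differences.

set.seed(1); x=rgamma(5,.3,2.5)    # rate parameterisation
set.seed(1); y=rgamma(5,.3,1)/2.5  # post-hoc scaling of standard draws
all.equal(x,y)                     # TRUE (up to rounding) if the scale is applied on exit

# averaging over replications smooths out whatever else the machine is doing
mean(replicate(20,system.time(rgamma(10^6,.3,2.5))[3]))/
  mean(replicate(20,system.time(rgamma(10^6,.3,1)/2.5)[3]))   # ratio close to one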
April 25, 2015 at 6:02 pm
Sorry for the incorrect information. The Matlab function gamrnd.m also takes into account the scale nature of the second parameter…
http://www.mathworks.com/matlabcentral/fileexchange/9554-a-numerical-tour-of-signal-processing/content/numerical-tour/toolbox_general/gamrnd.m
April 24, 2015 at 3:02 pm
I cannot reproduce your results, and I see no reason why post-hoc scaling of gamma random variables should be any faster in R than supplying the scale parameter directly. The workhorse function here is rgamma in src/nmath/rgamma.c, which generates random variables for scale=1 and then multiplies the result by the scale parameter on exit.
April 24, 2015 at 8:59 pm
Martyn, sorry that you cannot reproduce, but I truly and honestly ran the code posted! I trust your comment that rgamma.c is coded for a scale of one, which means my variations from 1 are purely random and indicative of other activities on my computer, rather than of delays caused by an inefficient accept-reject algorithm…
April 24, 2015 at 12:20 am
As part of my Ph.D. research I coded up a Gibbs sampler in Matlab; that rgamma trick saved me a small but appreciable amount of computation time.