When experimenting with various quantile functions in R, I was shocked [ok, this is a bit excessive, let us say surprised] by how widely the execution times would vary, to the point of first blaming a completely different feature of R. Borrowing from Charlie Geyer’s webpage on the topic of probability distributions in R, I timed the quantile functions of some standard distributions, with the results in the table below. For each of them, I ran the equivalent of
u=runif(1e7); system.time(x<-qcauchy(u))
choosing an arbitrary parameter whenever needed.
Distribution | Function | Time (s) |
---|---|---|
Cauchy | qcauchy | 2.2 |
Chi-Square | qchisq | 43.8 |
Exponential | qexp | 0.95 |
F | qf | 34.2 |
Gamma | qgamma | 37.2 |
Logistic | qlogis | 1.7 |
Log Normal | qlnorm | 2.2 |
Normal | qnorm | 1.4 |
Student t | qt | 31.7 |
Uniform | qunif | 0.86 |
Weibull | qweibull | 2.9 |
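
For readers who want to rerun this kind of comparison, here is a minimal sketch of a loop over the same quantile functions. It is not the exact code behind the table: the parameter values (`df = 3`, `df1 = 3`, `df2 = 5`, `shape = 1.5`, `shape = 2`) are arbitrary choices of mine, the name `quantile_calls` is purely illustrative, and the sample size is reduced from `1e7` so that the whole thing runs in a few seconds.

```r
# Sketch: time each quantile function on the same uniform sample.
# n is much smaller than the 1e7 used in the post (assumption, for speed);
# raise it to get timings on a comparable scale.
n <- 1e5
u <- runif(n)

# Parameter values are arbitrary illustrative choices,
# not necessarily those used for the table.
quantile_calls <- list(
  qcauchy  = function(p) qcauchy(p),
  qchisq   = function(p) qchisq(p, df = 3),
  qexp     = function(p) qexp(p),
  qf       = function(p) qf(p, df1 = 3, df2 = 5),
  qgamma   = function(p) qgamma(p, shape = 1.5),
  qlogis   = function(p) qlogis(p),
  qlnorm   = function(p) qlnorm(p),
  qnorm    = function(p) qnorm(p),
  qt       = function(p) qt(p, df = 3),
  qunif    = function(p) qunif(p),
  qweibull = function(p) qweibull(p, shape = 2)
)

# Elapsed (wall-clock) time in seconds for each call
timings <- sapply(quantile_calls,
                  function(f) system.time(f(u))[["elapsed"]])

# Fastest first
print(sort(timings))
```

The absolute numbers will depend on the machine and on `n`; only the ordering and the rough ratios are of interest.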
Of course, this comparison does not mean much, in that all the slow distributions are the ones requiring a parameter (Weibull being the exception, parameterised but fast). Nonetheless, that a chi-square inversion takes 50 times longer than a uniform inversion remains puzzling: why is it not coded more efficiently? In particular, I was wondering why the chi-square inversion was slower than the Gamma inversion, since a chi-square distribution is a special case of the Gamma distribution (3 degrees of freedom corresponding to shape 1.5, with scale 2). Rerunning both inversions showed that they are equivalent:
    > u=runif(1e7)
    > system.time(x<-qgamma(u,sha=1.5))
       user  system elapsed
     21.534   0.016  21.532
    > system.time(x<-qchisq(u,df=3))
       user  system elapsed
     21.372   0.008  21.361
This also shows how variable system.time can be from one run to the next.
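
To make both points a bit more concrete, here is a small sketch, under my own assumptions, that (a) compares the quantiles returned by `qchisq` with `df = 3` against those of `qgamma` with `shape = 1.5` and `scale = 2`, and (b) repeats each timing a few times to look at the spread. The seed, the reduced sample size, and the number of repetitions `reps` are arbitrary choices, not part of the original experiment.

```r
set.seed(1)          # arbitrary seed, only to make the sketch reproducible
u <- runif(1e5)      # smaller than the 1e7 used above, for speed

# A chi-square with df degrees of freedom is a Gamma(shape = df/2, scale = 2)
x_gamma <- qgamma(u, shape = 1.5, scale = 2)
x_chisq <- qchisq(u, df = 3)
print(all.equal(x_gamma, x_chisq))

# Repeat each timing to see the run-to-run variability of system.time
reps <- 5
t_gamma <- replicate(reps, system.time(qgamma(u, shape = 1.5))[["elapsed"]])
t_chisq <- replicate(reps, system.time(qchisq(u, df = 3))[["elapsed"]])

# Minimum and maximum elapsed time (seconds) over the repetitions
print(rbind(gamma = range(t_gamma), chisq = range(t_chisq)))
```

Reporting a range (or a median) over several repetitions is a more reliable basis for comparison than a single call to `system.time`.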