Archive for exponential distribution

R rexp()

Posted in Books, R, Statistics on May 18, 2021 by xi'an

Following a question on X validated about the reasons for coding rexp() after the Ahrens & Dieter (1972) version, I re-read Luc Devroye’s explanations, which boil down to an optimised implementation of von Neumann’s Exponential generator. The central result is that, for any μ>0, with M a Geometric variate with failure probability exp(-μ) and Z a positive (i.e., zero-truncated) Poisson variate with parameter μ,

\mu(M+\min(U_1,\ldots,U_Z))

is distributed as an Exp(1) random variate. This means that, for every scale μ, the integer part and the fractional part of an Exponential variate are independent, the former being a Geometric variate. A refinement of the above consists in choosing

exp(-μ) = ½

as the generation of M then consists in counting the number of 0’s before the first 1 in the binary expansion of U ~ U(0,1). Actually, the loop used in Ahrens & Dieter (1972) seems to be much less efficient than counting these 0’s directly:

> library(rbenchmark)
> F=0
> benchmark("a"={u=runif(1)
    while(u<.5){
      u=2*u
      F=F+log(2)}},
  "b"={v=as.integer(rev(intToBits(2^31*runif(1))))
     sum(cumprod(!v))},
  "c"={sum(cumprod(sample(c(0,1),32,rep=TRUE)))},
  "g"={rgeom(1,prob=.5)},replications=1e4)
  test elapsed relative user.self
1    a   32.92  557.966    32.885
2    b   0.123    2.085     0.122
3    c   0.113    1.915     0.106
4    g   0.059    1.000     0.058

Obviously, trying to code the change directly in R resulted in much worse performance than the resident rexp(), which is coded in C.
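As a sanity check of the decomposition itself (my own snippet, not part of the original post), one can simulate μ(M+min(U₁,…,U_Z)) directly, with μ = log 2 so that exp(-μ) = ½: M is then Geometric with success probability ½, Z can be drawn as a zero-truncated Poisson(μ) via the quantile trick, and min(U₁,…,U_Z) is a Beta(1,Z) variate.

```r
# check that mu*(M + min(U_1,...,U_Z)) is distributed as Exp(1),
# with M Geometric (failure prob exp(-mu)) and Z positive Poisson(mu)
set.seed(101)
mu <- log(2)                             # so that exp(-mu) = 1/2
N  <- 1e5
M  <- rgeom(N, prob = 1 - exp(-mu))      # failures before first success
Z  <- qpois(runif(N, exp(-mu), 1), mu)   # zero-truncated Poisson(mu)
V  <- rbeta(N, 1, Z)                     # minimum of Z iid Uniform(0,1)
x  <- mu * (M + V)
mean(x)                                  # should be close to 1
ks.test(x, "pexp")$statistic             # small when the Exp(1) fit is good
```

The Beta(1,Z) shortcut avoids generating the Z uniforms one by one; the quantile trick for Z works because qpois applied to a uniform on (exp(-μ),1) never returns zero.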

on completeness

Posted in Books, Kids, Statistics on November 19, 2020 by xi'an

Another X validated question that proved a bit of a challenge, enough that I returned to its resolution on consecutive days. The question was about the completeness of the natural sufficient statistic associated with a sample from the shifted exponential distribution

f(x;\theta) = \frac{1}{\theta^2}\exp\{-\theta^{-2}(x-\theta)\}\mathbb{I}_{x>\theta}

[weirdly called negative exponential in the question] meaning the (minimal) sufficient statistic is made of the first order statistic and of the sample sum (or average), or equivalently

T=(X_{(1)},\sum_{i=2}^n \{X_{(i)}-X_{(1)}\})

Finding the joint distribution of T is rather straightforward as the first component is again a shifted Exponential and the second an independent Gamma variate with shape parameter n-1 and scale θ². (Devroye’s Bible can be invoked since the Gamma distribution follows from his section on Exponential spacings, p.211.) While the derivation of a function with constant expectation is straightforward for the alternate exponential distribution

f(x;\theta) = \frac{1}{\theta}\exp\{-\theta^{-1}(x-\theta)\}\mathbb{I}_{x>\theta}

since the ratio of the components of T has a fixed distribution, it proved harder for the current case as I was seeking a parameter free transform. When attempting to explain the difficulty on my office board, I realised I was seeking the wrong property since an expectation was enough. Removing the dependence on θ was simpler and led to

\mathbb E_\theta\left[\frac{X_{(1)}}{Y}-\frac{\Gamma(n-2)}{\Gamma(n-3/2)}\,Y^{-1/2}\right]=\frac{\Gamma(n-2)}{n\,\Gamma(n-1)}

which is but one version of a transform with fixed expectation, hence establishing the incompleteness of T. This also led me to wonder at the range of possible functions of θ one could use as scale and still retrieve the incompleteness of T. Any power of θ should work, but what about exp(θ²) or sin²(θ³), i.e., functions for which there exists no unbiased estimator…?
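The constant expectation is easy to confirm by Monte Carlo (my own sketch, using the distributional facts above: X₍₁₎ is θ plus a θ²-scaled Exponential with rate n, and Y is an independent θ²-scaled Gamma variate with shape n-1, as the sum of n-1 unit-Exponential spacings):

```r
# check that E[X_(1)/Y - {Gamma(n-2)/Gamma(n-3/2)} Y^(-1/2)]
# equals Gamma(n-2)/(n Gamma(n-1)), whatever the value of theta
set.seed(42)
n <- 10; N <- 1e6
h <- function(theta){
  X1 <- theta + theta^2 * rexp(N, rate = n)  # first order statistic
  Y  <- theta^2 * rgamma(N, shape = n - 1)   # sum of the n-1 spacings
  mean(X1 / Y - gamma(n - 2) / gamma(n - 3/2) / sqrt(Y))
}
c(h(1), h(10), gamma(n - 2) / (n * gamma(n - 1)))  # all three agree
```

Both Monte Carlo averages match the target constant, independently of θ, which is what incompleteness requires.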

an elegant result on exponential spacings

Posted in Statistics on April 19, 2017 by xi'an

A question on X validated I spotted in the train back from Lyon got me desperately seeking a reference in Devroye’s Generation Bible despite the abyssal wireless and a group of screeching urchins a few seats away from me… The question is about why

\sum_{i=1}^{n}(Y_i - Y_{(1)}) \sim \text{Gamma}(n-1, 1)

when the Y’s are standard exponentials. Since this reminded me immediately of exponential spacings, thanks to our Devroye fan-club reading group in Warwick, I tried to download Devroye’s Chapter V and managed after a few aborts (and a significant increase in decibels from the family corner). The result by Sukhatme (1937) is in plain sight as Theorem 2.3 and is quite elegant as it relies on the fact that

\sum_{i=1}^{n}(y_i-y_{(1)})=\sum_{j=2}^n (y_{(j)}-y_{(1)})=\sum_{j=2}^n (n-j+1)(y_{(j)}-y_{(j-1)})

hence the result follows from a mere linear change of variables, since, by the same theorem, the normalised spacings (n-j+1)(y_{(j)}-y_{(j-1)}) are themselves iid standard Exponential variates! (Pandurang Vasudeo Sukhatme (1911–1997) was an Indian statistician who worked on human nutrition and received the Guy Medal of the RSS in 1963.)
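The distributional claim is immediate to verify by simulation (my own snippet): the recentred sum over standard Exponential samples should behave as a Gamma(n-1,1) variate.

```r
# sum_i (Y_i - Y_(1)) over n standard Exponentials is Gamma(n-1, 1)
set.seed(7)
n <- 5; N <- 1e5
Y <- matrix(rexp(n * N), N, n)
S <- rowSums(Y) - n * apply(Y, 1, min)        # sum of (Y_i - Y_(1))
c(mean(S), var(S))                            # both close to n-1 = 4
ks.test(S, "pgamma", shape = n - 1)$statistic # small for a good fit
```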

Гнеде́нко and Forsythe [and e]

Posted in Books, Kids, R, Statistics, University life on February 16, 2016 by xi'an

In the wake of my earlier post on the Monte Carlo estimation of e and e⁻¹, and after a discussion with my colleague Murray Pollock (Warwick) about Gnedenko’s solution, he pointed out another (early) Monte Carlo approximation called Forsythe’s method, detailed quite neatly in Luc Devroye’s bible, Non-uniform random variate generation (a free bible!). The idea is to run a sequence of uniform generations until the sequence goes up, i.e., until the latest uniform is larger than the previous one. The expectation of the corresponding stopping rule N, the number of generations until the uniform sequence stops decreasing, is then e, while the probability that N is odd is e⁻¹, most unexpectedly! Forsythe’s method actually aims at a much broader goal, namely simulating from any density of the form g(x) exp{-F(x)}, with F taking values in (0,1). This method generalises von Neumann’s exponential generator (see Devroye, p.126), which only requires uniform generations.

This is absolutely identical to Gnedenko’s approach in that both events have the same 1/n! probability of occurring [as pointed out by Gérard Letac in a comment on the previous entry]. (I certainly cannot say whether or not one of the authors was aware of the other’s result: Forsythe generalised von Neumann‘s method around 1972, while Gnedenko published Theory of Probability by 1969 at the latest, but this may be the date of the English translation, as I have not been able to find the reference on the Russian wikipedia page…) Running a small R experiment to compare both distributions of N, the resulting barplot shows that they look quite similar:

n=1e6
use=runif(n)
# Gnedenko's in action:
gest=NULL
i=1
while (i<(n-100)){
  sumuse=cumsum(use[i:(i+10)])
  if (sumuse[11]<1)
    sumuse=cumsum(use[i:(i+100)])
  j=min((1:length(sumuse))[sumuse>1])
  gest=c(gest,j)
  i=i+j}
# Forsythe's method:
fest=NULL
i=1
while (i<(n-100)){
  sumuse=c(-1,diff(use[i:(i+10)]))
  if (max(sumuse)<0)
    sumuse=c(-1,diff(use[i:(i+100)]))
  j=min((1:length(sumuse))[sumuse>0])
  fest=c(fest,j)
  i=i+j}

And the execution times of both approaches [with this rudimentary R code!] are quite close.
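The two constants can also be checked more directly (my own compact version, not the code above): with N the number of uniforms drawn until one exceeds its predecessor, P(N>n) = P(U₁>…>U_n) = 1/n!, so E[N] = Σ 1/n! = e, and summing P(N=n) = 1/(n-1)! - 1/n! over odd n gives e⁻¹.

```r
# N = number of uniforms drawn until one exceeds its predecessor:
# E[N] = e and P(N odd) = 1/e
set.seed(1)
forsythe <- function(){
  prev <- runif(1); k <- 1
  repeat{
    u <- runif(1); k <- k + 1
    if (u > prev) return(k)  # first ascent: stop
    prev <- u
  }
}
N <- replicate(1e5, forsythe())
c(mean(N), exp(1))              # both close to 2.718...
c(mean(N %% 2 == 1), exp(-1))   # both close to 0.368...
```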

an a-statistical proof of a binomial identity

Posted in Books, Statistics, University life on May 15, 2014 by xi'an

While waiting for Andrew this morning, I was browsing through the arXived papers of the day and came across this “Simple Statistical Proof of a Binomial Identity” by P. Vellaisamy. The said identity is that, for all s>0,

\sum_{k=0}^n (-1)^k {n \choose k}\dfrac{s}{s+k} = \prod_{k=1}^n \dfrac{k}{k+s}

Nothing wrong with the maths in this paper (except for a minor typo using Exp(1) instead of Exp(s), p.2). But I am perplexed by the label “statistical” used by the author, as this proof is an entirely analytic argument, based on two different integrations of the same integral. Nothing connected with data or any statistical technique: this is sheer combinatorics, of the kind one could find in William Feller‘s volume I.
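Whatever the label, the identity itself is straightforward to confirm numerically (my own snippet, for arbitrary choices of n and s):

```r
# check sum_{k=0}^n (-1)^k C(n,k) s/(s+k) = prod_{k=1}^n k/(k+s)
check <- function(n, s){
  lhs <- sum((-1)^(0:n) * choose(n, 0:n) * s / (s + 0:n))
  rhs <- prod((1:n) / ((1:n) + s))
  c(lhs = lhs, rhs = rhs)
}
check(10, 2.5)   # the two values agree to numerical precision
```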