## an elegant result on exponential spacings

Posted in Statistics with tags , , , , , , , , , , , , , on April 19, 2017 by xi'an

A question on X validated I spotted in the train back from Lyon got me desperately seeking a reference in Devroye’s Generation Bible despite the abyssal wireless and a group of screeching urchins a few seats away from me… The question is about why

$\sum_{i=1}^{n}(Y_i - Y_{(1)}) \sim \text{Gamma}(n-1, 1)$

when the Y’s are standard exponentials. Since this reminded me immediately of exponential spacings, thanks to our Devroye fan-club reading group in Warwick,  I tried to download Devroye’s Chapter V and managed after a few aborts (and a significant increase in decibels from the family corner). The result by Sukhatme (1937) is in plain sight as Theorem 2.3 and is quite elegant as it relies on the fact that

$\sum_{i=1}^n y_i=\sum_{j=1}^n (n-j+1)(y_{(j)}-y_{(j-1)})=\sum_{j=2}^n (y_{(j)}-y_{(1)})$

hence sums up as a mere linear change of variables! (Pandurang Vasudeo Sukhatme (1911–1997) was an Indian statistician who worked on human nutrition and got the Guy Medal of the RSS in 1963.)

## complexity of the von Neumann algorithm

Posted in Statistics with tags , , , , , , , , , on April 3, 2017 by xi'an

“Without the possibility of computing infimum and supremum of the density f over compact subintervals of the domain of f, sampling absolutely continuous distribution using the rejection method seems to be impossible in total generality.”

The von Neumann algorithm is another name for the rejection method introduced by von Neumann circa 1951. It was thus most exciting to spot a paper by Luc Devroye and Claude Gravel appearing in the latest Statistics and Computing. Assessing the method in terms of random bits and precision. Specifically, assuming that the only available random generator is one of random bits, which necessarily leads to an approximation when the target is a continuous density. The authors first propose a bisection algorithm for distributions defined on a compact interval, which compares random bits with recursive bisections of the unit interval and stops when the interval is small enough.

In higher dimension, for densities f over the unit hypercube, they recall that the original algorithm consisted in simulating uniforms x and u over the hypercube and [0,1], using the uniform as the proposal distribution and comparing the density at x, f(x), with the rescaled uniform. When using only random bits, the proposed method is based on a quadtree that subdivides the unit hypercube into smaller and smaller hypercubes until the selected hypercube is entirely above or below the density. And is small enough for the desired precision. This obviously requires for the computation of the upper and lower bound of the density over the hypercubes to be feasible, with Devroye and Gravel considering that this is a necessary property as shown by the above quote. Densities with non-compact support can be re-expressed as densities on the unit hypercube thanks to the cdf transform. (Actually, this is equivalent to the general accept-reject algorithm, based on the associated proposal.)

“With the oracles introduced in our modification of von Neumann’s method, we believe that it is impossible to design a rejection algorithm for densities that are not Riemann-integrable, so the question of the design of a universally valid rejection algorithm under the random bit model remains open.”

In conclusion, I enjoyed very much reading this paper, especially the reflection it proposes on the connection between Riemann integrability and rejection algorithms. (Actually, I cannot think straight away of a simulation algorithm that would handle non-Riemann-integrable densities, apart from nested sampling. Or of significant non-Riemann-integrable densities.)

## pitfalls of nested Monte Carlo

Posted in Books, pictures, Statistics, University life with tags , , , , , on December 19, 2016 by xi'an

A few days ago, Tom Rainforth, Robert Cornish, Hongseok Yang, and Frank Wood from Oxford have arXived a paper on the limitations of nested Monte Carlo. By nested Monte Carlo [not nested sampling], they mean Monte Carlo techniques used to evaluate the expectation of a non-linear transform of an expectation, which often call for plug-in resolution. The main result is that this expectation cannot be evaluated by an unbiased estimator. Which is only mildly surprising. I do wonder if there still exist series solutions à la Glynn and Rhee, as in the Russian roulette version. Which is mentioned in a footnote. Or specially tuned versions, as suggested by some techniques found in Devroye’s book where the expectation of the exponential of another expectation is considered… (The paper is quite short, which may be correlated with the format imposed by some machine-learning conference proceedings like AISTATS.)

## ratio-of-uniforms [-1]

Posted in Books, pictures, R, Statistics, University life with tags , , , on December 12, 2016 by xi'an

Luca Martino pointed out to me my own and forgotten review of a 2012 paper of his, “On the Generalized Ratio of Uniforms as a Combination of Transformed Rejection and Extended Inverse of Density Sampling” that obviously discusses a generalised version of Kinderman and Monahan’s (1977) ratio-of-uniform method. And further points out the earlier 1991 paper by Jon Wakefield, Alan Gelfand and Adrian Smith that contains the general form I rediscovered a few posts ago… Called the GRoU in Martino et al.. While the generalisation in the massive arXiv document is in finding Φ such that the above region is bounded and can be explored by uniform sampling over a box.

Neither reference mentions using the cdf transform, though, which does guarantee a bounded ratio-of-uniform set in u. (An apparent contradiction with Martino et al.  statement (34), unless I am confused. Maybe due to using Φ⁻¹ instead of Φ?) But I still wonder at the usefulness of my derivations those past weeks!

## ratio-of-uniforms [#4]

Posted in Books, pictures, R, Statistics, University life with tags , , , , on December 2, 2016 by xi'an

Possibly the last post on random number generation by Kinderman and Monahan’s (1977) ratio-of-uniform method. After fiddling with the Gamma(a,1) distribution when a<1 for a while, I indeed figured out a way to produce a bounded set with this method: considering an arbitrary cdf Φ with corresponding pdf φ, the uniform distribution on the set Λ of (u,v)’s in R⁺xX such that

0≤u≤Φοƒ[φοΦ⁻¹(u)v]

induces the distribution with density proportional to ƒ on φοΦ⁻¹(U)V. This set Λ has a boundary that is parameterised as

u=Φοƒ(x),  v=1/φοƒ(x), x∈Χ

which remains bounded in u since Φ is a cdf and in v if φ has fat enough tails. At both 0 and ∞. When ƒ is the Gamma(a,1) density this can be achieved if φ behaves like log(x)² near zero and like a inverse power at infinity. Without getting into all the gory details, closed form density φ and cdf Φ can be constructed for all a’s, as shown for a=½ by the boundaries in u and v (yellow) below

which leads to a bounded associated set Λ

At this stage, I remain uncertain of the relevance of such derivations, if only because the set A thus derived is ill-suited for uniform draws proposed on the enclosing square box. And also because a Gamma(a,1) simulation can rather simply be derived from a Gamma(a+1,1) simulation. But, who knows?!, there may be alternative usages of this representation, such as innovative slice samplers. Which means the ratio-of-uniform method may reappear on the ‘Og one of those days…

## simulation by hand

Posted in Books, Kids, pictures, Statistics, Travel with tags , , , , , , , on November 28, 2016 by xi'an

A rather weird question on X validated this week was about devising a manual way to simulate (a few) normal variates. By manual I presume the author of the question means without resorting to a computer or any other business machine. Now, I do not know of any real phenomenon that is exactly and provably Normal. As analysed in a great philosophy of science paper by Aidan Lyon, the standard explanations for a real phenomenon to be Normal are almost invariably false, even those invoking the Central Limit Theorem. Hence I cannot think of a mechanical device that would directly return Normal generations from a Normal distribution with known parameters. However, since it is possible to simulate by hand Uniform U(0,1) variates [up to a given precision] using a chronometre or a wheel, calls to versions of the Box-Müller algorithm that do not rely on logarithmic or trigonometric functions are feasible, for instance by generating two Exponential variates, x and y, until 2y>(1-x)², x being the output. And generating Exponential variates is easy provided a radioactive material with known half-life is available, along with a Geiger counter. Or, if not, by calling von Neumann’s exponential generator. As detailed in Devroye’s simulation book.

After proposing this solution, I received a comment from the author of the question towards a simpler solution based, e.g., on the Central Limit Theorem. Presumably for simple iid random variables such as coin tosses or dice experiments. While I used the CLT for simulating Normal variables in my very early days [just after programming on punched cards!], I do not think this is a very good or efficient method, as the tails grow very slowly to normality. By comparison, using the same amount of coin tosses to create a sufficient number of binary digits of a Uniform variate produces a computer-precision exact Uniform variate, which can be exploited in Box-Müller-like algorithms to return exact Normal variates… Even by hand if necessary. [For some reason, this question attracted a lot of traffic and an encyclopaedic answer on X validated, despite being borderline to the point of being proposed for closure.]

## variance of an exponential order statistics

Posted in Books, Kids, pictures, R, Statistics, University life with tags , , , , , , , , , , on November 10, 2016 by xi'an

This afternoon, one of my Monte Carlo students at ENSAE came to me with an exercise from Monte Carlo Statistical Methods that I did not remember having written. And I thus “charged” George Casella with authorship for that exercise!

Exercise 3.3 starts with the usual question (a) about the (Binomial) precision of a tail probability estimator, which is easy to answer by iterating simulation batches. Expressed via the empirical cdf, it is concerned with the vertical variability of this empirical cdf. The second part (b) is more unusual in that the first part is again an evaluation of a tail probability, but then it switches to find the .995 quantile by simulation and produce a precise enough [to three digits] estimate. Which amounts to assess the horizontal variability of this empirical cdf.

As we discussed about this question, my first suggestion was to aim at a value of N, number of Monte Carlo simulations, such that the .995 x N-th spacing had a length of less than one thousandth of the .995 x N-th order statistic. In the case of the Exponential distribution suggested in the exercise, generating order statistics is straightforward, since, as suggested by Devroye, see Section V.3.3, the i-th spacing is an Exponential variate with rate (N-i+1). This is so fast that Devroye suggests simulating Uniform order statistics by inverting Exponential order statistics (p.220)!

However, while still discussing the problem with my student, I came to a better expression of the question, which was to figure out the variance of the .995 x N-th order statistic in the Exponential case. Working with the density of this order statistic however led nowhere useful. A bit later, after Google-ing the problem, I came upon this Stack Exchange solution that made use of the spacing result mentioned above, namely that the expectation and variance of the k-th order statistic are

$\mathbb{E}[X_{(k)}]=\sum\limits_{i=N-k+1}^N\frac1i,\qquad \mbox{Var}(X_{(k)})=\sum\limits_{i=N-k+1}^N\frac1{i^2}$

which leads to the proper condition on N when imposing the variability constraint.