## laser sharp random number generator

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , on April 1, 2021 by xi'an

Caught the headline of Science News on a super-fast random number generator based on a dysfunctional laser! Producing “254 trillion random digits per second”.

“…when the laser is shined on a surface, its light contains a constantly changing pattern of tiny pinpricks that brighten and dim randomly. The brightness at each spot in the pattern over time can be translated by a computer into a random series of ones and zeros.”

I presume this is covered in the original Science paper [which I cannot access] but the parallel series of 0’s and 1’s should be checked to produce independent Bernoulli B(½) variates before being turned into a genuine random number generator.

## inverse Gaussian trick [or treat?]

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , , , , , , , , , on October 29, 2020 by xi'an

When preparing my mid-term exam for my undergrad mathematical statistics course, I wanted to use the inverse Gaussian distribution IG(μ,λ) as an example of exponential family and include a random generator question. As shown above by a Fortran computer code from Michael, Schucany and Haas, a simple version can be based on simulating a χ²(1) variate and solving in x the following second degree polynomial equation

$\dfrac{\lambda(x-\mu)^2}{\mu^2 x} = v$

since the left-hand side transform is distributed as a χ²(1) random variable. The smallest root x¹, less than μ, is then chosen with probability μ/(μ+x¹) and the largest one, x²=μ²/x¹ with probability x¹/(μ+x¹). A relatively easy question then, except when one considers asking for the proof of the χ²(1) result, which proved itself to be a harder cookie than expected! The paper usually referred to for the result, Schuster (1968), is quite cryptic on the matter, essentially stating that the above can be expressed as the (bijective) transform of Y=min(X,μ²/X) and that V~χ²(1) follows immediately. I eventually worked out a proof by the “law of the unconscious statistician” [a name I do not find particularly amusing!], but did not include the question in the exam. But I found it fairly interesting that the inverse Gaussian can be generating by “inverting” the above equation, i.e. going from a (squared) Gaussian variate V to the inverse Gaussian variate X. (Even though the name stems from the two cumulant generating functions being inverses of one another.)

## random generators produce ties

Posted in Books, R, Statistics with tags , , , , , , , on April 21, 2020 by xi'an

“…an essential part of understanding how many ties these RNGs produce is to understand how many ties one expects in 32-bit integer arithmetic.”

A sort of a birthday-problem paper for random generators by Markus Hofert on arXiv as to why they produce ties. As shown for instance in the R code (inspired by the paper):

sum(duplicated(runif(1e6)))


returning values around 100, which is indeed unexpected until one thinks a wee bit about it… With no change if moving to an alternative to the Mersenne twister generator. Indeed, assuming the R random generators produce integers with 2³² values, the expected number of ties is actually 116 for 10⁶ simulations. Moving to 2⁶⁴, the probability of a tie is negligible, around 10⁻⁸. A side remark of further inerest in the paper is that, due to a different effective gap between 0 and the smallest positive normal number, of order 10⁻²⁵⁴ and between 1 and the smallest normal number greater than 1, of order 10⁻¹⁶, “the grid of representable double numbers is not equidistant”. Justifying the need for special functions such as expm1 and log1p, corresponding to more accurate derivations of exp(x)-1 and log(1+x).

## biased sample!

Posted in Statistics with tags , , , , , , , , , , , on May 21, 2019 by xi'an

A chance occurrence led me to this thread on R-devel about R sample function generating a bias by taking the integer part of the continuous uniform generator… And then to the note by Kellie Ottoboni and Philip Stark analysing the reason, namely the fact that R uniform [0,1) pseudo-random generator is not perfectly continuously uniform but discrete, by the nature of numbers on a computer. Knuth (1997) showed that in this case the range of probabilities is larger than (1,1), the largest range being (1,1.03). As noted in the note, exploiting directly the pseudo-random bits of the pseudo-random generator. Shocking, isn’t it!  A fast and bias-free alternative suggested by Lemire is available as dqsample::sample

As an update of June 2019, sample is now fixed.

## independent random sampling methods [book review]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on May 16, 2018 by xi'an

Last week, I had the pleasant surprise to receive a copy of this book in the mail. Book that I was not aware had been written or published (meaning that I was not involved in its review!). The three authors, Luca Martino, David Luengo, and Joaquín Míguez, of Independent Random Sampling Methods are from Madrid universities and I have read (and posted on) several of their papers on (population) Monte Carlo simulation in the recent years. Including Luca’s survey of multiple try MCMC which was helpful in writing our WIREs own survey.

The book is a pedagogical coverage of most algorithms used to simulate independent samples from a given distribution, which of course recoups some of the techniques exposed with more details by [another] Luc, namely Luc Devroye’s Non-uniform random variate generation bible, often mentioned here (and studied in uttermost details by a dedicated reading group in Warwick). It includes a whole chapter on accept-reject methods, with in particular a section on Payne-Dagpunar’s band rejection I had not seen previously. And another entire chapter on ratio-of-uniforms techniques. On which the three authors had proposed generalisations [covered by the book], years before I attempted to go the same way, having completely forgotten reading their paper at the time… Or the much earlier 1991 paper by Jon Wakefield, Alan Gelfand and Adrian Smith!

The book also covers the “vertical density representation”, due to Troutt (1991), which consists in considering the distribution of the density p(.) of the random variable X as a random variable, p(X). I remember pondering about this alternative to the cdf transform and giving up on it as the outcome has a distribution depending on p, even when the density is monotonous. Even though I am not certain from reading the section that this is particularly appealing…

Given its title, the book contains very little about MCMC. Except for a last and final chapter that covers adaptive independent Metropolis-Hastings algorithms, in connection with some of the authors’ recent work. Like multiple try Metropolis. Relating to the (unidimensional) ARMS “ancestor” of adaptive MCMC methods. (As noted in a recent blog on Holden et al., 2009 , I have trouble understanding how recycling only rejected proposed values to build a better proposal distribution is enough to guarantee convergence of an adaptive algorithm, but the book does not delve much into this convergence.)

All in all and with the bias induced by me working in the very area, I find the book quite a nice entry on the topic, which can be used in a Monte Carlo course at both undergraduate and graduate levels if one want to avoid going into Markov chains. It is certainly less likely to scare students away than the comprehensive Non-uniform random variate generation and on the opposite may induce some of them to pursue a research career in this domain.