## random generators produce ties

“…an essential part of understanding how many ties these RNGs produce is to understand how many ties one expects in 32-bit integer arithmetic.”

**A** sort of a birthday-problem paper for random generators by Markus Hofert on arXiv as to why they produce ties. As shown for instance in the R code (inspired by the paper):

sum(duplicated(runif(1e6)))

returning values around 100, which is indeed unexpected until one thinks a wee bit about it… With no change if moving to an alternative to the Mersenne twister generator. Indeed, assuming the R random generators produce integers with 2³² values, the expected number of ties is actually 116 for 10⁶ simulations. Moving to 2⁶⁴, the probability of a tie is negligible, around 10⁻⁸. A side remark of further inerest in the paper is that, due to a different effective gap between 0 and the smallest positive normal number, of order 10⁻²⁵⁴ and between 1 and the smallest normal number greater than 1, of order 10⁻¹⁶, “the grid of representable double numbers is not equidistant”. Justifying the need for special functions such as expm1 and log1p, corresponding to more accurate derivations of exp(x)-1 and log(1+x).

April 22, 2020 at 8:15 am

[…] article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) […]

April 21, 2020 at 12:59 pm

In the birthday problem, there is 50% chance of a “match” in samples of size 23. For random samples of 32-bit uniform variates, there is a 50% chance of a duplicate for samples of size 77,000. For more details and graphs, see https://blogs.sas.com/content/iml/2013/07/03/duplicates-in-random-numbers.html

April 21, 2020 at 10:39 am

[…] by data_admin [This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page […]