## hard birthday problem

From an X validated question, found that WordPress now allows for direct link to pdf documents, like the above paper by my old friend Anirban Das Gupta! The question is about estimating a number M of individuals with N distinct birth dates over a year of T days. After looking around I could not find a simpler representation of the probability for N=r other than (1) in my answer,

borrowed from a paper by Fisher et al. (Another Fisher!) Checking Feller leads to the probability (p.102)

which fits rather nicely simulation frequencies, as shown using

apply(!apply(matrix(sample(1:Nb,T*M,rep=TRUE),T,M),1,duplicated),2,sum)

Further, Feller (1970, pp.103-104) justifies an asymptotic Poisson approximation with parameter$

from which an estimate of $M$ can be derived. With the birthday problem as illustration (pp.105-106)!

It may be that a completion from N to (R¹,R²,…) where the components are the number of days with one birthdate, two birthdates, &tc. could help design an EM algorithm that would remove the summation in (1) but I did not spend more time on the problem (than finding a SAS approximation to the probability!).

February 4, 2021 at 8:11 am

[…] article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) […]

February 4, 2021 at 1:05 am

[…] by data_admin [This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page […]