Archive for Wordpress

hard birthday problem

Posted in Books, Kids, R, Statistics with tags , , , , , , , , , on February 4, 2021 by xi'an

Click to access birthday.pdf

From an X validated question, found that WordPress now allows for direct link to pdf documents, like the above paper by my old friend Anirban Das Gupta! The question is about estimating a number M of individuals with N distinct birth dates over a year of T days. After looking around I could not find a simpler representation of the probability for N=r other than (1) in my answer,

\frac{T!}{(\bar N-r)!}\frac{m!}{T^m}  \sum_{(r_1,\ldots,r_m);\\\sum_1^m r_i=r\ \&\\\sum_1^m ir_i=m}1\Big/\prod_{j=1}^m r_j! (j!)^{r_j}

borrowed from a paper by Fisher et al. (Another Fisher!) Checking Feller leads to the probability (p.102)

{T \choose r}\sum_{\nu=0}^r (-1)^{\nu}{r\choose\nu}\left(1-\frac{T-r+\nu}T \right)^m

which fits rather nicely simulation frequencies, as shown using

apply(!apply(matrix(sample(1:Nb,T*M,rep=TRUE),T,M),1,duplicated),2,sum)

Further, Feller (1970, pp.103-104) justifies an asymptotic Poisson approximation with parameter$

\lambda(M)=\bar{N}\exp\{-M/\bar N\}

from which an estimate of $M$ can be derived. With the birthday problem as illustration (pp.105-106)!

It may be that a completion from N to (R¹,R²,…) where the components are the number of days with one birthdate, two birthdates, &tc. could help design an EM algorithm that would remove the summation in (1) but I did not spend more time on the problem (than finding a SAS approximation to the probability!).

surg’Og interest from Serbia

Posted in Statistics with tags , , , , on January 15, 2018 by xi'an

2014 in review

Posted in Statistics with tags , , , , on January 2, 2015 by xi'an

The WordPress.com stats helper monkeys prepared a 2014 annual report for the ‘Og…

.. and among the collected statistics for 2014, what I found most amazing are the three accesses from Greenland and the one access from Afghanistan!

Click here to see the complete report. (Assuming you have nothing better to do on Boxing day…)

unicode in LaTeX

Posted in Books, Linux, Statistics, University life with tags , , , , , , on October 9, 2014 by xi'an

As I was hurriedly trying to cram several ‘Og posts into a conference paper (!), I looked around for a way of including Unicode characters straight away. And found this solution on StackExchange:

\usepackage[mathletters]{ucs}
\usepackage[utf8x]{inputenc}

which just suited me fine!

RSS statistical analytics challenge 2014

Posted in Kids, R, Statistics, University life, Wines with tags , , , , on May 2, 2014 by xi'an

RSS_Challenge_2014Great news! The RSS is setting a data analysis challenge this year, sponsored by the Young Statisticians Section and Research Section of the Royal Statistical Society: Details are available on the wordpress website of the Challenge. Registration is open and the Challenge goes live on Tuesday 6 May 2014 for an exciting 6 weeks competition. (A wee bit of an unfortunate timing for those of us considering submitting a paper to NIPS!) Truly terrific, I have been looking for this kind of event to happen for many years (without finding the momentum to set it rolling…)  and hope it will generate a lot of exciting activity and replicas in other societies.