## birthday paradox, one by one

Posted in Books, Kids, pictures, Statistics with tags , , , , , on November 1, 2022 by xi'an

An easy riddle, riding the birthday paradox. Namely how many people on average have to sequentially enter a room for a double birthday to occur?

$\sum_{n=2}^{366}n\prod_{m=1}^{n-2}\left(1-\frac{m}{365}\right)\frac{n-1}{365}$

as only the last person in can share a birthday with those already in. This sum equals 24.62. Since the “magic number” is 23, this may sound wrong but the median attached to the above distribution is truly 23. My R code is

q=rep(1,a<-366)/365
for(n in 3:a)q[n]=q[n-1]*(1+1/(n-2)-(n-1)/365)
sum(q[-1]*(2:a))


## Kempner Fi

Posted in Books, Kids, R, Statistics with tags , , , , , , , on January 19, 2021 by xi'an

A short code-golf challenge led me to learn about the Kempner series, which is the series made of the inverted integers, excluding all those containing the digit 9. Most surprisingly this exclusion is enough to see the series converging (close to 23). The explanation for this convergence is that, citing Wikipedia,

“The number of n-digit positive integers that have no digit equal to ‘9’ is 8 × 9n−1

and since the inverses of these n-digit positive integers are less than 101−n the series is bounded by 80. In simpler terms, it converges because the fraction of remaining terms in the series is geometrically decreasing as (9/10)1−n. Unsurprisingly (?) the series is also atrociously slow to converge (for instance the first million terms sum up to 11) and there exist recurrence representations that speed up its computation.  Here is the code-golf version

n=scan()+2while(n<-n-1){  F=F+1/T  while(grepl(9,T<-T+1))0}F


that led me to learn about the R function grepl. (The explanation for the pun in the title is that Semper Fidelis is the motto of the corsair City of Saint-Malo or Sant-Maloù, Brittany.)

## puzzles & riddles

Posted in Books, Kids, R, Statistics with tags , , , , , , , , on January 3, 2021 by xi'an

A rather simplistic game on the Riddler of 18 December:

…two players, each of whom starts with a whole number of points. Players take turns “attacking” each other, which involves subtracting their own number of points from their opponent’s until one of the players is out of points.

Easy to code in R:

g=function(N,M)ifelse(N<M,-g(M-N,N),1)

f=function(N,M=N){
while(g(N,M)>0)M=M+1
return(M)}


which converges to the separating ratio 1.618. If decomposing the actions until one player wins, one gets a sequence of upper and lower bounds associated with the Fibonacci sequence: 1⁻, 2⁺, 3/2⁻, 5/3⁺, 8/5⁻, &tc, converging to the “golden ratio” φ.

As an aside, I also solved a relatively quick codegolf challenge, where the question was to find the sum of all possible binary values from a bitmask. Meaning that for a binary input, e.g., 101X0XX0…01X, with some entries masked by X’s, one had to find the sum of all binary numbers compatible with the input. Which can be solved succinctly by counting the number of X’s, k, and adding the visible bits $2^k$ times and replacing the invisible ones by  $2^{k-1}$. With some help, and using 2 instead of X, my R code moved from 158 bytes to 50:

function(x)2^sum(a<-x>1)*rev(x/4^a)%*%2^(seq(x)-1)


## a new Monty Hall riddle

Posted in Books, Kids, Mountains, pictures, R, Statistics, Travel with tags , , , , , , , , , on May 22, 2020 by xi'an

The Riddler was sort of feeling the rising boredom of being under lockdown when proposing the following variant to the Monty Hall puzzle:

There are zero to three goats, with a probability ¼ each, and they are allocated to different doors uniformly among the three doors of the show. After the player chooses a door, Monty opens another door hidding a goat or signals this is impossible. Given that he did open a door, what is the probability that the player’s door does not hide a goat?

Indeed, a straightforward conditional probability computation considering all eight possible cases with the four cases corresponding to Monty opening a door leads to a probability of 3/8 for the player’s door. As confirmed by the following R code:

s=sample
m=c(0,0)
for(t in 1:1e6)m=m+(range(s(1:3,s(1:3,1)))>1)


## Metropolis in 95 characters

Posted in pictures, R, Statistics, Travel with tags , , , , , , , , on January 2, 2020 by xi'an

Here is an R function that produces a Metropolis-Hastings sample for the univariate log-target f when the later is defined outside as another function. And when using a Gaussian random walk with scale one as proposal. (Inspired from a X validated question.)

m<-function(T,y=rnorm(1))ifelse(rep(T>1,T),
c(y*{f({z<-m(T-1)}[1])-f(y+z[1])<rexp(1)}+z[1],z),y)


The function is definitely not optimal, crashes for values of T larger than 580 (unless one modifies the stack size), and operates the most basic version of a Metropolis-Hastings algorithm. But as a codegolf challenge (on a late plane ride), this was a fun exercise.