## fast track & slow lane

Posted in Books, Kids, R, Statistics with tags , , , , on September 3, 2022 by xi'an

A riddle from the Riddler where N cars going at random (iid) speeds drive a road with a slow and a fast lane, each car choosing the fast lane iff any of the cars ahead in the slow lane is slowerthan them. With the question of the average number of car convoys.

If there were one single lane, the problem would be to determine how many times a smaller realisation and it has been solved in a much older Riddler. Namely

$\sum_{i=1}^N 1\big/ i$

In the two-lane system, the slow one only gathers cars with speeds lowest than the current speed, i.e. a decreasing sequence of speeds, creating single car convoys.  The fast lane thus captures cars that are above the current minimum in the sequence, which, as it converges to the global minimum at some points, means that all following cars are found in the fast lane. i thought this would bring the number of convoys close to twice the above logarithmic sum (which sealed my destiny during an entrance oral exam 40 years ago!), but there are actually more of them for N large enough , which may be due to the possibility of the slow lane to capture more moderate speed cars in the beginning… The above compares the number of convoys for one lane (in red) and two (in gold), as well as the remarkable fit when regressing these numbers against log(N).

Here is the R code I used for that simulation

convoy=function(N,T=1e5){
for(t in 1:T){
speed=runif(N)
slow=fast=NULL
slow=speed[1]
for(i in 2:N){
if(speed[i]<min(slow))slow=c(slow,speed[i])
else fast=c(fast,speed[i])}
F=F+length(slow)
if(length(fast)>0)F=F+1+sum(fast[-1]<cummin(fast)[-length(fast)])}
return(F/T)}


## optimal Gaussian zorbing

Posted in Books, Kids, R, Statistics with tags , , , , , , on August 30, 2022 by xi'an

A zorbing puzzle from the Riddler: cover the plane with four non-intersecting disks of radius one towards getting the highest probability (under the standard bivariate Normal distribution).

As I could not see a simple connection between the disks and the standard Normal, beyond the probability of a disk being given by a non-central chi-square cdf (with two degrees of freedom), I (once again) tried a random search by simulated annealing, which ended up with a configuration like the above, never above 0.777 using a pedestrian R code like

for(t in 1:1e6){# move the disk centres
Ap=A+vemp*rnorm(2)
Bp=B+vemp*rnorm(2)
while(dist(rbind(Ap,Bp))<2)Bp=B+vemp*rnorm(2)
Cp=C+vemp*rnorm(2)
while(min(dist(rbind(Ap,Bp,Cp)))<2)Cp=C+vemp*rnorm(2)
Dp=D+vemp*rnorm(2)
while(min(dist(rbind(Ap,Bp,Cp,Dp)))<2)Dp=D+vemp*rnorm(2)
#coverage probability
pp=pchisq(1,df=2,ncp=Ap%*%Ap)+pchisq(1,df=2,ncp=Bp%*%Bp)+
pchisq(1,df=2,ncp=Cp%*%Cp)+pchisq(1,df=2,ncp=Dp%*%Dp)
#simulated annealing step
if(log(runif(1))<(pp-p)/sqrt(temp)){
A=Bp;B=Cp;C=Dp;D=Ap;p=pp
if (sol\$val<p) sol=list(val=pp,pos=rbind(A,B,C,D))}
temp=temp*.9999}


I also tried a simpler configuration where all disk centres were equidistant from a reference centre, but this led to a lower “optimal” probability. I was looking forward the discussion of the puzzle, to discover if anything less brute-force was possible! But there was no deeper argument there beyond the elimination of other “natural” configurations (and missing the non-central χ² connection!). Among these options, having two disks tangent at (0,0) were optimal. But the illustration was much nicer:

## the Kelly criterion and then some

Posted in R, Statistics with tags , , , , , on August 26, 2022 by xi'an

The Kelly criterion is a way to optimise an unlimited sequence of bets under the following circumstances: a probability p of winning each bet, a loss of a fraction a of the sum bet, a gain of a fraction b of the sum bet, and a fraction f of the current fortune as the sum bet. Then

$f^*=\dfrac{p}{a}-\dfrac{1-p}{b}$

$\mathbb E[ \log\{X_n/X_0\}^{1/n}]$

Here is a rendering of the empirical probability of reaching 250 before ruin, when starting with a fortune of 100, when a=1, p=0.3 and f and b vary (on a small grid). With on top Kelly’s solutions, meaning that they achieve a high probability of avoiding ruin. Until they cannot.

The Ridder is asking for a variant of this betting scheme, when the probability p to win the bet is proportional to 1/(1+b), namely .9/(1+b). In that case, the probabilities of reaching 250 (using the same R code as above) before ruin are approximated as followswith a maximal probability that does not exceed 0.36, the probability to win in one go, betting 100 with a goal of 250. It thus may be that the optimal choice, probabilitiwise is that one. Note that in that case, whatever the value of b, the Kelly criterion returns a negative fraction. Actually, the solution posted by the Riddler the week after is slightly above, 0.3686 or 1−(3/5)9/10. Which I never reached by the sequential bet of a fixed fraction of the current fortune, eps. when not accounting for the fact that aiming at 250 rather than a smaller target was removing a .9 factor.

## the other side of the dice

Posted in Books, Kids, pictures with tags , , , on August 19, 2022 by xi'an

A simple riddle from the Riddler: if a standard fair six-faced dice is falling on an edge rather than a face, what is the expectation of the sum of the faces sharing this edge?

The solution proposed there is however somewhat convoluted, when the average is simply

$\frac{1}{6}\sum_{i=1}^6 \{i+\frac{1+\cdots+6}{4}-\frac{i+7-i}{4}\}=7$

since the only face that does not share an edge with face i is 7-i…

## shelled and riddled

Posted in Books, Kids, pictures, R, Statistics with tags , , , , , , , on August 10, 2022 by xi'an

Consider a shell game with three shells and a ball with The Riddler constraint that the location of the shell with the ball is always exchanged with the location of an empty shell, randomly chosen. If one starts with the ball as rightmost, what is the distribution of the location of the ball after N steps?

Running an exploratory R code like

o=rep(0,3)
for(n in 1:1e6){
b=c(0,0,1)
for(t in 1:N){
i=sample((1:3)[!b],1);b=0*b;b[i]=1}
o=o+b}


shows that the difference in probability is between the rightmost position and both others, starting at zero, and evolving as p⁺=(1-p⁻)/2, with the successive values 0,1/2,1/4,3/8,5/15,11/32,… Very quickly converging to 1/3.