Archive for R

bayess’ back! [on CRAN]

Posted in Books, R, Statistics, University life with tags , , , , , , , on September 22, 2022 by xi'an

optimal Gaussian zorbing

Posted in Books, Kids, R, Statistics with tags , , , , , , on August 30, 2022 by xi'an

A zorbing puzzle from the Riddler: cover the plane with four non-intersecting disks of radius one towards getting the highest probability (under the standard bivariate Normal distribution).

As I could not see a simple connection between the disks and the standard Normal, beyond the probability of a disk being given by a non-central chi-square cdf (with two degrees of freedom), I (once again) tried a random search by simulated annealing, which ended up with a configuration like the above, never above 0.777 using a pedestrian R code like

for(t in 1:1e6){# move the disk centres
 Ap=A+vemp*rnorm(2)
 Bp=B+vemp*rnorm(2)
 while(dist(rbind(Ap,Bp))<2)Bp=B+vemp*rnorm(2)
 Cp=C+vemp*rnorm(2)
 while(min(dist(rbind(Ap,Bp,Cp)))<2)Cp=C+vemp*rnorm(2)
 Dp=D+vemp*rnorm(2)
 while(min(dist(rbind(Ap,Bp,Cp,Dp)))<2)Dp=D+vemp*rnorm(2)
 #coverage probability
 pp=pchisq(1,df=2,ncp=Ap%*%Ap)+pchisq(1,df=2,ncp=Bp%*%Bp)+
    pchisq(1,df=2,ncp=Cp%*%Cp)+pchisq(1,df=2,ncp=Dp%*%Dp)
 #simulated annealing step
 if(log(runif(1))<(pp-p)/sqrt(temp)){
   A=Bp;B=Cp;C=Dp;D=Ap;p=pp
   if (sol$val<p) sol=list(val=pp,pos=rbind(A,B,C,D))}
 temp=temp*.9999}

I also tried a simpler configuration where all disk centres were equidistant from a reference centre, but this led to a lower “optimal” probability. I was looking forward the discussion of the puzzle, to discover if anything less brute-force was possible! But there was no deeper argument there beyond the elimination of other “natural” configurations (and missing the non-central χ² connection!). Among these options, having two disks tangent at (0,0) were optimal. But the illustration was much nicer:

the Kelly criterion and then some

Posted in R, Statistics with tags , , , , , on August 26, 2022 by xi'an

The Kelly criterion is a way to optimise an unlimited sequence of bets under the following circumstances: a probability p of winning each bet, a loss of a fraction a of the sum bet, a gain of a fraction b of the sum bet, and a fraction f of the current fortune as the sum bet. Then

f^*=\dfrac{p}{a}-\dfrac{1-p}{b}

is the fraction optimising the growth

\mathbb E[ \log\{X_n/X_0\}^{1/n}]

Here is a rendering of the empirical probability of reaching 250 before ruin, when starting with a fortune of 100, when a=1, p=0.3 and f and b vary (on a small grid). With on top Kelly’s solutions, meaning that they achieve a high probability of avoiding ruin. Until they cannot.

The Ridder is asking for a variant of this betting scheme, when the probability p to win the bet is proportional to 1/(1+b), namely .9/(1+b). In that case, the probabilities of reaching 250 (using the same R code as above) before ruin are approximated as followswith a maximal probability that does not exceed 0.36, the probability to win in one go, betting 100 with a goal of 250. It thus may be that the optimal choice, probabilitiwise is that one. Note that in that case, whatever the value of b, the Kelly criterion returns a negative fraction. Actually, the solution posted by the Riddler the week after is slightly above, 0.3686 or 1−(3/5)9/10. Which I never reached by the sequential bet of a fixed fraction of the current fortune, eps. when not accounting for the fact that aiming at 250 rather than a smaller target was removing a .9 factor.

shelled and riddled

Posted in Books, Kids, pictures, R, Statistics with tags , , , , , , , on August 10, 2022 by xi'an

Consider a shell game with three shells and a ball with The Riddler constraint that the location of the shell with the ball is always exchanged with the location of an empty shell, randomly chosen. If one starts with the ball as rightmost, what is the distribution of the location of the ball after N steps?

Running an exploratory R code like

o=rep(0,3)
for(n in 1:1e6){
  b=c(0,0,1)
  for(t in 1:N){
    i=sample((1:3)[!b],1);b=0*b;b[i]=1}
  o=o+b}

shows that the difference in probability is between the rightmost position and both others, starting at zero, and evolving as p⁺=(1-p⁻)/2, with the successive values 0,1/2,1/4,3/8,5/15,11/32,… Very quickly converging to 1/3.

Goats do room

Posted in Books, Kids, R, Statistics, Wines with tags , , , , , , , , , , , , on July 16, 2022 by xi'an

The riddle of the week is about 10 goats sequentially moving to their room, which they have chosen at random and independently (among ten rooms), unless another goat already occupies the room, in which case they move to the first free room with a higher number or fail. What is the probability that all goats end up in a room?

Coding the experiment is straightforward:

g=sample(1:N,N,rep=TRUE)
o=0*g
for(i in 1:N){
    if(min(o[g[i]:N])){f=f+1;break()
    }else{
      o[min(which(!o[g[i]:N]))+g[i]-1]=1
    }}}

returning an estimated probability of approximately 0.764.

As I had some free time during the early mornings at ISBA 2022, I tried to reformulate the question as a continuous event on uniform order statistics, turning to be at most one uniform larger than (N-1)/N, at most two larger than (N-2)/N, and so on… Asking the question on math.stackexchange quickly produced an answer that reversed engineered my formulation back to the goats (or parking lot), with a generic probability of

\dfrac{(N+1)^{N-1}}{N^N}

which of course coincides with the Monte Carlo approximation!

As an aside, I once drank South-African wines named Goats-do-Roam and Goat-Roti at my friends Jim and Maria’s place,  and they were quite enjoyable!

%d bloggers like this: