Archive for FiveThirtyEight

riddles on a line [#2]

Posted in Books, Kids, R with tags , , , , , , , on September 11, 2018 by xi'an

A second Riddle(r), with a puzzle related with the integer set Ð={,12,3,…,N}, in that it summarises as

Given a random walk on Ð, starting at the middle N/2, with both end states being absorbing states, and a uniform random move left or right of the current value to the (integer) middle of the corresponding (left or right) integer interval, what is the average time to one absorbing state as a function of N?

Once the Markov transition matrix M associated with this random walk is defined, the probability of reaching an absorbing state in t steps can be derived from the successive powers of M by looking at the difference between the probabilities to be (already) absorbed at both t-1 and t steps. From which the average can be derived.

avexit <- function(N=100){
 #transition matrix M for the walk
 #1 and N+2 are trapping states
 tranz=matrix(0,N+2,N+2)
 tranz[1,1]=tranz[N+2,N+2]=1
 for (i in 2:(N+1))
  tranz[i,i+max(trunc((N+1-i)/2),1)]=tranz[i,i-max(trunc((i-2)/2),1)]=1/2
 #probabilities of absorption
 prowin=proloz=as.vector(0)
 init=rep(0,N+2)
 init[trunc((N+1)/2)]=1 #first position
 curt=init
 while(1-prowin[length(prowin)]-proloz[length(prowin)]>1e-10){
  curt=curt%*%tranz
  prowin=c(prowin,curt[1])
  proloz=c(proloz,curt[N+2])}
 #probability of new arrival in trapping state
 probz=diff(prowin+proloz)
 return(sum((2:length(proloz))*probz))}

leading to an almost linear connection between N and expected trapping time.

riddles on a line

Posted in Books, Kids, R with tags , , , , , , , , , on August 22, 2018 by xi'an

In the Riddler of August 18, two riddles connected with the integer set Ð={2,3,…,10}:

  1. Given a permutation of Ð, what is the probability that the most likely variation (up or down) occurs at each term?
  2. Given three players choosing three different integers in Ð sequentially, and rewards in Ð allocated to the closest of the three players (with splits in case of equal distance), what is the reward for each given an optimal strategy?

For the first question, a simple code returns 0.17…

winofail<-function(permz){ 
  if (length(permz)>1){
    lenoperm=length(permz[-1])/2
    win=(permz[1]<permz[2])*(sum(permz[-1]>permz[1])>lenoperm)+
     (permz[1]>permz[2])*(sum(permz[-1]<permz[1])>lenoperm)+
      (runif(1)<.5)*(sum(permz[-1]>permz[1])==lenoperm)
    win=win&&winofail(permz[-1])
  }else win=TRUE
  return(win)}

(but the analytic solution exhibits a cool Pascal triangular relation!) and for the second question, a quick recursion or dynamic programming produces 20, 19, 15 as the rewards (and 5, 9, 8 as the locations)

gainz<-function(seqz){
  difz=t(abs(outer(2:10,seqz,"-")))
  cloz=apply(difz,2,rank)
  return(apply(rbind(2:10,2:10,2:10)*
   ((cloz==1)+.5*(cloz==1.5)),1,sum))}

timeline<-function(prev){ 
  if (length(prev)==0){ 
   sol=timeline(10);bez=gainz(sol)[1] 
   for (i in 2:9){ 
    bol=timeline(i);comp=gainz(bol)[1] 
    if (comp>bez){
        bez=comp;sol=bol}}}
  if (length(prev)==1){
    bez=-10
    for (i in (2:10)[-(prev-1)]){
      bol=timeline(c(prev,i))
      comp=gainz(bol)[2]
      if (comp>bez){
        bez=comp;sol=bol}}}
  if (length(prev)==2){
    bez=-10
    for (i in (2:10)[-(prev-1)]){
      bol=c(prev,i)
      comp=gainz(bol)[3]
      if (comp>bez){
        bez=comp;sol=bol}}}
  return(sol)}

After reading the solution on the Riddler, I realised I had misunderstood the line as starting at 2 when it was actually starting from 1. Correcting for this leads to the same 5, 9, 8 locations of picks, with rewards of 21, 19, 15.

a funny mistake

Posted in Statistics with tags , , , , , , , , , , , on August 20, 2018 by xi'an

While watching the early morning activity in Tofino inlet from my rental desk, I was looking at a recent fivethirthyeight Riddle, which consisted in finding the probability of stopping a coin game which rule was to wait for the n consecutive heads if (n-1) consecutive heads had failed to happen when requested, which is

p+(1-p)p²+(1-p)(1-p²)p³+…

or

q=\sum_{k=1}^\infty p^k \prod_{j=1}^{k-1}(1-p^j)

While the above can write as

q=\sum_{k=1}^\infty \{1-(1-p^k)\} \prod_{j=1}^{k-1}(1-p^j)

or

\sum_{k=1}^\infty \prod_{j=1}^{k-1}(1-p^j)-\prod_{j=1}^{k}(1-p^j)

hence suggesting

q=\sum_{k=1}^\infty \prod_{j=1}^{k-1}(1-p^j) - \sum_{k=2}^\infty \prod_{j=1}^{k-1}(1-p^j) =1

the answer is (obviously) false and the mistake in separating the series into a difference of series is that both terms are infinite. The correct answer is actually

q=1-\prod_{j=1}^{\infty}(1-p^j)

which is Euler’s function. Maybe nonstandard analysis can apply to go directly from the difference of the infinite series to the answer!

a thread to bin them all [puzzle]

Posted in Books, Kids, R, Travel with tags , , , , , , , , on July 9, 2018 by xi'an

The most recent riddle on the Riddler consists in finding the shorter sequence of digits (in 0,1,..,9) such that all 10⁴ numbers between 0 (or 0000) and 9,999 can be found as a group of consecutive four digits. This sequence is obviously longer than 10⁴+3, but how long? On my trip to Brittany last weekend, I wrote an R code first constructing the sequence at random by picking with high preference the next digit among those producing a new four-digit number

tenz=10^(0:3)
wn2dg=function(dz) 1+sum(dz*tenz)

seqz=rep(0,10^4)
snak=wndz=sample(0:9,4,rep=TRUE)
seqz[wn2dg(wndz)]=1
while (min(seqz)==0){
  wndz[1:3]=wndz[-1];wndz[4]=0
  wndz[4]=sample(0:9,1,prob=.01+.99*(seqz[wn2dg(wndz)+0:9]==0))
  snak=c(snak,wndz[4])
  sek=wn2dg(wndz)
  seqz[sek]=seqz[sek]+1}

which usually returns a value above 75,000. I then looked through the sequence to eliminate useless replicas

for (i in sample(4:(length(snak)-5))){
 if ((seqz[wn2dg(snak[(i-3):i])]>1)
  &(seqz[wn2dg(snak[(i-2):(i+1)])]>1)
  &(seqz[wn2dg(snak[(i-1):(i+2)])]>1)
  &(seqz[wn2dg(snak[i:(i+3)])]>1)){
   seqz[wn2dg(snak[(i-3):i])]=seqz[wn2dg(snak[(i-3):i])]-1
   seqz[wn2dg(snak[(i-2):(i+1)])]=seqz[wn2dg(snak[(i-2):(i+1)])]-1
   seqz[wn2dg(snak[(i-1):(i+2)])]=seqz[wn2dg(snak[(i-1):(i+2)])]-1
   seqz[wn2dg(snak[i:(i+3)])]=seqz[wn2dg(snak[i:(i+3)])]-1
   snak=snak[-i]
   seqz[wn2dg(snak[(i-3):i])]=seqz[wn2dg(snak[(i-3):i])]+1
   seqz[wn2dg(snak[(i-2):(i+1)])]=seqz[wn2dg(snak[(i-2):(i+1)])]+1
   seqz[wn2dg(snak[(i-1):(i+2)])]=seqz[wn2dg(snak[(i-1):(i+2)])]+1}}

until none is found. A first attempt produced 12,911 terms in the sequence. A second one 12,913. A third one 12,871. Rather consistent figures but not concentrated enough to believe in achieving a true minimum. An overnight run produced 12,779 as the lowest value. Checking the answer the week after, it appears that 10⁴+3 is the correct answer!

linear Diophantine equations

Posted in Statistics with tags , , , , , , on May 10, 2018 by xi'an

When re-expressed in maths terms, the current Riddler is about finding a sequence x⁰,x¹,…,x⁷ of integers such that

x⁰=7x¹+1
6x¹=7x²+1

6x⁶=7x⁷+1
6x⁷=7x⁸

which turns into a linear equation with integer valued solutions, or a system of linear Diophantine equation. Which can be easily solved by brute-force R coding:

A=matrix(0,7,7)
for (i in 1:7) A[i,i]=6
for (i in 1:6) A[i,i+1]=-7
for (x in 1:1e6){
  zol=solve(a=A,b=c(rep(1,6),7*x))
  if (max(abs(zol-round(zol)))<1e-3) print(x)}
x=39990 #x8=5.6.31.43
7*solve(a=A,b=c(rep(1,6),7*x))[1]+1 #x0

which produces x⁰=823537. But it would be nicer to directly solve the linear system under the constraint. For instance, the inverse of the matrix A above is an upper triangular matrix with (upper-)diagonals

1/6, 7/6², 7²/6³,…,7⁶/6⁷

but this does not help considerably, except for x⁸ to be solutions to 7 equations involving powers of 6 and 7… This system of equations can be solved by successive substitutions but this still feels very pedestrian!

 

a one-chance meeting [puzzle]

Posted in Books, Kids, pictures, R with tags , , , , , , on March 6, 2018 by xi'an

This afternoon, I took a quick look at the current Riddler puzzle, which sums up as, given three points A, B, C, arbitrarily moving on a plane with a one-shot view of their respective locations, find a moving rule to bring the three together at the same point at the same time. And could not spot the difficulty.

The solution seems indeed obvious when expressed as above rather than in the tell-tale format of the puzzle. Since every triangle has a circumscribed circle, and all points on that circle are obviously at the same distance of the centre O, the three points have to aim at the centre O. Assuming they all move at the same velocity, they will reach O together…

The question gets a wee bit more interesting when the number of points with the same one-time one-shot view option grows beyond 3, as these points will almost surely not all lie on a single circumscribed circle. While getting them together can be done by (a) finding the largest circle going through 3  points and containing all others [in case there is no such circle, adding an artificial point solves the issue!], triplet on which one can repeat the above instructions to reach O, and (b) bringing all points inside the circle to meet with one of the three points [the closest] on its straight-line way to O, by finding a point on that line at equal distance from both, a subsidiary question is whether or not this is the fastest way. Presumably not.  (Again I may be missing one item of the puzzle.)

When experimenting with a short R code, I quickly figured out that the circumscribed circles associated with all triplets do not necessarily contain all points. The resolution of this difficulty is however straightforward as it suffices to add an artificial point by considering all circumcentres and their distances to the farthest point, minimising over these distances and adding the extra point at random over the circumference. As in the example below.Incidentally, it took me much longer to write this post than to solve the puzzle, as I was trying to use the R function draw.circle, which supposedly turns a centre and a radius into the corresponding circle, but somehow misses its target by adapting the curve to the area being displayed. I am still uncertain of what the function means. I hence ended up writing a plain R function for this purpose:

dracirc=function(A,B,C){
  O=findcentr(A,B,C)
  ro=dist(rbind(A,O))
  lines(x=O[1]+ro*sin(2*pi*seq(0,1,le=180)),
  y=O[2]+ro*cos(2*pi*seq(0,1,le=180)))}

cyclic riddle [not in NYC!]

Posted in Kids, R, Running, Travel with tags , , , , , , , , , , on December 29, 2017 by xi'an

In the riddle of this week on fivethirtyeight, the question is to find the average number of rounds when playing the following game: P=6 players sitting in a circle each have B=3 coins and with probability 3⁻¹ they give one coin to their right or left side neighbour, or dump the coin to the centre. The game ends when all coins are in the centre. Coding this experiment in R is rather easy

situz=rep(B,P)
r=1
while (max(situz)>0){
 unz=runif(P)
 newz=situz-(situz>0)
 for (i in (1:P)[unz<1/3]) 
  newz[i%%P+1]=newz[i%%P+1]+(situz[i]>0)
 for (i in (1:P)[unz>2/3])
  newz[i-1+P*(i==1)]=newz[i-1+P*(i==1)]+(situz[i]>0)
 situz=newz
 r=r+1}

resulting in an average of 15.58, but I cannot figure out (quickly enough) an analytic approach to the problem. (The fit above is to a Gamma(â-1,ĝ) distribution.)

In a completely unrelated aparté (aside), I read earlier this week that New York City had prohibited the use of electric bikes. I was unsure of what this meant (prohibited on sidewalks? expressways? metro carriages?) so rechecked and found that electric bikes are simply not allowed for use anywhere in New York City. Because of the potential for “reckless driving”. The target is apparently the high number of delivery e-bikes, but this ban sounds so absurd that I cannot understand it passed. Especially when De Blasio has committed the city to the Paris climate agreement after Trump moronically dumped it… Banning the cars would sound much more in tune with this commitment! (A further aparté is that I strongly dislike e-bikes, running on nuclear power plants,  especially when they pass me on sharp hills!, but they are clearly a lesser evil when compared with mopeds and cars.)