continental divide

Posted in Books, Kids, pictures, R with tags , , , , on May 19, 2017 by xi'an

While the Riddler puzzle this week was anticlimactic,  as it meant filling all digits in the above division towards a null remainder, it came as an interesting illustration of how different division is taught in the US versus France: when I saw the picture above, I had to go and check an American primary school on-line introduction to division, since the way I was taught in France is something like that

with the solution being that 12128316 = 124 x 97809… Solved by a dumb R exploration of all constraints:

```for (y in 111:143)
for (z4 in 8:9)
for (oz in 0:999){
z=oz+7e3+z4*1e4
x=y*z
digx=digits(x)
digz=digits(z)
if ((digz[2]==0)&(x>=1e7)&(x<1e8)){
r1=trunc(x/1e4)-digz[5]*y
if ((digz[5]*y>=1e3)&(digz[4]*y<1e4) &(r1>9)&(r1<100)){
r2=10*r1+digx[4]-7*y
if ((7*y>=1e2)&(7*y<1e3)&(r2>=1e2)&(r2<1e3)){
r3=10*r2+digx[3]-digz[3]*y
if ((digz[3]*y>=1e2)&(digz[3]*y<1e3)&(r3>9)&(r3<1e2)){
r4=10*r3+digx[2]
if (r4<y) solz=rbind(solz,c(y,z,x))
}}}}
```

Looking for a computer-free resolution, the constraints on z exhibited by the picture are that (a) the second digit is 0 and the fourth digit is 7.  Moreover, the first and fifth digits are larger than 7 since y times these digits is a four-digit number. Better, since the second subtraction from a three-digit number by 7y returns a three-digit number and the third subtraction from a four-digit number by ny returns a two-digit number, n is larger than 7 but less than the first and fifth digits. Ergo, z is necessarily 97809! Furthermore, 8y<10³ and 9y≥10³, which means 111<y<125. Plus the constraint that 1000-8y≤99 implies y≥112. Nothing gained there! This leaves 12 values of y to study, unless there is another restriction I missed…

a secretary problem with maximum ability

Posted in Kids, R with tags , , , , on April 28, 2017 by xi'an

The Riddler of today has a secretary problem, where one measures sequentially N random variables until one deems the current variable to be the largest of the whole sample. The classical secretary problem has a counter-intuitive solution where one first measures N/e random variables without taking any decision and then and only then picks the first next outcome larger than the largest in the first group. (For large values of N.) The added information in the current riddle is that the distribution of those iid random variables is set to be uniform on {1,…,M}, which begs for a modification in the algorithm. As for instance when observing M on the current draw.

The approach I devised is certainly suboptimal, as I decided to pick the currently observed value if the (conditional) probability it is the largest is larger than the probability subsequent draws. This translates into the following R code:

```M=100 #maximum value
N=10  #total number of draws
hyprob=function(m){
# m is sequence of draws so far
n=length(m);mmax=max(m)
if ((m[n]<mmax)||(mmax-n<N-n)){prob=0
}else{
prob=prod(sort((1:mmax)[-m],dec=TRUE)
[1:(N-n)]/((M-n):(M-N+1))}
return(prob)}

decision=function(draz=sample(1:M,N)){
i=0
keepgoin=TRUE
while ((keepgoin)&(i<N)){
i=i+1
keepgoin=(hyprob(draz[1:i])<0.5)}
return(c(i,draz[i],(draz[i]<max(draz))))}
```

which produces a winning rate of around 62% when N=10 and M=100, hence much better than the expected performances of the (asymptotic) secretary algorithm, with a winning frequency of 1/e. (For N=10 and M=100, the winning frequency is only 27%.)

optimultiplication [a riddle]

Posted in Books, Kids, R, Statistics with tags , , , , , on April 14, 2017 by xi'an

The riddle of this week is about an optimisation of positioning the four digits of a multiplication of two numbers with two digits each and is open to a coding resolution:

Four digits are drawn without replacement from {0,1,…,9}, one at a time. What is the optimal strategy to position those four digits, two digits per row, as they are drawn, toward minimising the average product?

Although the problem can be solved algebraically by computing E[X⁴|x¹,..] and E[X⁴X³|x¹,..]  I wrote three R codes to “optimise” the location of the first three digits: the first digit ends up as a unit if it is 5 or more and a multiple of ten otherwise, on the first row. For the second draw, it is slightly more variable: with this R code,

```second<-function(i,j,N=1e5){draw
drew=matrix(0,N,2)
for (t in 1:N)
drew[t,]=sample((0:9)[-c(i+1,j+1)],2)
conmean=(45-i-j)/8
conprod=mean(drew[,1]*drew[,2])
if (i<5){ #10*i
pos=c((110*i+11*j)*conmean,
100*i*j+10*(i+j)*conmean+conprod,
(100*i+j)*conmean+10*i*j+10*conprod)}else{
pos=c((110*j+11*i)*conmean,
10*i*j+(100*j+i)*conmean+10*conprod,
10*(i+j)*conmean+i*j+100*conprod)
return(order(pos)[1])}
```

the resulting digit again ends up as a unit if it is 5 (except when x¹=7,8,9, where it is 4) or more and a multiple of ten otherwise, but on the second row. Except when x¹=0, x²=1,2,3,4, when they end up on the first row together, 0 obviously in front.

For the third and last open draw, there is only one remaining random draw, which mean that the decision only depends on x¹,x²,x³ and E[X⁴|x¹,x²,x³]=(45-x¹-x²-x³)/7. Attaching x³ to x² or x¹ will then vary monotonically in x³, depending on whether x¹>x² or x¹<x²:

```fourth=function(i,j,k){
comean=(45-i-j-k)/7
if ((i<1)&(j<5)){ pos=c(10*comean+k,comean+10*k)}
if ((i<5)&(j>4)){ pos=c(100*i*comean+k*j,j*comean+100*i*k)}
if ((i>0)&(i<5)&(j<5)){ pos=c(i*comean+k*j,j*comean+i*k)}
if ((i<7)&(i>4)&(j<5)){ pos=c(i*comean+100*k*j,j*comean+100*i*k)}
if ((i<7)&(i>4)&(j>4)){ pos=c(i*comean+k*j,j*comean+i*k)}
if ((i>6)&(j<4)){ pos=c(i*comean+100*k*j,j*comean+100*i*k)}
if ((i>6)&(j>3)){ pos=c(i*comean+k*j,j*comean+i*k)}
return(order(pos)[1])}```

Running this R code for all combinations of x¹,x² shows that, except for the cases x¹≥5 and x²=0, for which x³ invariably remains in front of x¹, there are always values of x³ associated with each position.

how large is 9!!!!!!!!!?

Posted in Statistics with tags , , , , , , , , , on March 17, 2017 by xi'an

This may sound like an absurd question [and in some sense it is!], but this came out of a recent mathematical riddle on The Riddler, asking for the largest number one could write with ten symbols. The difficulty with this riddle is the definition of a symbol, as the collection of available symbols is a very relative concept. For instance, if one takes  the symbols available on a basic pocket calculator, besides the 10 digits and the decimal point, there should be the four basic operations plus square root and square,which means that presumably 999999999² is the largest one can  on a cell phone, there are already many more operations, for instance my phone includes the factorial operator and hence 9!!!!!!!!! is a good guess. While moving to a computer the problem becomes somewhat meaningless, both because there are very few software that handle infinite precision computing and hence very large numbers are not achievable without additional coding, and because it very much depends on the environment and on which numbers and symbols are already coded in the local language. As illustrated by this X validated answer, this link from The Riddler, and the xkcd entry below. (The solution provided by The Riddler itself is not particularly relevant as it relies on a particular collection of symbols, which mean Rado’s number BB(9999!) is also a solution within the right referential.)

a riddle at the end of its tether

Posted in R, Statistics with tags , , on February 24, 2017 by xi'an

A simply worded riddle this week on The Riddler, about four ropes having non-uniform and unknown burning rates, the only constraint being they all burn completely in one hour. With the help of a lighter (or even a single match), what are the possible units of time one can measure by burning them?

While I had worked out a range of possible times for each of the ropes to complete its combustion, I went for a simulation based confirmation. The starting point is that a new fire can only be started when a burning rope ends up burning. Which only happens for a finite number of possibilities. I found 24 of them, consisting of

```> total*prec
[1]  0.000 0.5000 0.750 0.875 0.9375 1.000 1.125 1.1875 1.25 1.3125
[11] 1.375 1.4375 1.500 1.625 1.7500 1.875 2.000 2.1250 2.25 2.5000
[21] 2.750 3.0000 3.500 4.000```

i.e., some combinations of 1, 2⁻¹, …, 2⁻⁴, with the comment that those times cannot all be obtained within a single arson experiment.

a knapsack riddle?

Posted in Books, pictures, R, Statistics, Travel with tags , , , , , , on February 13, 2017 by xi'an

The [then current now past] riddle of the week is a sort of multiarmed bandits optimisation. Of sorts. Or rather a generalised knapsack problem. The question is about optimising the allocation of 100 undistinguishable units to 10 distinct boxes against a similarly endowed adversary, when the loss function is

$L(x,y)=(x_1>y_1)-(x_1y_{10})-(x_{10}

and the distribution q of the adversary is unknown. As usual (!), the phrasing of the riddle is somewhat ambiguous but I am under the impression that the game is played sequentially, hence that one can learn about the distribution of the adversary, at least when assuming this adversary keeps the same distribution q at all times. Continue reading

another Galton-Watson riddle

Posted in Statistics with tags , , , on February 3, 2017 by xi'an

The riddle on the Riddler this week is definitely a classic, since it rephrases the standard Galton-Watson branching process (which should have been called Bienaymé‘s process, as he established the relation before Watson, while the jack-of-all-trades Francis Galton only posed the question):

At the beginning, there is a single microorganism. Each day, every member of this species either splits into two copies of itself or dies. If the probability of multiplication is p, what are the chances that this species goes extinct?

As is easily seen from the moment generating function, the species goes instinct if p≤½. Actually, I always found it intriguing [intuitively] that the value ½ is included in the exclusion range!