## the strange occurrence of the one bump

Posted in Books, Kids, R, Statistics with tags , , , , , , , , on June 8, 2020 by xi'an

When answering an X validated question on running an accept-reject algorithm for the Gamma distribution by using a mixture of Beta and drifted (bt 1) Exponential distributions, I came across the above glitch in the fit of my 10⁷ simulated sample to the target, apparently displaying a wrong proportion of simulations above (or below) one.

a=.9
g<-function(T){
x=rexp(T)
v=rt(T,1)<0
x=c(1+x[v],exp(-x/a)[!v])
x[runif(T)<x^a/x/exp(x)/((x>1)*exp(1-x)+a*(x<1)*x^a/x)*a]}

It took me a while to spot the issue, namely that the output of

  z=g(T)
while(sum(!!z)<T)z=c(z,g(T))
z[1:T]


was favouring simulations from the drifted exponential by truncating. Permuting the elements of z before returning solved the issue (as shown below for a=½)!

## Le Monde puzzle [#1146]

Posted in Books, Kids, R with tags , , , , , , , , on June 5, 2020 by xi'an

The weekly puzzle from Le Monde is once more disappointing.

Everyday of the month, take 0, 1 or 2 units. If one unit taken past day, next day none can be taken. If two units taken two day ago, none can be taken the current day. What is the strategy maximising the number of units  for February? March? April? Generalise to 0,..,3 units taken each day with 0 compulsory three days after taking 3 units.

as taking 2-1-0 (or 3-2-1-0) sounds like the optimal strategy. Except at the final step when 2, 2-2, 2-2-0, and 1-0-2-2 are best. But then why would one distinguish between the three months..?! Because the outcome differ, as 30, 32, and 33, resp. (or 45, 48 and 51). With an average increase of 1 in the first case and 1.5 in the second.

Another puzzle took too much of my time, namely a code golf challenge to devise a code taking as input a matrix of moves over an n x x grid and returning as input the number of nodes and transient tributaries for each loop (or irreducible set) of the moves. Which I solved by running Markov chains from each starting point long enough to reach stationarity. Entering the moves as

n=3;M=matrix(sample(1:4,n^2,rep=T),n)


and returning the result via 390 bytes

j=cbind;l=sum;a=apply
m=l(!!M);n=m^.5
g=function(A,r=M[A])A+c((r<2)*(1-n*(A[,1]==n))-(r==2)*(1-n*(A[,1]<2)),
(r==3)*(1-n*(A[,2]==n))-(r>3)*(1-n*(A[,2]<2)))
I=c()
for(i in d<-1:n)I=rbind(I,j(i*d/d,d))
for(t in b<-1:m)I=g(I)
p=function(i)i[,1]+n*i[,2]-n-1
K=matrix(0,m,m)
for(t in b)K[b+m*p(I<-g(I))]=1
s=o=a(u<-unique(K),1,l)
for(k in 1:l(!!s))s[k]=l(!a(!!sweep(K,2,u[k,],'-'),1,l))
j(o,s-o)


which could be much shorter if only matrices could be indexed from 0 as in C.

## Le Monde puzzle [#1141]

Posted in Kids, pictures, R, University life with tags , , , , , , , , on May 4, 2020 by xi'an

The weekly puzzle from Le Monde is in honour of John Conway, who just passed away, ending up his own game of life:

On an 8×8 checker-board, Alice picks n squares as “infected”. She then propagates the disease by having each square with least two infected neighbours to become infected as well. What is the minimal value of n for the entire board to become infected? What if three infected neighbours are required?

A plain brute force R random search for proper starting points led to n=8 (with a un-code-golfed fairly ugly rendering of the neighbourhood relation, I am afraid!) with the following initial position

With three neighbours, an similar simulation failed to return anything below n=35 as for instance:

oops, n=34 when running a little longer:

which makes sense since an upper bound is found by filling one square out of two (32) and adding both empty corners (2). But this upper bound is only considering one step ahead, so is presumably way too large. (And indeed the minimal value is 28, showing that brute force does not always work! Or it may be that forcing the number of live cells to grow at each step is a coding mistake…)

## Le Monde puzzle [#1130]

Posted in Books, Kids, R, Statistics with tags , , , , , , , on February 7, 2020 by xi'an

A two-player game as Le weekly Monde current mathematical puzzle:

Abishag and Caleb fill in alternance a row of N boxes in a row by picking one then two then three &tc. consecutive boxes. When a player is unable to find enough consecutive boxes, the player has lost. Who is winning when N=29? When N=30?

Using a basic recursive search for the optimal strategy, with the status of the row and the number of required boxes as entries,

f<-function(b=!1:N,r=0){
for(i in 1:(N-r)){
if(p<-!max(b[j<-i+r:0])){
q=b;q[j]=1
if(p<-!f(q,r+1))break}}
p}


returns Abishag as the winner for N=29 (as well as for N=1,2,7,…,13,19,…,29) and Caleb as the winner for N=30 (as well as for N=3,…,6,14,…,18). I am actually surprised that the recursion operates that deep, even though this means a √N depth for the recursion. While the code took too long to complete, the function operates for N=100. A side phenomenon is the apparent special case of N=47, which makes Abishag the looser, while N=46 and N=48 do not.This is an unusual pattern as otherwise (up to N=59), there are longer and longer stretches of adjacent wins and looses as N increases.

## shortened iterations [code golf]

Posted in Kids, pictures, Statistics, Travel with tags , , , , , , , , on October 29, 2019 by xi'an

A codegolf lazy morning exercise towards finding the sequence of integers that starts with an arbitrary value n and gets updated by blocks of four as

$a_{4k+1} = a_{4k} \cdot(4k+1)\\ a_{4k+2} = a_{4k+1} + (4k+2)\\ a_{4k+3} = a_{4k+2} - (4k+3)\\ a_{4k+4} = a_{4k+3} / (4k+4)$

until the last term is not an integer. While the update can be easily implemented with the appropriate stopping rule, a simple congruence analysis shows that, depending on n, the sequence is 4, 8 or 12 values long when

$n\not\equiv 1(4)\\ n\equiv 1(4)\ \text{and}\ 3(n-1)+4\not\equiv 0(32)\\ 3(n-1)+4\equiv 0(32)$

respectively. But sadly the more interesting fixed length solution

~=rep #redefine function
b=(scan()-1)*c(32~4,8,40~4,1,9~3)/32+c(1,1,3,0~3,6,-c(8,1,9,-71,17)/8)
b[!b%%1] #keep integers only


ends up being longer than the more basic one:

a=scan()
while(!a[T]%%1)a=c(a,d<-a[T]*T,d+T+1,e<-d-1,e/((T<-T+4)-1))
a[-T]


where Robin’s suggestion of using T rather than length is very cool as T has double meaning, first TRUE (and 1) then the length of a…