Archive for R

Le Monde puzzle [#1707]

Posted in Books, Kids, R with tags , , , , , on July 28, 2017 by xi'an

A geometric Le Monde mathematical puzzle:

  1. Given a pizza of diameter 20cm, what is the way to cut it by two perpendicular lines through a point distant 5cm from the centre towards maximising the surface of two opposite slices?
  2.  Using the same point as the tip of the four slices, what is the way to make four slices with equal arcs in four cuts from the tip again towards maximising the surface of two opposite slices?

For both questions, I did not bother with the maths but went itself to a discretisation of the disk, counting the proportion of points within two opposite slices and letting the inclination of these slices move from zero to π/2. Unsurprisingly, for the first question, the answer is π/4, given that there is no difference between both surfaces at angles 0 and π/2. My R code is as follows, using (5,0) as the tip:

M=100
surfaz=function(alpha){
surfz=0
cosal=cos(alpha);sinal=sin(alpha)
X=Y=seq(-10,10,le=M)
Xcosal=(X-5)*cosal
Xsinal=(X-5)*sinal
for (i in 1:M){
norm=sqrt(X[i]^2+Y^2)
scal1=Xsinal[i]+Y*cosal
scal2=-Xcosal[i]+Y*sinal
surfz=surfz+sum((norm<=10)*(scal1*scal2>0))}
return(4*surfz/M/M/pi)}

The second puzzle can be solved by a similar code, except that the slice area between two lines has to be determined by a cross product:

surfoz=function(alpha,ploz=FALSE){
  sinal=sin(alpha);cosal=cos(alpha)
  X=Y=seq(-10,10,le=M)
  frsterm=cosal*(10*cosal-5)+sinal*(10*sinal-5)
  trdterm=cosal*(10*cosal+5)+sinal*(10*sinal+5)
  surfz=0
  for (i in 1:M){
    norm=sqrt(X[i]^2+Y^2)
    scal1=(10*(Y[i]-5)*cosal-(10*sinal-5)*X)*frsterm
    scal2=-(-10*(Y[i]-5)*sinal-(10*cosal-5)*X)*frsterm
    scal3=(-10*(Y[i]-5)*cosal+(10*sinal+5)*X)*trdterm
    scal4=-(10*(Y[i]-5)*sinal+(10*cosal+5)*X)*trdterm
    surfz=surfz+sum((norm<=10)* 
    ((scal1>0)*(scal2>0)+
     (scal3>0)*(scal4>0)))}
 return(4*surfz/M/M/pi)}

a code that shows that all cuts lead to identical surfaces for bot sets of slices. A fairly surprising result!

 

Le Monde puzzle [#1016]

Posted in Books, Kids with tags , , , on July 16, 2017 by xi'an

An even more straightforward Le Monde mathematical puzzle that took a few minutes to code in the train to Cambridge:

  1. Breaking {1,…,8} into two sets of four integrals, what is (or are) the division into two groups of equal size such that the sums of the squared terms from each are equal? Same question for the set {21,…,28}.
  2.  Considering the integers from 1 to 12, how many divisions into two groups of size six satisfy the above property? Same question when the two groups are of different sizes.

The first code is

nop=TRUE
while (nop){
 s=sample(1:8)
 nop=(sum(s[1:4]^2)!=sum(s[5:8]^2))}

with result

1 6 4 7

while the second set leads to the unique [drifted] solution (up to symmetries)

21 24 26 27

and the divisions for the larger set {1,…,12} is unique in the equal case, and are four in the unequal case.

RNG impact on MCMC [or lack thereof]

Posted in Books, R, Statistics, Travel, University life with tags , , , , , , , on July 13, 2017 by xi'an

Following the talk at MCM 2017 about the strange impact of the random generator on the outcome of an MCMC generator, I tried in Montréal airport the following code on the banana target of Haario et al. (1999), copied from Soetaert and Laine and using the MCMC function of the FME package:

library(FME)
Banana <- function (x1, x2) {
 return(x2 - (x1^2+1)) }
pmultinorm <- function(vec, mean, Cov) {
 diff <- vec - mean
 ex <- -0.5*t(diff) %*% solve(Cov) %*% diff
 rdet <- sqrt(det(Cov))
 power <- -length(diff)*0.5
 return((2.*pi)^power / rdet * exp(ex)) }
BananaSS <- function (p) {
 P <- c(p[1], Banana(p[1], p[2]))
 Cov <- matrix(nr = 2, data = c(1, 0.9, 0.9, 1))
N=1e3
ejd=matrix(0,4,N)
RNGkind("Mars")
for (t in 1:N){
  MCMC <- modMCMC(f = BananaSS, p = c(0, 0.7), 
  jump = diag(nrow = 2, x = 5), niter = 1e3)
  ejd[1,t]=mean((MCMC$pars[-1,2]-MCMC$pars[1,2])^2)}

since this divergence from the initial condition seemed to reflect the experiment of the speaker at MCM 2017. Unsurprisingly, no difference came from using the different RNGs in R (which may fail to contain those incriminated by the study)…

easy riddle

Posted in Books, Kids, R with tags , , , , , on July 12, 2017 by xi'an

From the current Riddler, a problem that only requires a few lines of code and a few seconds of reasoning. Or not.

N households each stole the earnings from one of the (N-1) other households, one at a time. What is the probability that a given household is not burglarised? And what are the expected final earnings of each household in the list, assuming they all start with $1?

The first question is close to Feller’s enveloppe problem in that

\left(1-\frac{1}{N-1}\right)^{N-1}

is close to exp(-1) for N large. The second question can easily be solved by an R code like

N=1e3;M=1e6
fina=rep(1,N)
for (v in 1:M){
 ordre=sample(1:N)
 vole=sample(1:N,N,rep=TRUE)
 while (min(abs(vole-(1:N)))==0)
  vole[abs(vole-(1:N))==0]=sample(1:N,
     sum(vole-(1:N)==0))
 cash=rep(1,N)
 for (t in 1:N){
  cash[ordre[t]]=cash[ordre[t]]+cash[vole[t]];cash[vole[t]]=0}
 fina=fina+cash[ordre]}

which returns a pretty regular exponential-like curve, although I cannot figure the exact curve beyond the third burglary. The published solution gives the curve

{\frac{N-2}{N-1}}^{999}\times 2+{\frac{1}{N-1}}^{t-1}\times{\frac{N-1}{N}}^{N-t}\times\frac{N}{N-1}

corresponding to the probability of never being robbed (and getting on average an extra unit from the robbery) and of being robbed only before robbing someone else (with average wealth N/(N-1)).

MCM 2017 snapshots [#2]

Posted in Books, pictures, Running, Statistics, University life with tags , , , , , , , , , , , on July 7, 2017 by xi'an

On the second day of MCM 2017, Emmanuel Gobet (from Polytechnique) gave the morning plenary talk on regression Monte Carlo methods, where he presented several ways of estimating conditional means of rv’s in nested problems where conditioning involves other conditional expectations. While interested in such problems in connection with ABC, I could not see how the techniques developed therein could apply to said problems.

By some of random chance, I ended up attending a hard-core random generation session where the speakers were discussing discrepancies between GNU library generators [I could not understand the target of interest and using MCMC till convergence seemed prone to false positives!], and failed statistical tests of some 64-bit Mersenne Twisters, and low discrepancy on-line subsamples of Uniform samples. Most exciting of all, Josef Leydold gave a talk on ratio-of-uniforms, on which I spent some time a while ago  (till ending up reinventing the wheel!), with highly refined cuts of the original box.

My own 180 slides [for a 50mn talk] somewhat worried my chairman, Art Owen, who kindly enquired the day before at the likelihood I could go through all 184 of them!!! I had appended the ABC convergence slides to an earlier set of slides on ABC with random forests in case of questions about that aspect, although I did not plan to go through those slides [and I mostly covered the 64 other slides] As the talk was in fine more about an inference method than a genuine Monte Carlo technique, plus involved random forests that sounded unfamiliar to many, I did not get many questions from the audience but had several deep discussions with people after the talk. Incidentally, we have just reposted our paper on ABC estimation via random forests, updated the abcrf R package, and submitted it to Peer Community in Evolutionary Biology!

Le Monde puzzle [#1013]

Posted in Books, Kids with tags , , , , , on June 23, 2017 by xi'an

A purely arithmetic Le Monde mathematical puzzle:

An operation þ applies to all pairs of natural integers with the properties

0 þ (a+1) = (0 þ a)+1, (a+1) þ (b+1)=(a þ b)+1, 271 þ 287 = 77777, 2018 þ 39 = 2018×39

Find the smallest integer d>287 such that there exists c<d leading to c þ d = c x d, the smallest integer f>2017 such that 2017 þ f = 2017×40. Is there any know integer f such that f þ 2017 = 40×2017?

The major appeal in this puzzle (where no R programming seems to help!) is that the “data” does not completely defines the operation  þ ! Indeed, when a<b, it is straightforward to deduce that a þ b = (0 þ 0)+b, hence solving the first two questions by deriving (0 þ 0)=270×287 [with d=315 and f=2017×40-270×287], but the opposed quantity b þ a is not defined, apart from (2018-39) þ 0. This however brings a resolution since

(2018-39) þ 0 = 2017×39 and (2018-39+2017) þ 2017 = 2017×39+2017 = 2017×40

leading to f=2018-39+2017=3996.

non-Bayesian decision riddle

Posted in Statistics with tags , , , on June 22, 2017 by xi'an

As a continuation of the Bayesian resolution of last week riddle, I looked at a numeric resolution of the four secretaries problem, while in the train back from Rouen (and trying to block the chatter of my neighbours, a nuisance I find myself more and more sensitive to!). The target function is defined as

gainz=function(b,c,T=1e4,type="raw"){
  x=matrix(runif(4*T),ncol=4)
  maz=t(apply(x,1,cummax))
  zam=t(apply(x[,4:1],1,cummax))
  if (type=="raw"){return(mean(
   ((x[,2]>b*x[,1])*x[,2]+
    (x[,2]<b*x[,1])*((x[,3]>c*maz[,2])*x[,3]+
        (x[,3]<c*maz[,2])*x[,4]))/maz[,4]))} 
  if (type=="global"){return(mean( 
    ((x[,2]>b*x[,1])*(x[,2]==maz[,4])+
     (x[,2]<b*x[,1])*((x[,3]>c*maz[,2])*(x[,3]==maz[,4])+
         (x[,3]<c*maz[,2])*(x[,4]==maz[,4])))))} 
  if (type=="remain"){return(mean( 
    ((x[,2]>b*x[,1])*(x[,2]==zam[,3])+
     (x[,2]<b*x[,1])*((x[,3]>c*maz[,2])*(x[,3]==zam[,2])+
          (x[,3]<c*maz[,2])*(x[,4]==zam[,2])))))}}

where the data is generated from a U(0,1) distribution as the loss functions are made scaled free by deciding to always sacrifice the first draw, x¹. This function is to be optimised in (b,c) and hence I used a plain vanilla simulated annealing R code:

avemale=function(T=3e4,type){
  b=c=.5
  maxtar=targe=gainz(b,c,T=1e4,type)
  temp=0.1
  for (t in 1:T){
    bp=b+runif(1,-temp,temp)
    cp=c+runif(1,-temp,temp)
    parge=(bp>0)*(cp>0)*gainz(bp,cp,T=1e4,type)
    if (parge>maxtar){
      b=bs=bp;c=cs=cp;maxtar=targe=parge}else{
    if (runif(1)<exp((parge-targe)/temp)){
      b=bp;c=cp;targe=parge}}
    temp=.9999*temp}
  return(list(bs=bs,cs=cs,max=maxtar))}

with outcomes

  • b=1, c=.5, and optimum 0.8 for the raw type
  • b=c=1 and optimum 0.45 for the global type
  • b undefined, c=2/3 and optimum 0.75 for the remain type