Archive for the R Category

splitting a field by annealing

Posted in Kids, pictures, R, Statistics with tags , , , , , , , , on October 18, 2017 by xi'an

A recent riddle [from The Riddle] that I pondered about during a [long!] drive to Luxembourg last weekend was about splitting a square field into three lots of identical surface for a minimal length of separating wire… While this led me to conclude that the best solution was a T like separation, I ran a simulated annealing R code on my train trip to AutransValence, seemingly in agreement with this conclusion.I discretised the square into n² units and explored configurations by switching two units with different colours, according to a simulated annealing pattern (although unable to impose connectivity on the three regions!):

partz=matrix(1,n,n)
partz[,1:(n/3)]=2;partz[((n/2)+1):n,((n/3)+1):n]=3
#counting adjacent units of same colour 
nood=hood=matrix(4,n,n)
for (v in 1:n2) hood[v]=bourz(v,partz)
minz=el=sum(4-hood)
for (t in 1:T){
  colz=sample(1:3,2) #picks colours
  a=sample((1:n2)[(partz==colz[1])&(hood<4)],1)
  b=sample((1:n2)[(partz==colz[2])&(hood<4)],1) 
  partt=partz;partt[b]=colz[1];partt[a]=colz[2] 
#collection of squares impacted by switch 
  nood=hood 
  voiz=unique(c(a,a-1,a+1,a+n,a-n,b-1,b,b+1,b+n,b-n)) 
  voiz=voiz[(voiz>0)&(voiz<n2)] 
  for (v in voiz) nood[v]=bourz(v,partt) 
  if (nood[a]*nood[b]>0){
    difz=sum(nood)-sum(hood)
    if (log(runif(1))<difz^3/(n^3)*(1+log(10*rep*t)^3)){
      el=el-difz;partz=partt;hood=nood     
      if (el<minz){ minz=el;cool=partz}
  }}}

(where bourz computes the number of neighbours), which produces completely random patterns at high temperatures (low t) and which returns to the T configuration (more or less):if not always, as shown below:Once the (a?) solution was posted on The Riddler, it appeared that one triangular (Y) version proved better than the T one [if not started from corners], with a gain of 3% and that a curved separation was even better with an extra gain less than 1% [solution that I find quite surprising as straight lines should improve upon curved ones…]

Astrostatistics school

Posted in Mountains, pictures, R, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , , , on October 17, 2017 by xi'an

What a wonderful week at the Astrostat [Indian] summer school in Autrans! The setting was superb, on the high Vercors plateau overlooking both Grenoble [north] and Valence [west], with the colours of the Fall at their brightest on the foliage of the forests rising on both sides of the valley and a perfect green on the fields at the centre, with sun all along, sharp mornings and warm afternoons worthy of a late Indian summer, too many running trails [turning into X country ski trails in the Winter] to contemplate for a single week [even with three hours of running over two days], many climbing sites on the numerous chalk cliffs all around [but a single afternoon for that, more later in another post!]. And of course a group of participants eager to learn about Bayesian methodology and computational algorithms, from diverse [astronomy, cosmology and more] backgrounds, trainings and countries. I was surprised at the dedication of the participants travelling all the way from Chile, Péru, and Hong Kong for the sole purpose of attending the school. David van Dyk gave the first part of the school on Bayesian concepts and MCMC methods, Roberto Trotta the second part on Bayesian model choice and hierarchical models, and myself a third part on, surprise, surprise!, approximate Bayesian computation. Plus practicals on R.

As it happens Roberto had to cancel his participation and I turned for a session into Christian Roberto, presenting his slides in the most objective possible fashion!, as a significant part covered nested sampling and Savage-Dickey ratios, not exactly my favourites for estimating constants. David joked that he was considering postponing his flight to see me talk about these, but I hope I refrained from engaging into controversy and criticisms… If anything because this was not of interest for the participants. Indeed when I started presenting ABC through what I thought was a pedestrian example, namely Rasmus Baath’s socks, I found that the main concern was not running an MCMC sampler or a substitute ABC algorithm but rather an healthy questioning of the construction of the informative prior in that artificial setting, which made me quite glad I had planned to cover this example rather than an advanced model [as, e.g., one of those covered in the packages abc, abctools, or abcrf]. Because it generated those questions about the prior [why a Negative Binomial? why these hyperparameters? &tc.] and showed how programming ABC turned into a difficult exercise even in this toy setting. And while I wanted to give my usual warning about ABC model choice and argue for random forests as a summary selection tool, I feel I should have focussed instead on another example, as this exercise brings out so clearly the conceptual difficulties with what is taught. Making me quite sorry I had to leave one day earlier. [As did missing an extra run!] Coming back by train through the sunny and grape-covered slopes of Burgundy hills was an extra reward [and no one in the train commented about the local cheese travelling in my bag!]

 

[summer Astrostat school] room with a view [jatp]

Posted in Mountains, pictures, R, Running, Statistics, Travel, University life with tags , , , , , , , , , , on October 9, 2017 by xi'an

I just arrived in Autrans, on the Plateau du Vercors overlooking Grenoble and the view is fabulistic! Trees have started to turn red and yellow, the weather is very mild, and my duties are restricted to teaching ABC to a group of enthusiastic astronomers and cosmologists..! Second advanced course on ABC in the mountains this year, hard to beat (except by a third course). The surroundings are so serene and peaceful that I even conceded to install RStudio for my course! Instead of sticking to my favourite vim editor and line commands.

mea culpa!

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , on October 9, 2017 by xi'an

An entry about our Bayesian Essentials book on X validated alerted me to a typo in the derivation of the Gaussian posterior..! When deriving the posterior (which was left as an exercise in the Bayesian Core), I just forgot the term expressing the divergence between the prior mean and the sample mean. Mea culpa!!!

Le Monde puzzle [#1021]

Posted in Books, Kids, R with tags , , , , , on September 17, 2017 by xi'an

A puzzling Le Monde mathematical puzzle for which I could find no answer in the allotted time!:

A most democratic electoral system allows every voter to have at least one representative by having each of the N voters picking exactly m candidates among the M running candidates and setting the size n of the representative council towards this goal, prior to the votes. If there are M=25 candidates, m=10 choices made by the voters, and n=10 representatives, what is the maximal possible value of N? And if N=55,555 and M=33, what is the minimum value of n for which m=n is always possible?

I tried a brute force approach by simulating votes from N voters at random and attempting to find the minimal number of councillors for this vote, which only provides an upper bound of the minimum [for one vote], and a lower bound in the end [over all votes]. Something like

for (i in 1:N) votz[i,]=sample(1:M,n)
#exploration by majority
  remz=1:N;conz=NULL
  while (length(remz)>0){
    seatz=order(-hist(votz[remz,],
    breaks=(0:M)+0.5,plot=FALSE)$density)[1]
    conz=c(conz,seatz);nuremz=NULL
    for (v in remz)
      if (!(seatz%in%votz[v,])) nuremz=c(nuremz,v)
    remz=nuremz}
  solz=length(conz)
#exploration at random
   kandz=matrix(0,N,M)
   for (i in 1:N) kandz[i,votz[i,]]=1
   for (t in 1:1e3){
#random choice of councillors
    zz=sample(c(0,1),M,rep=TRUE)
    while (min(kandz%*%zz)!=1)
      zz=sample(c(0,1),M,rep=TRUE)
    solz=min(solz,sum(zz))
#random choice of remaining councillor per voter
    remz=1:N;conz=NULL
    while (length(remz)>0){
      seatz=sample(votz[remz[1],],1)
      conz=c(conz,seatz);nuremz=NULL
      for (i in remz)
        if (!(seatz%in%votz[i,])) nuremz=c(nuremz,i)
      remz=nuremz}
    solz=min(solz,length(conz))}
maxz=max(solz,maxz)}

which leads to a value near N=4050 for the first question, with 0% confidence… Obviously, the problem can be rephrased as a binary integer linear programming problem of the form

n= \max_A \min_{c;\,\min Ac=1}\mathbf{1}^\text{T}c

where A is the NxM matrix of votes and c is the vector of selected councillors. But I do not see a quick way to fix it!

Le Monde puzzle [#1020]

Posted in Books, Kids, R with tags , , , on September 15, 2017 by xi'an

A collection of liars in this Le Monde mathematical puzzle:

  1. A circle of 16 liars and truth-tellers is such that everyone states that their immediate neighbours are both liars. How many liars can there be?
  2. A circle of 12 liars and truth-tellers is such that everyone state that their immediate neighbours are one liar plus one truth-teller. How many liars can there be?
  3.  A circle of 8 liars and truth-tellers is such that four state that their immediate neighbours are one liar plus one truth-teller and four state that their immediate neighbours are both liars . How many liars can there be?

These questions can easily be solved by brute force simulation. For the first setting, using 1 to code truth-tellers and -1 liars, I simulate acceptable configurations as

tabz=rep(0,16)
tabz[1]=1 #at least one
tabz[2]=tabz[16]=-1
for (i in 3:15){
  if (tabz[i-1]==1){
   tabz[i]=-1}else{
   if (tabz[i+1]==-1){
    tabz[i]=1}else{
    if (tabz[i+1]==1){
     tabz[i]=-1}else{
     if (tabz[i-2]==-1){
      tabz[i]=1}else{
       tabz[i]=sample(c(-1,1),1)
}}}}}

which produces 8, 9, and 10 as possible (and obvious) values.

The second puzzle is associated with the similar R code

tabz=sample(c(-1,1),12,rep=TRUE)
rong=FALSE
while (!rong){
 for (i in sample(12)){
  if (tabz[i-1+12*(i==1)]*tabz[i%%12+1]==-1){
   tabz[i]=1}else{ 
   tabz[i]=sample(c(-1,1),1)}
  }
  rong=TRUE
  for (i in (1:12)[tabz==1])
    rong=rong&(tabz[i-1+12*(i==1)]*tabz[i%%12+1]==-1)
  if (rong){
   for (i in (1:12)[tabz==-1])
     rong=rong&(tabz[i-1+12*(i==1)]*tabz[i%%12+1]!=-1)
   }}

with numbers of liars (-1) either 12 (obvious) or 4.

The final puzzle is more puzzling in that figuring out the validating function (is an allocation correct?) took me a while, the ride back home plus some. I ended up with the following code that samples liars (-1) and thruth-seekers (1) at random, plus forces wrong and right answers (in 0,1,2) on these, and check for the number of answers of both types:

rong=FALSE
while (!rong){
 tabz=sample(c(-1,1),8,rep=TRUE) #truth
 tabz[1]=1;tabz[sample(2:8,1)]=-1
 tt=(1:8)[tabz==1];lr=(1:8)[tabz==-1]
 statz=rep(0,8) #stmt
 statz[tt]=(tabz[tt-1+8*(tt==1)]*tabz[tt%%8+1]==-1)+
           2*(tabz[tt-1+8*(tt==1)]+tabz[tt%%8+1]==-2)
 #answering 0 never works
 statz[lr]=2*(tabz[lr-1+8*(lr==1)]*tabz[lr%%8+1]==-1)+
          (tabz[lr-1+8*(lr==1)]+tabz[lr%%8+1]==-1)+
           sample(c(1,2),8,rep=TRUE)[lr]*
           (tabz[lr-1+8*(lr==1)]+tabz[lr%%8+1]==1)
 rong=(sum(statz==1)==4)&(sum(statz==2)==4)}

with solutions 3, 4, 5 and 6.

Le Monde puzzle [#1018]

Posted in Books, Kids, R with tags , , , , , on August 29, 2017 by xi'an

An arithmetic Le Monde mathematical puzzle (that first did not seem to involve R programming because of the large number of digits in the quantity involved):

An integer x with less than 100 digits is such that adding the digit 1 on both sides of x produces the integer 99x.  What are the last nine digits of x? And what are the possible numbers of digits of x?

The integer x satisfies the identity

10^{\omega+2}+10x+1=99x

where ω is the number of digits of x. This amounts to

10….01 = 89 x,

where there are ω zeros. Working with long integers in R could bring an immediate solution, but I went for a pedestrian version, handling each digit at a time and starting from the final one which is necessarily 9:

#multiply by 9
rap=0;row=NULL
for (i in length(x):1){
prud=rap+x[i]*9
row=c(prud%%10,row)
rap=prud%/%10}
row=c(rap,row)
#multiply by 80
rep=raw=0
for (i in length(x):1){
prud=rep+x[i]*8
raw=c(prud%%10,raw)
rep=prud%/%10}
#find next digit
y=(row[1]+raw[1]+(length(x)>1))%%10

returning

7 9 7 7 5 2 8 0 9

as the (only) last digits of x. The same code can be exploited to check that the complete multiplication produces a number of the form 10….01, hence to deduce that the length of x is either 21 or 65, with solutions

[1] 1 1 2 3 5 9 5 5 0 5 6 1 7 9 7 7 5 2 8 0 9
[1] 1 1 2 3 5 9 5 5 0 5 6 1 7 9 7 7 5 2 8 0 8 9 8 8 7 6 4 0 4 4 9 4 3 8 2 0 2 2
[39] 4 7 1 9 1 0 1 1 2 3 5 9 5 5 0 5 6 1 7 9 7 7 5 2 8 0 9

The maths question behind is to figure out the powers k of 10 such that

10^k\equiv -1 \text{ mod } (89)

For instance, 10²≡11 mod (89) and 11¹¹≡88 mod (89) leads to the first solution ω=21. And then, since 10⁴⁴≡1 mod (89), ω=21+44=65 is another solution…