Archive for the R Category

Le Monde puzzle [#1063]

Posted in Books, Kids, R with tags , , , , , , on August 9, 2018 by xi'an

lemondapariA simple (summertime?!) arithmetic Le Monde mathematical puzzle

  1. A “powerful integer” is such that all its prime divisors are at least with multiplicity 2. Are there two powerful integers in a row, i.e. such that both n and n+1 are powerful?
  2.  Are there odd integers n such that n² – 1 is a powerful integer ?

The first question can be solved by brute force.  Here is a R code that leads to the solution:

isperfz <- function(n){ 
  divz=primeFactors(n) 
  facz=unique(divz) 
  ordz=rep(0,length(facz)) 
  for (i in 1:length(facz)) 
    ordz[i]=sum(divz==facz[i]) 
  return(min(ordz)>1)}

lesperf=NULL
for (t in 4:1e5)
if (isperfz(t)) lesperf=c(lesperf,t)
twinz=lesperf[diff(lesperf)==1]

with solutions 8, 288, 675, 9800, 12167.

The second puzzle means rerunning the code only on integers n²-1…

[1] 8
[1] 288
[1] 675
[1] 9800
[1] 235224
[1] 332928
[1] 1825200
[1] 11309768

except that I cannot exceed n²=10⁸. (The Le Monde puzzles will now stop for a month, just like about everything in France!, and then a new challenge will take place. Stay tuned.)

Le Monde puzzle [#1062]

Posted in Books, Kids, pictures, R with tags , , , , , on July 28, 2018 by xi'an

lemondapariA simple Le Monde mathematical puzzle none too geometric:

  1. Find square triangles which sides are all integers and which surface is its perimeter.
  2. Extend to non-square rectangles.

No visible difficulty by virtue of Pythagore’s formula:

for (a in 1:1e4)
for (b in a:1e4)
  if (a*b==2*(a+b+round(sqrt(a*a+b*b)))) print(c(a,b))

produces two answers

 5 12
 6  8

and in the more general case, Heron’s formula to the rescue!,

for (a in 1:1e2)
for (b in a:1e2)
for (z in b:1e2){
  s=(a+b+z)/2
  if (abs(4*s-abs((s-a)*(s-b)*(s-z)))<1e-4) print(c(a,b,z))}

returns

 4 15 21
 5  9 16
 5 12 13
 6  7 15
 6  8 10
 6 25 29
 7 15 20
 9 10 17

Le Monde puzzle [#1061]

Posted in Books, Kids, R with tags , , , , , on July 20, 2018 by xi'an

lemondapariA griddy Le Monde mathematical puzzle:

  1. On a 4×5 regular grid, find how many nodes need to be turned on to see all 3×4 squares to have at least one active corner in case of one arbitrary node failing.
  2.  Repeat for a 7×9 grid.

The question is open to simulated annealing, as in the following R code:

n=3;m=4;np=n+1;mp=m+1

cvr=function(grue){
  grud=grue
  obj=(max(grue)==0)
  for (i in (1:length(grue))[grue==1]){
   grud[i]=0
   obj=max(obj,max((1-grud[-1,-1])*(1-grud[-np,-mp])*
       (1-grud[-np,-1])*(1-grud[-1,-mp])))
   grud[i]=1}
  obj=99*obj+sum(grue)
  return(obj)}

dumban=function(grid,T=1e3,temp=1,beta=.99){
   obj=bez=cvr(grid)
   sprk=grid
   for (t in 1:T){
     grue=grid
     if (max(grue)==1){ grue[sample(rep((1:length(grid))[grid==1],2),1)]=0
      }else{ grue[sample(1:(np*mp),np+mp)]=1}
     jbo=cvr(grue)
     if (bez>jbo){ bez=jbo;sprk=grue}
     if (log(runif(1))<(obj-jbo)/temp){
        grid=grue;obj=cvr(grid)}
     temp=temp*beta
     }
   return(list(top=bez,sol=sprk))}

leading to

>  dumban(grid,T=1e6,temp=100,beta=.9999)
$top
[1] 8
$sol
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    1    0    1    0
[2,]    0    1    0    1    0
[3,]    0    1    0    1    0
[4,]    0    1    0    1    0

which sounds like a potential winner.

Hamiltonian tails

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , on July 17, 2018 by xi'an

“We demonstrate HMC’s sensitivity to these parameters by sampling from a bivariate Gaussian with correlation coefficient 0.99. We consider three settings (ε,L) = {(0.16; 40); (0.16; 50); (0.15; 50)}” Ziyu Wang, Shakir Mohamed, and Nando De Freitas. 2013

In an experiment with my PhD student Changye Wu (who wrote all R codes used below), we looked back at a strange feature in an 2013 ICML paper by Wang, Mohamed, and De Freitas. Namely, a rather poor performance of an Hamiltonian Monte Carlo (leapfrog) algorithm on a two-dimensional strongly correlated Gaussian target, for very specific values of the parameters (ε,L) of the algorithm.

The Gaussian target associated with this sample stands right in the middle of the two clouds, as identified by Wang et al. And the leapfrog integration path for (ε,L)=(0.15,50)

keeps jumping between the two ridges (or tails) , with no stop in the middle. Changing ever so slightly (ε,L) to (ε,L)=(0.16,40) does not modify the path very much

but the HMC output is quite different since the cloud then sits right on top of the target

with no clear explanation except for a sort of periodicity in the leapfrog sequence associated with the velocity generated at the start of the code. Looking at the Hamiltonian values for (ε,L)=(0.15,50)

and for (ε,L)=(0.16,40)

does not help, except to point at a sequence located far in the tails of this Hamiltonian, surprisingly varying when supposed to be constant. At first, we thought the large value of ε was to blame but much smaller values still return poor convergence performances. As below for (ε,L)=(0.01,450)

a thread to bin them all [puzzle]

Posted in Books, Kids, R, Travel with tags , , , , , , , , on July 9, 2018 by xi'an

The most recent riddle on the Riddler consists in finding the shorter sequence of digits (in 0,1,..,9) such that all 10⁴ numbers between 0 (or 0000) and 9,999 can be found as a group of consecutive four digits. This sequence is obviously longer than 10⁴+3, but how long? On my trip to Brittany last weekend, I wrote an R code first constructing the sequence at random by picking with high preference the next digit among those producing a new four-digit number

tenz=10^(0:3)
wn2dg=function(dz) 1+sum(dz*tenz)

seqz=rep(0,10^4)
snak=wndz=sample(0:9,4,rep=TRUE)
seqz[wn2dg(wndz)]=1
while (min(seqz)==0){
  wndz[1:3]=wndz[-1];wndz[4]=0
  wndz[4]=sample(0:9,1,prob=.01+.99*(seqz[wn2dg(wndz)+0:9]==0))
  snak=c(snak,wndz[4])
  sek=wn2dg(wndz)
  seqz[sek]=seqz[sek]+1}

which usually returns a value above 75,000. I then looked through the sequence to eliminate useless replicas

for (i in sample(4:(length(snak)-5))){
 if ((seqz[wn2dg(snak[(i-3):i])]>1)
  &(seqz[wn2dg(snak[(i-2):(i+1)])]>1)
  &(seqz[wn2dg(snak[(i-1):(i+2)])]>1)
  &(seqz[wn2dg(snak[i:(i+3)])]>1)){
   seqz[wn2dg(snak[(i-3):i])]=seqz[wn2dg(snak[(i-3):i])]-1
   seqz[wn2dg(snak[(i-2):(i+1)])]=seqz[wn2dg(snak[(i-2):(i+1)])]-1
   seqz[wn2dg(snak[(i-1):(i+2)])]=seqz[wn2dg(snak[(i-1):(i+2)])]-1
   seqz[wn2dg(snak[i:(i+3)])]=seqz[wn2dg(snak[i:(i+3)])]-1
   snak=snak[-i]
   seqz[wn2dg(snak[(i-3):i])]=seqz[wn2dg(snak[(i-3):i])]+1
   seqz[wn2dg(snak[(i-2):(i+1)])]=seqz[wn2dg(snak[(i-2):(i+1)])]+1
   seqz[wn2dg(snak[(i-1):(i+2)])]=seqz[wn2dg(snak[(i-1):(i+2)])]+1}}

until none is found. A first attempt produced 12,911 terms in the sequence. A second one 12,913. A third one 12,871. Rather consistent figures but not concentrated enough to believe in achieving a true minimum. An overnight run produced 12,779 as the lowest value. Checking the answer the week after, it appears that 10⁴+3 is the correct answer!

seven digit addition

Posted in Kids, R with tags , , , on July 6, 2018 by xi'an

Another quick riddle from the riddler: solve the equation

EXMREEK + EHKREKK = ?K?H?X?E

which involves every digit between 0 and 9. While the puzzle can be unravelled by considering first E and K, which must be equal to 6 and 3, a simple R code also leads to the conclusion

isok <- function(a,b){
 s=as.numeric(unlist(strsplit(as.character(sum(10^(6:0)*a)+
   sum(10^(6:0)*b)),"")))
 if (length(s)==7){ goal=FALSE}else{
   goal=(length(unique(c(a,b,s)))==10)&(a[2]==s[6])&
         (s[8]==a[6])&(s[2]==a[7])&(b[2]==s[4])}
 return(goal)}

pasok <- function(T=1e3){ 
for (t in 1:T){ 
  a[1]=a[5]=a[6]=6;a[7]=3 
  digs=sample(c(0:2,4,5,7:9),4) 
  a[2:4]=digs[1:3] b[1]=a[1];b[2]=digs[4];
  b[3]=a[7];b[4]=a[4];b[5]=a[1];b[6:7]=a[7] 
  if (isok(a=a,b=b)) 
     print(rbind(a,b))}} 

> pasok()
  [,1] [,2] [,3] [,4] [,5] [,6] [,7]
a    6    2    4    7    6    6    3
b    6    8    3    7    6    3    3

which sum is 13085296.

hitting a wall

Posted in Books, Kids, R, Statistics, University life with tags , , , , , on July 5, 2018 by xi'an

Once in a while, or a wee bit more frequently (!), it proves impossible to communicate with a contributor of a question on X validated. A recent instance was about simulating from a multivariate kernel density estimate where the kernel terms at x¹,x²,… are Gaussian kernels applied to the inverses of the norms |x-x¹|, |x-x²|,… rather than to the norms as in the usual formulation. The reason for using this type of kernel is unclear, as it certainly does not converge to an estimate of the density of the sample x¹,x²,…  as the sample size grows, since it excludes a neighbourhood of each point in the sample. Since the kernel term tends to a non-zero constant at infinity, the support of the density estimate is restricted to the hypercube [0,1]x…x[0,1], again with unclear motivations. No mention being made of the bandwidth adopted for this kernel. If one takes this exotic density as a given, the question is rather straightforward as the support is compact, the density bounded and a vanilla accept-reject can be implemented. As illustrated by the massive number of comments on that entry, it did not work as the contributor adopted a fairly bellicose attitude about suggestions from moderators on that site and could not see the point in our requests for clarification, despite plotting a version of the kernel that had its maximum [and not its minimum] at x¹… After a few attempts, including writing a complete answer, from which the above graph is taken (based on an initial understanding of the support being for (x-x¹), …), I gave up and deleted all my entries.On that question.