Archive for the R Category

Le Monde puzzle [#1112]

Posted in Books, Kids, R with tags , , , on October 3, 2019 by xi'an

Another low-key arithmetic problem as Le Monde current mathematical puzzle:

Find the 16 integers x¹,x²,x³,x⁴,y¹,y²,y³,y⁴,z¹,z²,z³,z⁴,w¹,w²,w³,w⁴ such that the groups x¹,y¹,z¹,w¹, &tc., are made of distinct positive integers, the sum of the x’s is 24, of the y’s 20, of the z’s 19 and of the w’s 17. Furthermore, x¹<max(y¹,z¹,w¹), x²<max(y²,z²,w²), x³<max(y³,z³,w³)=z³, and x⁴<max(y⁴,z⁴,w⁴), while x¹<x².

There are thus 4 x 3 unknowns, all bounded by either 20-3=17 or 19-3=16 for x³,y³,z³. It is then a case for brute force resolution since drawing all quadruplets by rmultinom until all conditions are satisfied

valid=function(x,y,z,w){
 (z[3]>max(y[3],w[3]))&(x[1]<x[2])&(sum(x)==24)&
 (sum(y)==20)&(sum(z)==19)&(sum(w)==17)}

returns quickly several solutions. Meaning I misread the question and missed the constraint that the four values at each step were the same up to a permutation, decreasing the number of unknowns to four, a,b,c,d (ordered). And then three because the sum of the four is 20, average of the four sums. It seems to me that the first sum of x’s being 24 and involving only a,b, and c implies that 4c is larger than 24, ie c>6, hence d>7, while a>0 and b>1, leaving only two degrees of freedom for choosing the four values, meaning that only

  1 2 7 10
1 2 8 9
1 3 7 9
1 4 7 8
2 3 7 8

are possible. Sampling at random across these possible choices and allocating the numbers at random to x,y,z, and w leads rather quickly to the solution

     [,1] [,2] [,3] [,4]
[1,]    3    7    7    7
[2,]    2    8    2    8
[3,]    7    2    8    2
[4,]    8    3    3    3

poor statistics

Posted in Books, pictures, R, Statistics, Travel, Wines with tags , , , , , , , , , , , , on September 24, 2019 by xi'an

I came over the weekend across this graph and the associated news that the county of Saint-Nazaire, on the southern border of Brittany, had a significantly higher rate of cancers than the Loire countries. The complete study written by Solenne Delacour, Anne Cowppli-Bony, amd Florence Molinié, is quite cautious about the reasons for this higher rate, even using a Bayesian Poisson-Gamma smoothing (and the R package empbaysmooth), and citing the 1991 paper by Besag, York and Mollié, but the local and national medias are quick to blame the local industries for the difference. The graph above is particularly bad in that it accumulates mortality causes that are not mutually exclusive or independent. For instance, the much higher mortality rate due to alcohol is obviously responsible for higher rates of most other entries. And indicates a sociological pattern that may or may not be due to the type of job in the area, but differs from the more rural other parts of the Loire countries. (Which, like Brittany, are already significantly above (50%) the national reference for alcohol related health issues.), and may not be strongly connected to exposition to chemicals. For instance, the rates of pulmonary cancers are mostly comparable to the national average, if higher than the rest of the Loire countries and connect with a high smoking propensity. Lymphomas are not significantly different from the regional reference. The only type of cancer that can be directly attributed to working conditions are the mesothelioma, mostly caused by asbestos exposure, which was used in ship building, a specialty of the area. Among the many possible reasons for the higher mortality of the county, the study mentions a lower exposure to medical testings (connected with the sociological composition of the area). Which would indicate the most effective policies for lowering these higher cancer and mortality rates.

Le Monde puzzle [#1111]

Posted in R, Statistics with tags , , , , , , on September 18, 2019 by xi'an

Another low-key arithmetic problem as Le Monde current mathematical puzzle:

Notice that there are 10 numbers less than, and prime with 11, 100 less than and prime with 101, 1000 less than, and prime with 1111? What is the smallest integer N such that the cardinal of the set of M<N prime with N is 10⁴? What is the smallest multiple of 1111 using only different digits? And what is the largest?

As it is indeed a case for brute force resolution as in the following R code

library(numbers)
homanycoprim=function(n){
  many=1
  for (i in 2:(n-1)) many=many+coprime(i,n)
  return(many)}

smallest=function(){
  n=1e4
  many=homanycoprim(n)
  while (many!=1e4) many=homanycoprim(n<-n+1)
  return(n)}

which returns n=10291 as the smallest value of N.  For the other two questions, the usual integer2digit function is necessary

smalldiff=function(){
  n=1111;mul=1
  while (mul<1e6) {
    x=as.numeric(strsplit(as.character(n*mul), "")[[1]])
    while (sum(duplicated(x))!=0){
     mul=mul+1
     x=as.numeric(strsplit(as.character(n*mul), "")[[1]])}
  print(n*mul);mul=mul+1}}

leading to 241,087 as the smallest and 9,875,612,340 as the largest (with 10 digits).

Le Monde puzzle [#1110]

Posted in Books, Kids, R, Travel with tags , , , , , , , , , on September 16, 2019 by xi'an

A low-key sorting problem as Le Monde current mathematical puzzle:

If the numbers from 1 to 67 are randomly permuted and if the sorting algorithm consists in picking a number i with a position higher than its rank i and moving it at the correct i-th position, what is the maximal number of steps to sort this set of integers when the selected integer is chosen optimaly?

As the question does not clearly state what happens to the number j that stood in the i-th position, I made the assumption that the entire sequence from position i to position n is moved one position upwards (rather than having i and j exchanged). In which case my intuition was that moving the smallest moveable number was optimal, resulting in the following R code

sorite<-function(permu){ n=length(permu) p=0 while(max(abs(permu-(1:n)))>0){
    j=min(permu[permu<(1:n)])
    p=p+1
    permu=unique(c(permu[(1:n)<j],j,permu[j:n]))} 
  return(p)}

which takes at most n-1 steps to reorder the sequence. I however checked this insertion sort was indeed the case through a recursive function

resorite<-function(permu){ n=length(permu);p=0 while(max(abs(permu-(1:n)))>0){
    j=cand=permu[permu<(1:n)]
    if (length(cand)==1){
      p=p+1
      permu=unique(c(permu[(1:n)<j],j,permu[j:n]))
    }else{
      sol=n^3
      for (i in cand){
        qermu=unique(c(permu[(1:n)<i],i,permu[i:n]))
        rol=resorite(qermu)
        if (rol<sol)sol=rol}
      p=p+1+sol;break()}} 
  return(p)}

which did confirm my intuition.

Le Monde puzzle [#1109]

Posted in Kids, R with tags , , , , , on September 4, 2019 by xi'an

A digital problem as Le Monde current mathematical puzzle:

Noble numbers are such that they only involve different digits and are multiple of all their digits. What is the largest noble number?

Hmmmm…. Brute force? Since the maximal number of digits is 10, one may as well try:

k=soz=9
for (t in 1:1e3){
sol=1
while (sol<10^(k-1)){ 
 u=sample(0:8,k);i=digit2int(u) 
 if (max(i%%u[u>0])==0) soz=max(soz,sol<-i)}}

which returns 9875643120… (I made the conscious choice to exclude zero from the dividers. Which was not a choice spelled out in the original question.)

Gibbs sampling with incompatible conditionals

Posted in Books, Kids, R, Statistics with tags , , , , , , on July 23, 2019 by xi'an

An interesting question (with no clear motivation) on X validated wondering why a Gibbs sampler produces NAs… Interesting because multi-layered:

  1. The attached R code indeed produces NAs because it calls the Negative Binomial Neg(x¹,p) random generator with a zero success parameter, x¹=0, which automatically returns NAs. This can be escaped by returning a one (1) instead.
  2. The Gibbs sampler is based on a Bin(x²,p) conditional for X¹ and a Neg(x¹,p) conditional for X². When using the most standard version of the Negative Binomial random variate as the number of failures, hence supported on 0,1,2…. these two conditionals are incompatible, i.e., there cannot be a joint distribution behind that returns these as conditionals, which makes the limiting behaviour of the Markov chain harder to study. It however seems to converge to a distribution close to zero, which is not contradictory with the incompatibility property: the stationary joint distribution simply does not enjoy the conditionals used by the Gibbs sampler as its conditionals.
  3. When using the less standard version of the Negative Binomial random variate understood as a number of attempts for the conditional on X², the two conditionals are compatible and correspond to a joint measure proportional to x_1^{-1} {x_1 \choose x_2} p^{x_2} (1-p)^{x_1-x_2}, however this pmf does not sum up to a finite quantity (as in the original Gibbs for Kids example!), hence the resulting Markov chain is at best null recurrent, which seems to be the case for p different from ½. This is unclear to me for p=½.

a non-riddle

Posted in Books, Kids, R with tags , , on July 12, 2019 by xi'an

Unless I missed a point in the last riddle from the Riddler, there is very little to say about it:

Given N ocre balls, N aquamarine balls, and two urns, what is the optimal way to allocate the balls to the urns towards drawing an ocre ball with no urn being empty?

Both my reasoning and a two line exploration code led to having one urn with only one ocre ball (and no acquamarine ball) and all the other balls in the second urn.

odz<-function(n,m,t) 2*m/n+(t-2*m)/(t-n)
probz=matrix(0,trunc(N/2)-1,N-1)
for (n in 1:(N-1))
  for (m in 1:(trunc(N/2)-1))
    probz[m,n]=odz(n,m,N)