Archive for R

Le Monde puzzle [#1111]

Posted in R, Statistics with tags , , , , , , on September 18, 2019 by xi'an

Another low-key arithmetic problem as Le Monde current mathematical puzzle:

Notice that there are 10 numbers less than, and prime with 11, 100 less than and prime with 101, 1000 less than, and prime with 1111? What is the smallest integer N such that the cardinal of the set of M<N prime with N is 10⁴? What is the smallest multiple of 1111 using only different digits? And what is the largest?

As it is indeed a case for brute force resolution as in the following R code

  for (i in 2:(n-1)) many=many+coprime(i,n)

  while (many!=1e4) many=homanycoprim(n<-n+1)

which returns n=10291 as the smallest value of N.  For the other two questions, the usual integer2digit function is necessary

  while (mul<1e6) {
    x=as.numeric(strsplit(as.character(n*mul), "")[[1]])
    while (sum(duplicated(x))!=0){
     x=as.numeric(strsplit(as.character(n*mul), "")[[1]])}

leading to 241,087 as the smallest and 9,875,612,340 as the largest (with 10 digits).

Le Monde puzzle [#1110]

Posted in Books, Kids, R, Travel with tags , , , , , , , , , on September 16, 2019 by xi'an

A low-key sorting problem as Le Monde current mathematical puzzle:

If the numbers from 1 to 67 are randomly permuted and if the sorting algorithm consists in picking a number i with a position higher than its rank i and moving it at the correct i-th position, what is the maximal number of steps to sort this set of integers when the selected integer is chosen optimaly?

As the question does not clearly state what happens to the number j that stood in the i-th position, I made the assumption that the entire sequence from position i to position n is moved one position upwards (rather than having i and j exchanged). In which case my intuition was that moving the smallest moveable number was optimal, resulting in the following R code

sorite<-function(permu){ n=length(permu) p=0 while(max(abs(permu-(1:n)))>0){

which takes at most n-1 steps to reorder the sequence. I however checked this insertion sort was indeed the case through a recursive function

resorite<-function(permu){ n=length(permu);p=0 while(max(abs(permu-(1:n)))>0){
    if (length(cand)==1){
      for (i in cand){
        if (rol<sol)sol=rol}

which did confirm my intuition.

R puzzle

Posted in Statistics with tags , , on September 12, 2019 by xi'an

Can you guess the meaning of the following R code


If not (!), the explanation is provided in Robin’s answer to a codegolf puzzle.

R wins COPSS Award!

Posted in Statistics with tags , , , , , , , , on August 4, 2019 by xi'an

Hadley Wickham from RStudio has won the 2019 COPSS Award, which expresses a rather radical switch from the traditional recipient of this award in that this recognises his many contributions to the R language and in particular to RStudio. The full quote for the nomination is his  “influential work in statistical computing, visualisation, graphics, and data analysis” including “making statistical thinking and computing accessible to a large audience”. With the last part possibly a recognition of the appeal of Open Source… (I was not in Denver for the awards ceremony, having left after the ABC session on Monday morning. Unfortunately, this session only attracted a few souls, due to the competition of twentysome other sessions, including, excusez du peu!, David Dunson’s Medallion Lecture and Michael Lavine’s IOL on the likelihood principle. And Marco Ferreira’s short-course on Bayesian time series. This is the way the joint meeting goes, but it is disappointing to reach so few people.)

Gibbs sampling with incompatible conditionals

Posted in Books, Kids, R, Statistics with tags , , , , , , on July 23, 2019 by xi'an

An interesting question (with no clear motivation) on X validated wondering why a Gibbs sampler produces NAs… Interesting because multi-layered:

  1. The attached R code indeed produces NAs because it calls the Negative Binomial Neg(x¹,p) random generator with a zero success parameter, x¹=0, which automatically returns NAs. This can be escaped by returning a one (1) instead.
  2. The Gibbs sampler is based on a Bin(x²,p) conditional for X¹ and a Neg(x¹,p) conditional for X². When using the most standard version of the Negative Binomial random variate as the number of failures, hence supported on 0,1,2…. these two conditionals are incompatible, i.e., there cannot be a joint distribution behind that returns these as conditionals, which makes the limiting behaviour of the Markov chain harder to study. It however seems to converge to a distribution close to zero, which is not contradictory with the incompatibility property: the stationary joint distribution simply does not enjoy the conditionals used by the Gibbs sampler as its conditionals.
  3. When using the less standard version of the Negative Binomial random variate understood as a number of attempts for the conditional on X², the two conditionals are compatible and correspond to a joint measure proportional to x_1^{-1} {x_1 \choose x_2} p^{x_2} (1-p)^{x_1-x_2}, however this pmf does not sum up to a finite quantity (as in the original Gibbs for Kids example!), hence the resulting Markov chain is at best null recurrent, which seems to be the case for p different from ½. This is unclear to me for p=½.

CRAN does not validate R packages!

Posted in pictures, R, University life with tags , , , , , , , , , , on July 10, 2019 by xi'an

A friend called me the other day for advice on how to submit an R package to CRAN along with a proof his method was mathematically sound. I replied with some items of advice taken from my (limited) experience with submitting packages. And with the remark that CRAN would not validate the mathematical contents of the associated package manual. Nor even the validity of the R code towards delivering the right outcome as stated in the manual. This shocked him quite seriously as he thought having a package accepted by CRAN was a stamp of validation of both the method and the R code. It would be nice of course but would require so much manpower that it seems unrealistic. Some middle ground is to aim at a journal or a peer community validation where both code and methods are vetted. Which happens for instance with the Journal of Computational and Graphical Statistics. Or the Journal of Statistical Software (which should revise its instructions to authors that states “The majority of software published in JSS is written in S, MATLAB, SAS/IML, C++, or Java”. S, really?!)

As for the validity of the latest release of R (currently R-3.6.1 which came out on 2019-07-05, named Action of the Toes!), I figure the bazillion R programs currently running should be able to detect any defect pretty fast, although awareness of the incredible failure of sample() reported in an earlier post took a while to appear.

Le Monde puzzle [#1105]

Posted in Kids, R with tags , , , , , , on July 8, 2019 by xi'an

Another token game as Le Monde mathematical puzzle:

Archibald and Beatrix play with a pile of n>100 tokens, sequentially picking m tokens from the pile with m being a prime number [including m=1] or a multiple of 6, the winner taking the last tokens. If Beatrix knows n and proposes to Archibald to start, what is the value of n?

Which cannot be solved in a few lines of R code:

  if(k(i)) g=c(g,i) else{
  if(o) g=c(g,i) else n=c(n,i)}

since it returned no unsuccessful value above 100! With 4, 8, 85, 95, and 99 as predecessors. A rather surprising outcome and a big gap that most certainly has a straightforward explanation! Or a lack of understanding from yours truly: this post appears after the solution was published in Le Monde and I am more bemused than ever since the losing numbers in the journal are given as 4, 8, 85, … 89, and 129. With the slight hiccup that 89 is a prime number…. The other argument in the solution that there can only be five such losers is well-taken since there are only five possible non-zero remainders in the division by 6.