Archive for the R Category

Gibbs sampling with incompatible conditionals

Posted in Books, Kids, R, Statistics with tags , , , , , , on July 23, 2019 by xi'an

An interesting question (with no clear motivation) on X validated wondering why a Gibbs sampler produces NAs… Interesting because multi-layered:

  1. The attached R code indeed produces NAs because it calls the Negative Binomial Neg(x¹,p) random generator with a zero success parameter, x¹=0, which automatically returns NAs. This can be escaped by returning a one (1) instead.
  2. The Gibbs sampler is based on a Bin(x²,p) conditional for X¹ and a Neg(x¹,p) conditional for X². When using the most standard version of the Negative Binomial random variate as the number of failures, hence supported on 0,1,2…. these two conditionals are incompatible, i.e., there cannot be a joint distribution behind that returns these as conditionals, which makes the limiting behaviour of the Markov chain harder to study. It however seems to converge to a distribution close to zero, which is not contradictory with the incompatibility property: the stationary joint distribution simply does not enjoy the conditionals used by the Gibbs sampler as its conditionals.
  3. When using the less standard version of the Negative Binomial random variate understood as a number of attempts for the conditional on X², the two conditionals are compatible and correspond to a joint measure proportional to x_1^{-1} {x_1 \choose x_2} p^{x_2} (1-p)^{x_1-x_2}, however this pmf does not sum up to a finite quantity (as in the original Gibbs for Kids example!), hence the resulting Markov chain is at best null recurrent, which seems to be the case for p different from ½. This is unclear to me for p=½.

a non-riddle

Posted in Books, Kids, R with tags , , on July 12, 2019 by xi'an

Unless I missed a point in the last riddle from the Riddler, there is very little to say about it:

Given N ocre balls, N aquamarine balls, and two urns, what is the optimal way to allocate the balls to the urns towards drawing an ocre ball with no urn being empty?

Both my reasoning and a two line exploration code led to having one urn with only one ocre ball (and no acquamarine ball) and all the other balls in the second urn.

odz<-function(n,m,t) 2*m/n+(t-2*m)/(t-n)
for (n in 1:(N-1))
  for (m in 1:(trunc(N/2)-1))

CRAN does not validate R packages!

Posted in pictures, R, University life with tags , , , , , , , , , , on July 10, 2019 by xi'an

A friend called me the other day for advice on how to submit an R package to CRAN along with a proof his method was mathematically sound. I replied with some items of advice taken from my (limited) experience with submitting packages. And with the remark that CRAN would not validate the mathematical contents of the associated package manual. Nor even the validity of the R code towards delivering the right outcome as stated in the manual. This shocked him quite seriously as he thought having a package accepted by CRAN was a stamp of validation of both the method and the R code. It would be nice of course but would require so much manpower that it seems unrealistic. Some middle ground is to aim at a journal or a peer community validation where both code and methods are vetted. Which happens for instance with the Journal of Computational and Graphical Statistics. Or the Journal of Statistical Software (which should revise its instructions to authors that states “The majority of software published in JSS is written in S, MATLAB, SAS/IML, C++, or Java”. S, really?!)

As for the validity of the latest release of R (currently R-3.6.1 which came out on 2019-07-05, named Action of the Toes!), I figure the bazillion R programs currently running should be able to detect any defect pretty fast, although awareness of the incredible failure of sample() reported in an earlier post took a while to appear.

Le Monde puzzle [#1105]

Posted in Kids, R with tags , , , , , , on July 8, 2019 by xi'an

Another token game as Le Monde mathematical puzzle:

Archibald and Beatrix play with a pile of n>100 tokens, sequentially picking m tokens from the pile with m being a prime number [including m=1] or a multiple of 6, the winner taking the last tokens. If Beatrix knows n and proposes to Archibald to start, what is the value of n?

Which cannot be solved in a few lines of R code:

  if(k(i)) g=c(g,i) else{
  if(o) g=c(g,i) else n=c(n,i)}

since it returned no unsuccessful value above 100! With 4, 8, 85, 95, and 99 as predecessors. A rather surprising outcome and a big gap that most certainly has a straightforward explanation! Or a lack of understanding from yours truly: this post appears after the solution was published in Le Monde and I am more bemused than ever since the losing numbers in the journal are given as 4, 8, 85, … 89, and 129. With the slight hiccup that 89 is a prime number…. The other argument in the solution that there can only be five such losers is well-taken since there are only five possible non-zero remainders in the division by 6.

Le Monde puzzle [#1104]

Posted in Kids, R with tags , , , , on June 18, 2019 by xi'an

A palindromic Le Monde mathematical puzzle:

In a monetary system where all palindromic amounts between 1 and 10⁸ have a coin, find the numbers less than 10³ that cannot be paid with less than three coins. Find if 20,191,104 can be paid with two coins. Similarly, find if 11,042,019 can be paid with two or three coins.

Which can be solved in a few lines of R code:


and produces 9 amounts that cannot be paid with one or two coins.

21 32 43 54 65 76 87 98 201

It is also easy to check that three coins are enough to cover all amounts below 10³. For the second question, starting with n¹=20,188,102,  a simple downward search of palindromic pairs (n¹,n²) such that n¹+n²=20,188,102 led to n¹=16,755,761 and n²=3,435,343. And starting with 11,033,011, the same search does not produce any solution, while there are three coins such that n¹+n²+n³=11,042,019, for instance n¹=11,022,011, n²=20,002, and n³=6.

another attempt at code golf

Posted in Books, Kids, R with tags , , , on June 12, 2019 by xi'an

I had another lazy weekend go at code golf, trying to code in the most condensed way the following task. Provided with a square matrix A of positive integers, keep iterating the steps

  • take the highest square 𝑥² in A.
  • find the smallest adjacent neighbour 𝑛
  • replace with x and n with nx

until no square is left (with neighbour defined as either horizontally or vertically and without wrapping around). While I managed a 217 bytes solution, compared with Robin’s 179b improvement, which remains surprising readable!, the puzzle offers two further questions:

  1. is there a non-iterative way to find the final matrix B?
  2. the puzzle assumes that A satisfies that at each step, the highest square and the smallest neighbour n will be unique, and that the sequence will not repeat forever. Is there a fool-proof way to check this is the case?


riddles on Egyptian fractions and Bernoulli factories

Posted in Books, Kids, R with tags , , , , , , , , , , , , , on June 11, 2019 by xi'an

Two fairy different riddles on the weekend Riddler. The first one is (in fine) about Egyptian fractions: I understand the first one as

Find the Egyptian fraction decomposition of 2 into 11 distinct unit fractions that maximises the smallest fraction.

And which I cannot solve despite perusing this amazing webpage on Egyptian fractions and making some attempts at brute force  random exploration. Using Fibonacci’s greedy algorithm. I managed to find such decompositions

2 = 1 +1/2 +1/6 +1/12 +1/16 +1/20 +1/24 +1/30 +1/42 +1/48 +1/56

after seeing in this short note

2 = 1 +1/3 +1/5 +1/7 +1/9 +1/42 +1/15 +1/18 +1/30 +1/45 +1/90

And then Robin came with the following:

2 = 1 +1/4 +1/5 +1/8 +1/10 +1/12 +1/15 +1/20 +1/21 +1/24 +1/28

which may prove to be the winner! But there is even better:

2 = 1 +1/5 +1/6 +1/8 +1/9 +1/10 +1/12 +1/15 +1/18 +1/20 +1/24

The second riddle is a more straightforward Bernoulli factory problem:

Given a coin with a free-to-choose probability p of head, design an experiment with a fixed number k of draws that returns three outcomes with equal probabilities.

For which I tried a brute-force search of all possible 3-partitions of the 2-to-the-k events for a range of values of p from .01 to .5 and for k equal to 3,4,… But never getting an exact balance between the three groups. Reading later the solution on the Riddler, I saw that there was an exact solution for 4 draws when


Augmenting the precision of my solver (by multiplying all terms by 100), I indeed found a difference of

> solver((3-sqrt(3*(4*sqrt(6)-9)))/6,ba=1e5)[1]
[1] 8.940697e-08

which means an error of 9 x 100⁻⁴ x 10⁻⁸, ie roughly 10⁻¹⁵.