Archive for the R Category

Le Monde puzzle [#1104]

Posted in Kids, R with tags , , , , on June 18, 2019 by xi'an

A palindromic Le Monde mathematical puzzle:

In a monetary system where all palindromic amounts between 1 and 10⁸ have a coin, find the numbers less than 10³ that cannot be paid with less than three coins. Find if 20,191,104 can be paid with two coins. Similarly, find if 11,042,019 can be paid with two or three coins.

Which can be solved in a few lines of R code:

coin=sort(c(1:9,(1:9)*11,outer(1:9*101,(0:9)*10,"+")))
amounz=sort(unique(c(coin,as.vector(outer(coin,coin,"+")))))
amounz=amounz[amounz<1e3]

and produces 9 amounts that cannot be paid with one or two coins.

21 32 43 54 65 76 87 98 201

It is also easy to check that three coins are enough to cover all amounts below 10³. For the second question, starting with n¹=20,188,102,  a simple downward search of palindromic pairs (n¹,n²) such that n¹+n²=20,188,102 led to n¹=16,755,761 and n²=3,435,343. And starting with 11,033,011, the same search does not produce any solution, while there are three coins such that n¹+n²+n³=11,042,019, for instance n¹=11,022,011, n²=20,002, and n³=6.

another attempt at code golf

Posted in Books, Kids, R with tags , , , on June 12, 2019 by xi'an

I had another lazy weekend go at code golf, trying to code in the most condensed way the following task. Provided with a square matrix A of positive integers, keep iterating the steps

  • take the highest square 𝑥² in A.
  • find the smallest adjacent neighbour 𝑛
  • replace with x and n with nx

until no square is left (with neighbour defined as either horizontally or vertically and without wrapping around). While I managed a 217 bytes solution, compared with Robin’s 179b improvement, which remains surprising readable!, the puzzle offers two further questions:

  1. is there a non-iterative way to find the final matrix B?
  2. the puzzle assumes that A satisfies that at each step, the highest square and the smallest neighbour n will be unique, and that the sequence will not repeat forever. Is there a fool-proof way to check this is the case?

 

riddles on Egyptian fractions and Bernoulli factories

Posted in Books, Kids, R with tags , , , , , , , , , , , , , on June 11, 2019 by xi'an

Two fairy different riddles on the weekend Riddler. The first one is (in fine) about Egyptian fractions: I understand the first one as

Find the Egyptian fraction decomposition of 2 into 11 distinct unit fractions that maximises the smallest fraction.

And which I cannot solve despite perusing this amazing webpage on Egyptian fractions and making some attempts at brute force  random exploration. Using Fibonacci’s greedy algorithm. I managed to find such decompositions

2 = 1 +1/2 +1/6 +1/12 +1/16 +1/20 +1/24 +1/30 +1/42 +1/48 +1/56

after seeing in this short note

2 = 1 +1/3 +1/5 +1/7 +1/9 +1/42 +1/15 +1/18 +1/30 +1/45 +1/90

And then Robin came with the following:

2 = 1 +1/4 +1/5 +1/8 +1/10 +1/12 +1/15 +1/20 +1/21 +1/24 +1/28

which may prove to be the winner! But there is even better:

2 = 1 +1/5 +1/6 +1/8 +1/9 +1/10 +1/12 +1/15 +1/18 +1/20 +1/24

The second riddle is a more straightforward Bernoulli factory problem:

Given a coin with a free-to-choose probability p of head, design an experiment with a fixed number k of draws that returns three outcomes with equal probabilities.

For which I tried a brute-force search of all possible 3-partitions of the 2-to-the-k events for a range of values of p from .01 to .5 and for k equal to 3,4,… But never getting an exact balance between the three groups. Reading later the solution on the Riddler, I saw that there was an exact solution for 4 draws when

p=\frac{3-\sqrt{3(4\sqrt{9}-6)}}{6}

Augmenting the precision of my solver (by multiplying all terms by 100), I indeed found a difference of

> solver((3-sqrt(3*(4*sqrt(6)-9)))/6,ba=1e5)[1]
[1] 8.940697e-08

which means an error of 9 x 100⁻⁴ x 10⁻⁸, ie roughly 10⁻¹⁵.

A precursor of ABC-Gibbs

Posted in Books, R, Statistics with tags , , , , , , , , , , on June 7, 2019 by xi'an

Following our arXival of ABC-Gibbs, Dennis Prangle pointed out to us a 2016 paper by Athanasios Kousathanas, Christoph Leuenberger, Jonas Helfer, Mathieu Quinodoz, Matthieu Foll, and Daniel Wegmann, Likelihood-Free Inference in High-Dimensional Model, published in Genetics, Vol. 203, 893–904 in June 2016. This paper contains a version of ABC Gibbs where parameters are sequentially simulated from conditionals that depend on the data only through small dimension conditionally sufficient statistics. I had actually blogged about this paper in 2015 but since then completely forgotten about it. (The comments I had made at the time still hold, already pertaining to the coherence or lack thereof of the sampler. I had also forgotten I had run an experiment of an exact Gibbs sampler with incoherent conditionals, which then seemed to converge to something, if not the exact posterior.)

All ABC algorithms, including ABC-PaSS introduced here, require that statistics are sufficient for estimating the parameters of a given model. As mentioned above, parameter-wise sufficient statistics as required by ABC-PaSS are trivial to find for distributions of the exponential family. Since many population genetics models do not follow such distributions, sufficient statistics are known for the most simple models only. For more realistic models involving multiple populations or population size changes, only approximately-sufficient statistics can be found.

While Gibbs sampling is not mentioned in the paper, this is indeed a form of ABC-Gibbs, with the advantage of not facing convergence issues thanks to the sufficiency. The drawback being that this setting is restricted to exponential families and hence difficult to extrapolate to non-exponential distributions, as using almost-sufficient (or not) summary statistics leads to incompatible conditionals and thus jeopardise the convergence of the sampler. When thinking a wee bit more about the case treated by Kousathanas et al., I am actually uncertain about the validation of the sampler. When tolerance is equal to zero, this is not an issue as it reproduces the regular Gibbs sampler. Otherwise, each conditional ABC step amounts to introducing an auxiliary variable represented by the simulated summary statistic. Since the distribution of this summary statistic depends on more than the parameter for which it is sufficient, in general, it should also appear in the conditional distribution of other parameters. At least from this Gibbs perspective, it thus relies on incompatible conditionals, which makes the conditions proposed in our own paper the more relevant.

an attempt at code golf

Posted in Kids, R with tags , , , , , , , , , on May 15, 2019 by xi'an

Having discovered codegolf on Stack Exchange a few weeks ago, I spotted a few interesting puzzles since then but only got the opportunity at a try over a quiet and rainy weekend (and Robin being on vacation)! The challenge was to write an R code for deciding whether or not a given integer n is congruent or not, when congruent means that it is the surface of a rectangle triangle with all three sides rational. The question included a pointer to the Birch and Swinnerton-Dyer conjecture as a mean to check congruence although the real solution was provided by Tunnell’s Theorem, which states that n is congruent if and only if the number of integer solutions to 2x²+y²+8z²=n is twice as much as the number of integer solutions to 2x²+y²+32z²=n if n is odd and  the number of integer solutions to 8x²+y²+16z²=n is twice as much as the number of integer solutions to 8x²+y²+64z²=n if n is even. Although this is only true for squared-free integers. (I actually spent more time on figuring out the exact wording of the theorem than on optimising the R code!)

My original solution

p=function(n){
 for (i in(n:2)^2)if(n%%i<1)n=n/i
 if(n%%2){d=8;f=2;g=16}else{d=2;f=1;g=8}
 A=0;b=(-n:n)^2
 for(x in d*b)for(y in x+f*b)for(z in g*b)
  A=A+(y+z==n)-2*(y+4*z==n)
 A==0}

was quite naïve, as shown by the subsequent improvements by senior players, like the final (?) version of Guiseppe:

function(n){b=(-n:n)^2
for(i in b[b>0])n=n/i^(!n%%i)
P=2^(n%%2)
o=outer
!sum(!o(y<-o(8/P*b,2*b,"+")/P-n,z<-16/P*b,"+"),-2*!o(y,4*z,"+"))}

exhibiting a load of code golf tricks, from using an anonymous function to renaming functions with a single letter, to switching from integers to booleans and back with the exclamation mark.

easy Riddler

Posted in Kids, R with tags , , , on May 10, 2019 by xi'an

The riddle of the week is rather standard probability calculus

If N points are generated at random places on the perimeter of a circle, what is the probability that you can pick a diameter such that all of those points are on only one side of the newly halved circle?

Since it is equivalent to finding the range of N Uniform variates less than ½. And since the range of N Uniform variates is distributed as a Be(N-1,2) random variate. The resulting probability, which happens to be exactly N/2^{N-1}, is decreasing exponentially, as shown below…

a perfectly normally distributed sample

Posted in R, Statistics with tags , , , , , , , , on May 9, 2019 by xi'an

When I saw this title on R-bloggers, I was wondering how “more perfect” a Normal sample could be when compared with the outcome of rnorm(n). Hence went checking the original blog on bayestestR in search of more information. Which was stating nothing more than how to generate a sample is perfectly normal by using the rnorm_perfect function. Still unsure of the meaning, I contacted one of the contributors who replied very quickly

…that’s actually a good question. I would say an empirical sample having characteristics as close as possible to a cannonic gaussian distribution.
and again leaving me hungering for more details. I thus downloaded the package bayestestR and opened the rnorm_perfect function. Which is simply the sequence of n-quantiles
stats::qnorm(seq(1/n, 1 – 1/n, length.out = n), mean, sd)
which I would definitely not call a sample as it has nothing random. And perfect?! Not really, unless one associates randomness and imperfection.