Archive for the R Category

a very quick Riddle

Posted in Books, Kids, pictures, R with tags , , , , , , on January 22, 2020 by xi'an

A very quick Riddler’s riddle last week with the question

Find the (integer) fraction with the smallest (integer) denominator strictly located between 1/2020 and 1/2019.

and the brute force resolution

for (t in (2020*2019):2021){ 
   if (a*2019<t) sol=c(a,t)}

leading to 2/4039 as the target. Note that


Le Monde puzzle [#1127]

Posted in Books, Kids, R, Statistics with tags , , , , on January 17, 2020 by xi'an

A permutation challenge as Le weekly Monde current mathematical puzzle:

When considering all games between 20 teams, of which 3 games have not yet been played, wins bring 3 points, losses 0 points, and draws 1 point (each). If the sum of all points over all teams and all games is 516, was is the largest possible number of teams with no draw in every game they played?

The run of a brute force R simulation of 187 purely random games did not produce enough acceptable tables in a reasonable time. So I instead considered that a sum of 516 over 187 games means solving 3a+2b=516 and a+b=187, leading to 142 3’s to allocate and 45 1’s. Meaning for instance this realisation of an acceptable table of game results

    sample(c(rep(1,45),rep(3,142)))* #1's and 3'
    (1-2*(runif(20*19/2-3)<.5)) #sign

Running 10⁶ random realisations of such matrices with no constraint whatsoever provided a solution with] 915,524 tables with no no-draws, 81,851 tables with 19 teams with some draws, 2592 tables with 18 with some draws and 3 tables with 17 with some draws. However, given that 10*9=90 it seems to me that the maximum number should be 10 by allocating all 90 draw points to the same 10 teams, and 143 3’s at random in the remaining games, and I reran a simulated annealing version (what else?!), reaching a maximum 6 teams with no draws. Nowhere near 10, though!

an elegant sampler

Posted in Books, Kids, R, University life with tags , , , , , , , on January 15, 2020 by xi'an

Following an X validated question on how to simulate a multinomial with fixed average, W. Huber produced a highly elegant and efficient resolution with the compact R code

tabulate(*n, s-n) %% n + 1, n) + 1

where k is the number of classes, n the number of draws, and s equal to n times the fixed average. The R function is an alternative to sample that seems faster. Breaking the outcome of*n, s-n)

as nonzero positions in an n x (k-1) matrix and adding a adding a row of n 1’s leads to a simulation of integers between 1 and k by counting the 1’s in each of the n columns, which is the meaning of the above picture. Where the colour code is added after counting the number of 1’s. Since there are s 1’s in this matrix, the sum is automatically equal to s. Since the s-n positions are chosen uniformly over the n x (k-1) locations, the outcome is uniform. The rest of the R code is a brutally efficient way to translate the idea into a function. (By comparison, I brute-forced the question by suggesting a basic Metropolis algorithm.)

Le Monde puzzle [#1120]

Posted in Books, Kids, pictures, R with tags , , , , on January 14, 2020 by xi'an

A board game as Le weekly Monde current mathematical puzzle:

11 players in a circle and 365 tokens first owned by a single player. Players with at least two tokens can either remove one token and give another one left or move two right and one left. How quickly does the game stall, how many tokens are left, and where are they?

The run of a R simulation like

  if (max(bob)>1){

always provides a solution

> bob
 [1] 1 0 1 1 0 1 1 0 1 0 0

with six ones at these locations. However the time it takes to reach this frozen configuration varies, depending on the sequence of random choices.


postdoc at Warwick on robust SMC [call]

Posted in Kids, pictures, R, Statistics, University life with tags , , , , , , , , on January 11, 2020 by xi'an

Here is a call for a research fellow at the University of Warwick to work with Adam Johansen and Théo Damoulas on the EPSRC and Lloyds Register Foundaton funded project “Robust Scalable Sequential Monte Carlo with application to Urban Air Quality”. To quote

The position will be based primarily at the Department of Statistics of the University of Warwick. The post holder will work closely in collaboration with the rest of the project team and another postdoctoral researcher to be recruited shortly to work within the Data Centric Engineering programme at the Alan Turing Institute in London. The post holder will be expected to visit the Alan Turing Institute regularly.

Candidates with strong backgrounds in the mathematical analysis of stochastic algorithms or sequential Monte Carlo methods are particularly encouraged to apply. Closing date is 19 Jan 2020.

Metropolis in 95 characters

Posted in pictures, R, Statistics, Travel with tags , , , , , , , , on January 2, 2020 by xi'an

Here is an R function that produces a Metropolis-Hastings sample for the univariate log-target f when the later is defined outside as another function. And when using a Gaussian random walk with scale one as proposal. (Inspired from a X validated question.)


The function is definitely not optimal, crashes for values of T larger than 580 (unless one modifies the stack size), and operates the most basic version of a Metropolis-Hastings algorithm. But as a codegolf challenge (on a late plane ride), this was a fun exercise.

Le Monde puzzle [#1124]

Posted in Books, Kids, R with tags , , , , , on December 29, 2019 by xi'an

A prime number challenge [or rather two!] as Le weekly Monde current mathematical puzzle:

When considering the first two integers, 1 and 2, their sum is 3, a prime number. For the first four integers, 1,2,3,4, it is again possible to sum them pairwise to obtain two prime numbers, eg 3 and 7. Up to which limit is this operation feasible? And how many primes below 30,000 can write as n^p+p^n?

The run of a brute force R simulation like


provides a solution for the first question until n=14 when it stops. A direct explanation is that the number of prime numbers grows too slowly for all sums to be prime. And the second question gets solved by direct enumeration using again the is_prime R function from the primes package:

[1] 1 1
[1] 1 2
[1] 1 4
[1] 2 3
[1] 3 4