partial rankings and aggregate ranks

Posted in Books, Kids, R, Statistics, Travel, University life on March 22, 2023 by xi'an

When interviewing impressive applicants from a stunning variety of places and backgrounds for fellows in our Data Science for Social Good program (in Warwick and Kaiserslautern) this summer, we came across the common conundrum of comparing ranks when each of us had only met a subset of the candidates. Over a free morning, I briefly thought about the problem (while swimming) and then wrote a short R code to infer an aggregate ranking, ρ, based on a simple model, namely a Poisson distribution on the distance between an individual’s ranking and the aggregate

$d(r_i,\rho)\sim\mathcal P(\lambda)$

a uniform distribution on the missing ranks as well as on the aggregate, and a non-informative prior on λ. This leads to a three-step Gibbs sampler: completion of the missing ranks, then simulation of ρ, then simulation of λ.
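A minimal sketch of the ρ and λ steps, not the original code: it assumes complete rankings (the completion step is left out), takes the Kendall tau distance for d, and uses a flat prior on λ, none of which is specified above. The names `kendall`, `loglik` and `gibbs_rank` are made up for the illustration. Since ρ has no closed-form conditional, it is updated by a Metropolis move that swaps two entries.

```r
# Kendall tau distance: number of pairs ordered differently by r and rho
# (an assumed choice for the distance d)
kendall <- function(r, rho) {
  s <- r[order(rho)]  # ranks in r, listed in the order given by rho
  sum(outer(s, s, ">") & upper.tri(diag(length(s))))
}

# Poisson log-likelihood of the distances to the aggregate rho
loglik <- function(R, rho, lambda)
  sum(dpois(apply(R, 1, kendall, rho = rho), lambda, log = TRUE))

# R: matrix with one complete ranking per row (at least two columns)
gibbs_rank <- function(R, niter = 1000) {
  n <- nrow(R); m <- ncol(R)
  rho <- rank(colMeans(R), ties.method = "first")  # start from mean ranks
  rhos <- matrix(0, niter, m); lambdas <- numeric(niter)
  for (t in 1:niter) {
    # lambda | rho is a Gamma under a flat prior (assumption)
    d <- apply(R, 1, kendall, rho = rho)
    lambda <- rgamma(1, shape = sum(d) + 1, rate = n)
    # rho | lambda: symmetric proposal swapping two entries of rho
    prop <- rho
    ij <- sample(m, 2)
    prop[ij] <- rho[rev(ij)]
    if (log(runif(1)) < loglik(R, prop, lambda) - loglik(R, rho, lambda))
      rho <- prop
    rhos[t, ] <- rho; lambdas[t] <- lambda
  }
  list(rho = rhos, lambda = lambdas)
}
```

Handling partial rankings would add a third step that redraws the missing ranks of each interviewer given ρ and λ, for instance with the same kind of swap move restricted to the unobserved entries.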

I am aware that the problem has been tackled in many different ways, including Bayesian ones (as in Deng et al., 2014) and local ones, but this was a fun exercise. Although we did not use any model in the end!

Posted in Books, Kids, R on March 3, 2021 by xi'an

The Riddle this week is rather straightforward to explain: stacking identical objects (bars of length and mass two, say) on top of one another so that the centre of each new bar is uniformly distributed along the previous bar, what is the distribution of the number of bars when the stack collapses? If I am not confused, the stack collapses the first time the centre of gravity of an upper stack leaves the interval represented by the bar just below. Namely

$\left|\frac{1}{N-j} \sum_{i=j+1}^N x_i -x_j\right|>1$

where the x_i are the bar centres, or equivalently

$\max_{2\le j\le N-1} \left|\frac{1}{N-j} \sum_{i=j+1}^N \sum_{k=j+1}^i\epsilon_k \right|>1$

where the ε_i's are U(-1,1). Which is straightforward to code in R by looking at means of cumulated sums.
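As a rough sketch of that simulation (not the original code), the function below adds bars one at a time and stops as soon as the centre of gravity of the bars above some bar j lies more than one unit away from its centre. Two conventions are assumptions here: every supporting bar from the bottom one up is checked, and the value returned is the number of bars in place at the collapse, including the last one added. The name `stack_height` and the cap `maxn` are made up.

```r
# simulate one stack and return the number of bars in place when it collapses
stack_height <- function(maxn = 1000) {
  x <- 0  # centre of the first bar
  for (n in 2:maxn) {
    # centre of the n-th bar, uniform along the bar just below
    x <- c(x, x[n - 1] + runif(1, -1, 1))
    # means of cumulated sums taken from the top:
    # above[j] is the centre of gravity of bars j+1, ..., n
    above <- rev(cumsum(rev(x[-1])) / 1:(n - 1))
    if (any(abs(above - x[-n]) > 1)) return(n)
  }
  maxn  # no collapse within maxn bars
}

# empirical distribution of the collapse size
heights <- replicate(1e4, stack_height())
table(heights) / length(heights)
```

A stack of two bars never collapses under this rule, since the second centre is always within one unit of the first, so the simulated values start at three.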

easy and uneasy riddles

Posted in Books, Kids, R on February 2, 2021 by xi'an

On 15 January, The Riddler had both a straightforward and a challenging riddle. The first one was to optimise the choice of a real number d with the utility function U(d,θ)=d 𝕀(θ>d), when θ is Uniform(0,100). Leading unsurprisingly to d=50…
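The one-line computation behind this value: with θ uniform on (0,100), the expected utility is

$\mathbb E[U(d,\theta)] = d\,\mathbb P(\theta>d) = d\left(1-\frac{d}{100}\right),\qquad 0\le d\le 100$

a concave parabola in d with derivative 1−d/50, hence maximised at d=50, for an expected utility of 25.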

The tough(er) one was to solve a form of sudoku where the 24 entries of an 8×3 table are integers in {1,…,9} and the information is provided by the row-wise and column-wise products of these integers. The vertical margins (the eight row products) are

294, 216, 135, 98, 112, 84, 245, 40

and the horizontal margins (the three column products) are

8 890 560, 156 800, 55 566

After an unsuccessful brute-force (and pseudo-annealed) attempt achieving a minimum error of 127, although using the prime factor decompositions of these 11 margins, I realised that some entries were known: e.g., 7 at (1,2), 5 at (3,2), and 7 at (7,3), and (much later) that the (huge) product value for the first column implied that each term in that column had to be the maximal possible value for the corresponding rows, except for 5 on row 7. This leads to the starting grid

     [,1] [,2] [,3]
[1,]    7    7    6
[2,]    9    0    0
[3,]    9    5    3
[4,]    7    0    0
[5,]    8    0    0
[6,]    7    0    0
[7,]    5    7    7
[8,]    8    0    0


and an additional and obvious exclusion based on the absence of 3’s in the second column, of 5’s and 2’s in the third column shows there was a unique solution

     [,1] [,2] [,3]
[1,]    7    7    6
[2,]    9    8    3
[3,]    9    5    3
[4,]    7    2    7
[5,]    8    2    7
[6,]    7    4    3
[7,]    5    7    7
[8,]    8    5    1


as also demonstrated by a complete exploration with R:

Try it online!
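The code behind the link is not reproduced here. As an illustration of what such a complete exploration could look like (a sketch under my own choices, not the original code), the version below lists, for each row, all triples in {1,…,9}³ with the right product, then runs a depth-first search over the rows, only keeping triples that divide what remains of the three column products. The names `cands` and `solve_grid` are made up.

```r
rows <- c(294, 216, 135, 98, 112, 84, 245, 40)  # row products
cols <- c(8890560, 156800, 55566)               # column products

# all triples in {1,...,9}^3 whose product equals a given row product
trips <- as.matrix(expand.grid(1:9, 1:9, 1:9))
cands <- lapply(rows, function(p)
  trips[apply(trips, 1, prod) == p, , drop = FALSE])

# depth-first search: rest holds the column products still to be explained
solve_grid <- function(i, rest, grid) {
  if (i > length(rows))
    return(if (all(rest == 1)) list(grid) else list())
  sols <- list()
  for (k in seq_len(nrow(cands[[i]]))) {
    tr <- cands[[i]][k, ]
    if (all(rest %% tr == 0))  # each entry must divide its column remainder
      sols <- c(sols, solve_grid(i + 1, rest / tr, rbind(grid, tr)))
  }
  sols
}

sols <- solve_grid(1, cols, NULL)
length(sols)        # number of grids compatible with all 11 margins
unname(sols[[1]])   # the grid itself
```

The divisibility test prunes most branches early, so the search over the eight rows is immediate even though it is exhaustive.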

Fermat’s Riddle

Posted in Books, Kids, R on October 16, 2020 by xi'an

A Fermat-like riddle from the Riddler (with enough room to code on the margin)

An arbitrary positive integer N is to be written as a difference of the squares of two distinct positive integers. What are the impossible cases and, otherwise, can you provide a list of all distinct representations?

Since the problem amounts to finding a>b>0 such that

$N=a^2-b^2=(a-b)(a+b)$

both (a+b) and (a-b) are products of some of the prime factors in the decomposition of N and both terms must have the same parity for the average a to be an integer. This eliminates decompositions with a single prime factor 2 (and N=1, as well as N=4, whose only same-parity split 2×2 gives b=0). For other cases, the following R code (which I could not deposit on tio.run because of the R.utils package!) returns a list

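library(R.utils)
library(numbers)
# bitz: i written as a vector of m bits (int2bits)
bitz<-function(i,m)
  c(rev(as.binary(i)),rep(0,m))[1:m]
ridl=function(n){
  a=primeFactors(n)
  # no solution when N=1 or when 2 appears exactly once among the factors
  if((n==1)|(sum(a==2)==1)){
    print("impossible")}else{
    m=length(a);g=NULL
    # each bit pattern splits the prime factors into d=a-b and e=a+b
    for(i in 1:2^m){
      b=bitz(i,m)
      # keep splits with d and e of equal parity and d<e
      if(((d<-prod(a[!!b]))%%2==(e<-prod(a[!b]))%%2)&(d<e))
        g=rbind(g,c(k<-(e+d)/2,l<-(e-d)/2))}
    # drop repeated splits (same a-b) arising from repeated prime factors
    return(g[!duplicated(g[,1]-g[,2]),])}}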


For instance,

> ridl(1456)
     [,1] [,2]
[1,]  365  363
[2,]  184  180
[3,]   95   87
[4,]   59   45
[5,]   40   12
[6,]   41   15


Checking for the most prolific N, up to 10⁶, I found that N=6720=2⁶·3·5·7 produces 20 different decompositions. And that N=887,040=2⁸·3²·5·7·11 leads to 84 distinct differences of squares.
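These counts can be cross-checked without any package. The sketch below is an alternative to the code above rather than a copy of it: instead of splitting the prime factors, it scans the divisors d ≤ √N, sets e=N/d, and keeps the pairs with d<e and equal parity, returning a=(e+d)/2 and b=(e−d)/2. The name `diff_squares` is made up.

```r
# all (a, b) with a > b > 0 and a^2 - b^2 = n, by scanning factor pairs d * e = n
diff_squares <- function(n) {
  d <- seq_len(floor(sqrt(n)))
  d <- d[n %% d == 0]                    # small factor, d = a - b
  e <- n / d                             # large factor, e = a + b
  keep <- (d < e) & ((e - d) %% 2 == 0)  # b > 0 and a, b integers
  cbind(a = (e[keep] + d[keep]) / 2, b = (e[keep] - d[keep]) / 2)
}

diff_squares(1456)          # same six pairs as ridl(1456), in another order
nrow(diff_squares(6720))    # 20
nrow(diff_squares(887040))  # 84
```

An impossible N simply returns a matrix with zero rows, which covers N=1 and the values with a single factor 2.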

le compte est bon

Posted in Books, Kids, R on July 22, 2020 by xi'an

The Riddler asks how to derive 24 from (1,2,3,8), with each number appearing once and all operations (x,+,/,-,^) allowed. This reminded me of a very old TV show on French TV, called Le compte est bon!, where players were given 5 or 6 numbers and supposed to find a given total within 60 seconds. Unsurprisingly there is an online solver for this game, as shown above, e.g., 24=(8+3+1)x2. But it proves unable to solve the puzzle when the input is 24 and (2,3,3,4), only using 2,3 and 4, since 24=2x3x4. Introducing powers as well, since exponentiation is allowed, leads to two solutions, (4-2)³x3=(4/2)³x3=(3²-3)x4=3/(2/4)³=24… Not fun!

I nonetheless wrote an R code to check whether 24 was indeed reachable with such combinations, but could not find an easy way to identify which combination was used, although a pedestrian version eventually worked! And exhibited the slightly less predictable 4^(3/2)x3=24!
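One pedestrian way to keep track of the combination used, given here as a sketch and not as the code mentioned above, is to carry a character string alongside each intermediate value. The recursion picks an ordered pair of the remaining numbers, replaces it with the result of one of the five operations, and stops when a single value is left. The name `reach` and the 1e-9 tolerance are assumptions.

```r
# reach `target` from the values in vals, using each once;
# exprs holds the expression (as a string) that produced each value
reach <- function(vals, exprs, target = 24) {
  if (length(vals) == 1)
    return(if (isTRUE(abs(vals - target) < 1e-9)) exprs else character(0))
  out <- character(0)
  for (i in seq_along(vals)) for (j in seq_along(vals)) if (i != j) {
    rest <- setdiff(seq_along(vals), c(i, j))
    for (op in c("+", "-", "*", "/", "^")) {
      v <- suppressWarnings(do.call(op, list(vals[i], vals[j])))
      if (!is.finite(v)) next  # skip divisions by zero and undefined powers
      e <- paste0("(", exprs[i], op, exprs[j], ")")
      out <- c(out, reach(c(vals[rest], v), c(exprs[rest], e), target))
    }
  }
  unique(out)
}

sols <- reach(c(1, 2, 3, 8), c("1", "2", "3", "8"))
head(sols)
# the second puzzle, where all four numbers must be used
reach(c(2, 3, 3, 4), c("2", "3", "3", "4"))
```

Every expression is fully parenthesised, so the same solution shows up several times under different groupings; the output is exhaustive rather than tidy.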