## Le Monde puzzle [#838]

Posted in Books, Kids, R on November 2, 2013 by xi'an

Another one of those Le Monde mathematical puzzles whose wording is confusing to me:

The 40 members of the Academy vote for two prizes. [Like the one recently attributed to my friend and coauthor Olivier Cappé!] Once the votes are counted for both prizes, it appears that the total votes for each of the candidates take all values between 0 and 12. Is it possible that two academicians never pick the same pair of candidates?

I find it puzzling… First, because the total number of votes is then equal to 78 (the sum of the integers from 0 to 12), rather than 80 = 2 x 40. What happened to the vote of the “last” academician? Did she or he abstain? Or did two academicians each abstain on one of the two prizes? Second, because of the ambiguity in the original wording: can we assume that each integer between 0 and 12 is taken only once? If so, the total number of candidates for the two prizes is equal to 13. Third, the question seems unrelated to the “data”: since only the sums are known, switching the votes of academicians Dupond and Dupont for candidates Durand and Martin in prize A (or in prize B) does not change the number of votes for Durand and Martin.

If we assume that each integer between 0 and 12 appears only once in the collection of the sums of the votes and that one academician abstained on both prizes, the number of candidates for one of the prizes can vary between 4 and 9, with compatible solutions provided by these lines of R code:

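N=5 #number of candidates for the first prize
ok=TRUE
while (ok){
#vote totals of the N candidates for the first prize
prop=sample(0:12,N)
#the remaining totals go to the candidates for the second prize
los=(1:13)[-(prop+1)]-1
#each prize must collect 39 votes
ok=((sum(prop)!=39)||(sum(los)!=39))}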


which returns solutions like the following one (obtained with N=4)

> N=4
> prop
[1]  9 11  7 12
> los
[1]  0  1  2  3  4  5  6  8 10


but does not help in answering the question!
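
The range of 4 to 9 candidates quoted above can also be checked exhaustively rather than by random search. The following is only a sketch under the same assumptions (each total between 0 and 12 taken once, 39 votes per prize); the names `counts` and `feasible_N` are mine.

```r
# For each group size N, count the subsets of the totals 0:12 that sum to 39;
# the complementary subset then automatically sums to 78 - 39 = 39.
counts <- sapply(1:12, function(N) sum(combn(0:12, N, FUN = sum) == 39))
# group sizes for which at least one compatible split exists
feasible_N <- which(counts > 0)
feasible_N
```

The vector `feasible_N` should list the sizes 4 to 9, since three candidates collect at most 12+11+10=33 votes and the same bound applies to the complementary group.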

Now, with the help of Robin (whose Corcoran memorial prize I should have mentioned in due time!), I reformulate the question as

The 40 members of the Academy vote for two prizes. Once the votes are counted for both prizes, it appears that all values between 0 and 12 are found among the total votes for each of the candidates. Is it possible that two academicians never pick the same pair of candidates?

which has a nicer solution: since all academicians have voted, there are two extra votes (80-78), meaning that either the value 2 is taken twice or the value 1 is taken thrice. So there are either 14 or 15 candidates in toto, with at least 4 for a given prize. I then checked whether or not the above event could occur, using the following (pedestrian) R code:

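for (t in 1:10^3){
#pick number of replicae
R=sample(1:2,1); cand=13+R
#pick number of literary candidates
N=sample(4:(cand-4),1)
#vote totals of all candidates (assumed completion: value 1 thrice or value 2 twice)
if (R==2){
votz=c(0:12,1,1)
}else{
votz=c(0:12,2)
}
ok=TRUE
while (ok){
#split the candidates between both prizes until each prize gets 40 votes
drop=sample(1:cand,N)
prop=votz[drop]; los=votz[-drop]
ok=((sum(prop)!=40)||(sum(los)!=40))
}
#pool[a] is the first-prize choice of academician a
pool=NULL
for (j in 1:N)
pool=c(pool,rep(j,prop[j]))
#cool[a] is the second-prize choice of academician a
cool=NULL
for (j in 1:(cand-N))
cool=c(cool,rep(100+j,los[j]))
cool=sample(cool) #random permutation
stoq=0
for (a in 1:39){
#academicians after a who share the first-prize choice of a
same=((a+1):40)[pool[(a+1):40]==pool[a]]
if (length(same)>0){
stoq=max(cool[same]==cool[a])
if (stoq==1) break
}
}
#stop as soon as no two academicians share a pair
if (stoq==0) break
}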


which does not return a positive answer to the above question. (And does not require simulations from contingency tables with fixed margins!)
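
As a deterministic cross-check of the simulation, here is a sketch that swaps in a different technique, the Gale–Ryser criterion for the existence of a 0/1 matrix with given margins. It assumes that every academician votes for both prizes, so that distinct pairs correspond to a 0/1 table whose row sums are the totals for one prize and whose column sums are the totals for the other. The function names `gale_ryser` and `feasible_split` are mine.

```r
# Gale-Ryser criterion: TRUE if a 0/1 matrix with row sums a and column sums b exists
gale_ryser <- function(a, b) {
  if (sum(a) != sum(b)) return(FALSE)
  a <- sort(a, decreasing = TRUE)
  for (k in seq_along(a)) {
    # the k largest row sums cannot exceed what the columns can absorb
    if (sum(a[1:k]) > sum(pmin(b, k))) return(FALSE)
  }
  TRUE
}

# Try every split of the vote totals between both prizes (40 votes each)
feasible_split <- function(votz) {
  m <- length(votz)
  for (code in 1:(2^m - 2)) {
    # binary digits of code say which candidates belong to the first prize
    inA <- (code %/% 2^(0:(m - 1))) %% 2 == 1
    a <- votz[inA]; b <- votz[!inA]
    if (sum(a) == 40 && sum(b) == 40 && gale_ryser(a, b)) return(TRUE)
  }
  FALSE
}

feasible_split(c(0:12, 2))     # value 2 taken twice
feasible_split(c(0:12, 1, 1))  # value 1 taken thrice
```

Both calls should return FALSE, in line with the simulation: a candidate with 12 votes would need 12 competitors with a positive total in the other prize, which leaves too few candidates on its own side to reach 40 votes.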

## Two local recipients for the Savage award!

Posted in Statistics, University life on April 4, 2011 by xi'an

Two Paris statisticians are recipients of the Savage award this year: Julien Cornebise (PhD from Telecom-Paristech with Eric Moulines, now at UCL in Mark Girolami’s group) is nominated for the Theory award and Robin Ryder (PhD in Oxford with Geoff Nicholls, now at CREST) is nominated for the Applied methodology award. Congratulations to both (and to the other two recipients) for well-deserved rewards! (Past local recipients were Billy Amzal in 2005 and Nicolas Chopin in 2002.)

## Le Monde rank test

Posted in R, Statistics on April 5, 2010 by xi'an

In the puzzle found in Le Monde of this weekend, the mathematical object behind the silly story is defined as a pseudo-Spearman rank correlation test statistic,

$\mathfrak{M}_n = \sum_{i=1}^n |r^x_i-r^y_i|\,,$

where the difference between the ranks of the paired random variables $x_i$ and $y_i$ is taken in absolute value instead of being squared as in the Spearman rank test statistic. I don’t know whether or not this measure of distance has been studied in the statistics literature (although I’d be surprised if it had not been!). Here is a histogram of the distribution of the new statistic for $n=20$ under the null hypothesis that both samples are uncorrelated (i.e. that the sequence of ranks is a random permutation). Each point in the sample was obtained by

perm=sample(1:20)
saple[t]=sum(abs(perm[1:10]-perm[11:20]))

When regressing the mean of this statistic $\mathfrak{M}_n$ against the covariates $n$ and $n^2$, I obtain the uninspiring formula

$\mathbb{E} [\mathfrak{M}_n] \approx 0.1681 n^2 - 0.3769 n + 11.1921$

which does not translate into a nice polynomial in $n$!
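
As a side check, and assuming the statistic is simulated exactly as in the two R lines above (the first half of a random permutation of 1:n paired with its second half), each of the n/2 pairs is a uniform draw of two distinct ranks, so the mean can be obtained by brute force. In this sketch `exact_mean` is a name of my own. Under that assumption, the identity $\sum_{i,j}|i-j| = n(n^2-1)/3$ gives a mean of $n(n+1)/6$, whose leading coefficient $1/6\approx 0.1667$ is close to the fitted 0.1681.

```r
# exact mean of sum(abs(perm[1:(n/2)] - perm[(n/2+1):n])) for a uniform permutation of 1:n
exact_mean <- function(n) {
  d <- abs(outer(1:n, 1:n, "-"))     # |i - j| for all ordered pairs
  (n / 2) * sum(d) / (n * (n - 1))    # n/2 pairs, each uniform over i != j
}

set.seed(1)
sim <- replicate(10^4, {
  perm <- sample(1:20)
  sum(abs(perm[1:10] - perm[11:20]))
})
c(exact = exact_mean(20), formula = 20 * 21 / 6, simulated = mean(sim))
```

For n=20 the enumeration and the closed form both give 70, and the simulated mean should land close to that value.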

Another interesting probabilistic/combinatorial problem arises from an earlier Le Monde puzzle: given an urn with $n$ white balls and $n$ black balls that is sampled without replacement, what is the probability that there exists a sequence of length $2k$ with the same number of white and black balls for $k=1,\ldots,n$? If $k=1$ or $k=n$, the answer is obviously one (1), but for some values of $k$, it is less than one. When $n$ goes to infinity, this is somehow related to the probability that a Brownian bridge crosses the axis in-between $0$ and $1$ but I have no clue whether this helps or not! Robin Ryder solved the question for the values $n=50$ and $k=24,25$ by establishing that the probability is still one.
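
For small urns, the probability can be computed by enumerating all orderings. The sketch below assumes that a “sequence of length $2k$” means $2k$ consecutive draws; the function names `has_balanced_window` and `exact_prob` are mine, and the enumeration is only practical for small $n$.

```r
# draws: vector of +1 (white) and -1 (black); a window is balanced when it sums to 0
has_balanced_window <- function(draws, k) {
  s <- c(0, cumsum(draws))
  n2 <- length(draws)
  # sum of the window starting at position i is s[i + 2k] - s[i]
  any(s[(2 * k + 1):(n2 + 1)] - s[1:(n2 - 2 * k + 1)] == 0)
}

# exact probability, by enumerating the positions of the n white balls
exact_prob <- function(n, k) {
  pos <- combn(2 * n, n)
  hits <- apply(pos, 2, function(w) {
    draws <- rep(-1, 2 * n)
    draws[w] <- 1
    has_balanced_window(draws, k)
  })
  mean(hits)
}

exact_prob(4, 3)
```

With $n=4$ and $k=3$, only the orderings BBWWWWBB and WWBBBBWW contain no balanced window of length 6, so the probability is 68/70, an instance where it falls below one.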

P.S. The same math tribune in Le Monde coincidentally advertises a book, Le Mythe Climatique, by Benoît Rittaud, that addresses … climate change issues and the “statistical mistakes made by climatologists”. The interesting point (if any) is that Benoît Rittaud is a “mathematician not a statistician”, with a few papers in ergodic theory, but this avowed climate skeptic nonetheless criticises the use of both statistical and simulation tools in climate modeling. (“Simulation has only been around for a few dozen years, a very short span in the history of sciences. The climate debate may be an opportunity to reassess the role of simulation in the scientific process.”)