## Rasmus’ socks fit perfectly!

Posted in Books, Kids, R, Statistics, University life with tags , , , , on November 10, 2014 by xi'an

Following the previous post on Rasmus’ socks, I took the opportunity of a survey on ABC I am currently completing to compare the outcome of his R code with my analytical derivation. After one quick correction [by Rasmus] of a wrong representation of the Negative Binomial mean-variance parametrisation [by me], I achieved this nice fit…

## Feller’s shoes and Rasmus’ socks [well, Karl’s actually…]

Posted in Books, Kids, R, Statistics, University life with tags , , , , on October 24, 2014 by xi'an

Yesterday, Rasmus Bååth [of puppies’ fame!] posted a very nice blog using ABC to derive the posterior distribution of the total number of socks in the laundry when only pulling out orphan socks and no pair at all in the first eleven draws. Maybe not the most pressing issue for Bayesian inference in the era of Big data but still a challenge of sorts!

Rasmus set a prior on the total number m of socks, a negative Binomial Neg(15,1/3) distribution, and another prior of the proportion of socks that come by pairs, a Beta B(15,2) distribution, then simulated pseudo-data by picking eleven socks at random, and at last applied ABC (in Rubin’s 1984 sense) by waiting for the observed event, i.e. only orphans and no pair [of socks]. Brilliant!

The overall simplicity of the problem set me wondering about an alternative solution using the likelihood. Cannot be that hard, can it?! After a few computations rejected by opposing them to experimental frequencies, I put the problem on hold until I was back home and with access to my Feller volume 1, one of the few [math] books I keep at home… As I was convinced one of the exercises in Chapter II would cover this case. After checking, I found a partial solution, namely Exercice 26:

A closet contains n pairs of shoes. If 2r shoes are chosen at random (with 2r<n), what is the probability that there will be (a) no complete pair, (b) exactly one complete pair, (c) exactly two complete pairs among them?

This is not exactly a solution, but rather a problem, however it leads to the value

$p_j=\binom{n}{j}2^{2r-2j}\binom{n-j}{2r-2j}\Big/\binom{2n}{2r}$

as the probability of obtaining j pairs among those 2r shoes. Which also works for an odd number t of shoes:

$p_j=2^{t-2j}\binom{n}{j}\binom{n-j}{t-2j}\Big/\binom{2n}{t}$

as I checked against my large simulations. So I solved Exercise 26 in Feller volume 1 (!), but not Rasmus’ problem, since there are those orphan socks on top of the pairs. If one draws 11 socks out of m socks made of f orphans and g pairs, with f+2g=m, the number k of socks from the orphan group is an hypergeometric H(11,m,f) rv and the probability to observe 11 orphan socks total (either from the orphan or from the paired groups) is thus the marginal over all possible values of k:

$\sum_{k=0}^{11} \dfrac{\binom{f}{k}\binom{2g}{11-k}}{\binom{m}{11}}\times\dfrac{2^{11-k}\binom{g}{11-k}}{\binom{2g}{11-k}}$

so it could be argued that we are facing a closed-form likelihood problem. Even though it presumably took me longer to achieve this formula than for Rasmus to run his exact ABC code!

## The Unimaginable Mathematics of Borges’ Library of Babel [book review]

Posted in Statistics, University life, Books, Travel with tags , , , , , , , , , , on September 30, 2014 by xi'an

This is a book I carried away from JSM in Boston as the Oxford University Press representative kindly provided my with a copy at the end of the meeting. After I asked for it, as I was quite excited to see a book linking Jorge Luis Borges’ great Library of Babel short story with mathematical concepts. Even though many other short stories by Borges have a mathematical flavour and are bound to fascinate mathematicians, the Library of Babel is particularly prone to mathemati-sation as it deals with the notions of infinite, periodicity, permutation, randomness… As it happens, William Goldbloom Bloch [a patronym that would surely have inspired Borges!], professor of mathematics at Wheaton College, Mass., published the unimaginable mathematics of Borges’ Library of Babel in 2008, so this is not a recent publication. But I had managed to miss through the several conferences where I stopped at OUP exhibit booth. (Interestingly William Bloch has also published a mathematical paper on Neil Stephenson’s Cryptonomicon.)

Now, what is unimaginable in the maths behind Borges’ great Library of Babel??? The obvious line of entry to the mathematical aspects of the book is combinatorics: how many different books are there in total? [Ans. 10¹⁸³⁴⁰⁹⁷…] how many hexagons are needed to shelf that many books? [Ans. 10⁶⁸¹⁵³¹…] how long would it take to visit all those hexagons? how many librarians are needed for a Library containing all volumes once and only once? how many different libraries are there [Ans. 1010⁶…] Then the book embarks upon some cohomology, Cavalieri’s infinitesimals (mentioned by Borges in a footnote), Zeno’s paradox, topology (with Klein’s bottle), graph theory (and the important question as to whether or not each hexagon has one or two stairs), information theory, Turing’s machine. The concluding chapters are comments about other mathematical analysis of Borges’ Grand Œuvre and a discussion on how much maths Borges knew.

So a nice escapade through some mathematical landscapes with more or less connection with the original masterpiece. I am not convinced it brings any further dimension or insight about it, or even that one should try to dissect it that way, because it kills the poetry in the story, especially the play around the notion(s) of infinite. The fact that the short story is incomplete [and short on details] makes its beauty: if one starts wondering at the possibility of the Library or at the daily life of the librarians [like, what do they eat? why are they there? where are the readers? what happens when they die? &tc.] the intrusion of realism closes the enchantment! Nonetheless, the unimaginable mathematics of Borges’ Library of Babel provides a pleasant entry into some mathematical concepts and as such may initiate a layperson not too shy of maths formulas to the beauty of mathematics.

## Le Monde puzzle [#879]

Posted in Kids, Books with tags , , on September 21, 2014 by xi'an

Here is the last week puzzle posted in Le Monde:

Given an alphabet with 26 symbols, is it possible to create 27 different three-symbol words such that

1. all symbols within a word are different
2. all triplets of symbols are different
3. there is no pair of words with a single common symbol

Since there are

28x27x26/3×2=2925

such three-symbol words, it could be feasible to write an R code that builds the 27-uplet, assuming it exists. However, by breaking those words into primary words [that share no common symbols] and secondary words [that share two symbols with one primary word], it seems to me that there can be a maximum of 26 words under those three rules…

## Le Monde puzzle [#853]

Posted in Books, Kids, Statistics with tags , , , on February 13, 2014 by xi'an

Yet another one of those Le Monde mathematical puzzles which wording is confusing (or at least annoying) to me:

A political party has 11 commissions, to which belong some of the 13 members of the central committee. A token is given to each member for each commission whom he or she belongs. Two different members cannot share more than one common commission. How many tokens at most? Same question if the president belongs to five commissions.

I just dislike the “story” around the combinatoric problem. Given 13 sets and 11 letters, how many letters can one allocate to each set so that each pair of sets share at most one letter? While waiting for my prosthesis this afternoon, I thought of a purely random search, using the following R code:

n=11
m=13
best=0
for (t in 1:10^3){
token=matrix(-1,ncol=n,nrow=m);diag(token)=1
#fill at random: -1 unchecked, 1 allocated, 0 prohibited
while (sum(token<0)>0){
res=TRUE
while (res){
if (sum(token<0)==1){
dex=(1:(n*m))[token[(1:(n*m))]<0]}else{
dex=sample((1:(n*m))[token[(1:(n*m))]<0],1)}
i=dex%%m+m*(dex%%m==0)
j=trunc(dex/(1.01*m))+1
res=FALSE
for (k in (1:m)[token[,j]==1])
res=res||(max(token[i,-j]+token[k,-j])==2)
token[dex]=0
if (sum(token<0)==0) res=FALSE
}
token[dex]=1*(sum(token<0)>0)
for (k in (1:m)[-i])
if (max(token[i,-j]+token[k,-j])==2) token[k,j]=0
image(1:13,1:11,token)
}
if (sum(token[token==1])>best) bestoken=token
best=max(best,sum(token))
}


which led to the value of 43 after that many iterations. Longer runs on faster machines at Paris-Dauphine did not change the output but, as usual with brute force simulation, the true solution may be such an extreme that it is extremely unlikely to happen… I then tried a standard simulated annealing exploration, but could not find an higher value. Except once, leading to the value 44. Here is the corresponding allocation of letters (committees) to sets (individuals) for that solution.In this Feb. 5, 2014, issue of Le Monde Science&Médecine leaflet, a review of (my Warwick colleague) Ian Stewart’s 17 equations that changed the World, which must have been recently translated in French (with no criticism of the less compelling chapter on Black-Scholes, and a confusion of the “bell curve” with statistics). Yet another tribune of Marco Zito about the generalisation of Richard Feynman’s diagrams to a solid called the Amplituhedron. (Not making much sense as exposed!)

## Le Monde puzzle [#822]

Posted in Books, Kids, R with tags , , , , , , on June 10, 2013 by xi'an

For once Le Monde math puzzle is much more easily solved on a piece of paper than in R, even in a plane from Roma:

Given a partition of the set {1,…,N} in k groups, one considers the collection of all subsets of  the set {1,…,N} containing at least one element from each group. Show that the size of the collection cannot be 50.

Obviously, one could consider a range of possible N’s and k’s and run a program evaluating the sizes of the corresponding collections. However, if the k groups are of size n1,…,nk, the number of subsets satisfying the condition is

$(2^{n_1}-1)\times \ldots \times (2^{n_k}-1)$

and it is easily shown by induction that this number is necessarily odd, hence the impossible 50.

## Le Monde puzzle [#783]

Posted in R, Travel, University life with tags , , , on July 21, 2012 by xi'an

In a political party, there are as many cells as there are members and each member belongs to at least one cell. Each cell has five members and an arbitrary pair of cells only shares one member. How many members are there in this political party?

Back to the mathematical puzzles of Le Monde (science leaflet of the weekend edition)! In addition to a tribune by Cédric Villani celebrating the 100th anniversary of the death of Henri Poincaré, this issue [now of last week] offers this interesting challenge. So much interesting that I could only solve it for three (instead of five) members and could not see the mathematical notion behind the puzzle…

Let us denote by n the number of both the cells and the number of members. Then, when picking an arbitrary order on the sets, if ij denotes the number of members in set j already seen in sets with lower indices, we have the following equality on the total number of members

$n = 5n -i_2-\cdots-i_n$

and the constraints are that i2<2, i3<3, i4<4, i5<5, and ij<6, for j>5. Hence, i2+i3+i4+i5+…+in≤5n-15, which implies n≥15.

Now, in terms of analytics, I could not go much further and thus turned to an R code to see if I could find a solution by brute force. Here is my code (where the argument a is the number of elements in each set): Continue reading