## Feller’s shoes and Rasmus’ socks [well, Karl’s actually…]

Yesterday, Rasmus Bååth [of puppies’ fame!] posted a very nice blog using ABC to derive the posterior distribution of the total number of socks in the laundry when only pulling out orphan socks and no pair at all in the first eleven draws. Maybe not the most pressing issue for Bayesian inference in the era of Big data but still a challenge of sorts!

Rasmus set a prior on the total number m of socks, a negative Binomial Neg(15,1/3) distribution, and another prior of the proportion of socks that come by pairs, a Beta B(15,2) distribution, then simulated pseudo-data by picking eleven socks at random, and at last applied ABC (in Rubin’s 1984 sense) by waiting for the observed event, i.e. only orphans and no pair [of socks]. Brilliant!

The overall simplicity of the problem set me wondering about an alternative solution using the likelihood. Cannot be that hard, can it?! After a few computations rejected by opposing them to experimental frequencies, I put the problem on hold until I was back home and with access to my Feller volume 1, one of the few [math] books I keep at home… As I was convinced one of the exercises in Chapter II would cover this case. After checking, I found a partial solution, namely Exercice 26:

A closet contains n pairs of shoes. If 2r shoes are chosen at random (with 2r<n), what is the probability that there will be (a) no complete pair, (b) exactly one complete pair, (c) exactly two complete pairs among them?

This is not exactly a solution, but rather a problem, however it leads to the value

$p_j=\binom{n}{j}2^{2r-2j}\binom{n-j}{2r-2j}\Big/\binom{2n}{2r}$

as the probability of obtaining j pairs among those 2r shoes. Which also works for an odd number t of shoes:

$p_j=2^{t-2j}\binom{n}{j}\binom{n-j}{t-2j}\Big/\binom{2n}{t}$

as I checked against my large simulations. So I solved Exercise 26 in Feller volume 1 (!), but not Rasmus’ problem, since there are those orphan socks on top of the pairs. If one draws 11 socks out of m socks made of f orphans and g pairs, with f+2g=m, the number k of socks from the orphan group is an hypergeometric H(11,m,f) rv and the probability to observe 11 orphan socks total (either from the orphan or from the paired groups) is thus the marginal over all possible values of k:

$\sum_{k=0}^{11} \dfrac{\binom{f}{k}\binom{2g}{11-k}}{\binom{m}{11}}\times\dfrac{2^{11-k}\binom{g}{11-k}}{\binom{2g}{11-k}}$

so it could be argued that we are facing a closed-form likelihood problem. Even though it presumably took me longer to achieve this formula than for Rasmus to run his exact ABC code!

### 10 Responses to “Feller’s shoes and Rasmus’ socks [well, Karl’s actually…]”

1. […] Bååth presented a Bayesian analysis by using ABC. Later Professor Christian Robert presented exact ptobability calculations pointing out that Feller had posed a similar problem. And here is another post from Professor […]

2. Nice! This reminds me http://arxiv.org/abs/1211.6486 .

• Thanks for the reference, not a pedestrian paper, far from it!!!

• Given the current interest in garment related computation I believe the time is ripe for “The Journal of Statistical Analysis of Clothing and General Apparel.”

• Imagine this becomes a “hot” topic… Then we are running the risk it gets too smelly to be comfortable!

3. Ha! Great! Thanks again for blogging about this. And thanks to Karl for tweeting the problem. I think the problem is really neat. It involves data, it is simple to explain, but requires some thought (and informative priors) to result in reasonable inferences. The German tank problem for the 21st century :)

4. Some typos: “over all possible values of f” should be “over all possible values of k” and $\binom{25}{11-k}$ in the denominator of the last expression should be $\binom{2g}{11-k}$.

• Thanks, even 25 looks like 2g or at least 2G from far away…!

• I implement your close-form equation in R:

closeForm1 <- function(s,p,k) {
sum <- 0

# s is number of singleton socks
# p is number of sock pairs
# m is total number of socks
m = s + 2 * p

# k is the select unique socks
for (i in 0:k) {
sum <- sum + choose(s,i) * choose(2*p,k-i) / choose(m,k)
sum closeForm1(s=3,p=21,k=11)
[1] 0.009357792

which is different from the result of Śaunak Sen @ http://t.co/HhS3K45Pc5:

> norepeat2(n1=3,n2=21,k=11)
[1] 0.2275212

• Interesting question and reference, as I compared this formula of mine with Rasmus’ ABC output and obtained a very neat fit. Don’t you miss the 2k-i in your numerator? Thanks, Hong!