## Feller’s shoes and Rasmus’ socks [well, Karl’s actually…]

**Y**esterday, Rasmus Bååth [of puppies’ fame!] posted a very nice blog using ABC to derive the posterior distribution of the total number of socks in the laundry when only pulling out orphan socks and no pair at all in the first eleven draws. Maybe not the most pressing issue for Bayesian inference in the era of Big data but still a challenge of sorts!

Rasmus set a prior on the total number m of socks, a negative Binomial Neg(15,1/3) distribution, and another prior of the proportion of socks that come by pairs, a Beta B(15,2) distribution, then simulated pseudo-data by picking eleven socks at random, and at last applied ABC (in Rubin’s 1984 sense) by waiting for the observed event, i.e. only orphans and no pair [of socks]. Brilliant!

The overall simplicity of the problem set me wondering about an alternative solution using the likelihood. Cannot be that hard, can it?! After a few computations rejected by opposing them to experimental frequencies, I put the problem on hold until I was back home and with access to my Feller volume 1, one of the few [math] books I keep at home… As I was convinced one of the exercises in Chapter II would cover this case. After checking, I found a partial solution, namely Exercice 26:

A closet contains n pairs of shoes. If 2r shoes are chosen at random (with 2r<n), what is the probability that there will be (a) no complete pair, (b) exactly one complete pair, (c) exactly two complete pairs among them?

This is not exactly a solution, but rather a problem, however it leads to the value

as the probability of obtaining j pairs among those 2r shoes. Which also works for an odd number t of shoes:

as I checked against my large simulations. So I solved Exercise 26 in Feller volume 1 (!), but not Rasmus’ problem, since there are those orphan socks on top of the pairs. If one draws 11 socks out of m socks made of f orphans and g pairs, with f+2g=m, the number k of socks from the orphan group is an hypergeometric H(11,m,f) rv and the probability to observe 11 orphan socks total (either from the orphan or from the paired groups) is thus the marginal over all possible values of k:

so it could be argued that we are facing a closed-form likelihood problem. Even though it presumably took me longer to achieve this formula than for Rasmus to run his exact ABC code!

November 18, 2014 at 5:35 pm

[…] Bååth presented a Bayesian analysis by using ABC. Later Professor Christian Robert presented exact ptobability calculations pointing out that Feller had posed a similar problem. And here is another post from Professor […]

October 27, 2014 at 1:19 pm

Nice! This reminds me http://arxiv.org/abs/1211.6486 .

October 27, 2014 at 1:33 pm

Thanks for the reference, not a pedestrian paper, far from it!!!

October 27, 2014 at 1:41 pm

Given the current interest in garment related computation I believe the time is ripe for “The Journal of Statistical Analysis of Clothing and General Apparel.”

October 27, 2014 at 1:53 pm

Imagine this becomes a “hot” topic… Then we are running the risk it gets too smelly to be comfortable!

October 24, 2014 at 4:30 pm

Ha! Great! Thanks again for blogging about this. And thanks to Karl for tweeting the problem. I think the problem is really neat. It involves data, it is simple to explain, but requires some thought (and informative priors) to result in reasonable inferences. The German tank problem for the 21st century :)

October 24, 2014 at 2:56 pm

Some typos: “over all possible values of f” should be “over all possible values of k” and $\binom{25}{11-k}$ in the denominator of the last expression should be $\binom{2g}{11-k}$.

October 24, 2014 at 3:24 pm

Thanks, even 25 looks like 2g or at least 2G from far away…!

November 5, 2014 at 10:03 pm

I implement your close-form equation in R:

closeForm1 <- function(s,p,k) {

sum <- 0

# s is number of singleton socks

# p is number of sock pairs

# m is total number of socks

m = s + 2 * p

# k is the select unique socks

for (i in 0:k) {

sum <- sum + choose(s,i) * choose(2*p,k-i) / choose(m,k)

sum closeForm1(s=3,p=21,k=11)

[1] 0.009357792

which is different from the result of Śaunak Sen @ http://t.co/HhS3K45Pc5:

> norepeat2(n1=3,n2=21,k=11)

[1] 0.2275212

November 5, 2014 at 10:15 pm

Interesting question and reference, as I compared this formula of mine with Rasmus’ ABC output and obtained a very neat fit. Don’t you miss the 2

^{k-i}in your numerator? Thanks, Hong!