Archive for William Feller

Feller’s shoes and Rasmus’ socks [well, Karl’s actually…]

Posted in Books, Kids, R, Statistics, University life on October 24, 2014 by xi'an

Yesterday, Rasmus Bååth [of puppies’ fame!] posted a very nice blog post using ABC to derive the posterior distribution of the total number of socks in the laundry when the first eleven draws produce only orphan socks and no pair at all. Maybe not the most pressing issue for Bayesian inference in the era of Big Data, but still a challenge of sorts!

Rasmus set a prior on the total number m of socks, a negative binomial Neg(15,1/3) distribution, and another prior on the proportion of socks that come in pairs, a Beta B(15,2) distribution, then simulated pseudo-data by picking eleven socks at random, and finally applied ABC (in Rubin’s 1984 sense) by waiting for the observed event, i.e., only orphans and no pair [of socks]. Brilliant!
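For illustration, the scheme just described can be sketched in a few lines of R. This is a minimal sketch and not Rasmus’ original code: reading Neg(15,1/3) as the (size, prob) parametrisation of rnbinom, the rounding used to turn the proportion into a number of pairs, and the names sim_one, sims and post_m are all assumptions made here.

```r
# Sketch of the exact-ABC scheme described above (not Rasmus' original code)
set.seed(1)
n_picked <- 11     # socks pulled out, none of them matching
n_sims   <- 20000  # number of prior simulations

sim_one <- function() {
  # Priors as stated in the post; the (size, prob) reading of Neg(15,1/3)
  # is an assumption
  m    <- rnbinom(1, size = 15, prob = 1/3)  # total number of socks
  prop <- rbeta(1, 15, 2)                    # proportion of socks in pairs
  g    <- round(floor(m / 2) * prop)         # number of pairs (assumed rounding)
  f    <- m - 2 * g                          # number of orphan socks
  # Too few socks to draw eleven of them: reject
  if (m < n_picked) return(c(m = m, accept = 0))
  # Paired socks share a label, orphans get labels of their own
  socks  <- c(rep(seq_len(g), each = 2), g + seq_len(f))
  picked <- socks[sample.int(m, n_picked)]
  # Accept only if the observed event occurs: no pair among the draws
  c(m = m, accept = as.numeric(!any(duplicated(picked))))
}

sims   <- t(replicate(n_sims, sim_one()))
post_m <- sims[sims[, "accept"] == 1, "m"]  # accepted values of m
```

The accepted values in post_m are draws from the exact posterior of m under these assumptions, so that, e.g., median(post_m) or quantile(post_m, c(.025, .975)) summarise it.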

The overall simplicity of the problem set me wondering about an alternative solution using the likelihood. Cannot be that hard, can it?! After a few computations that I rejected by comparing them with experimental frequencies, I put the problem on hold until I was back home with access to my Feller volume 1, one of the few [math] books I keep at home, as I was convinced one of the exercises in Chapter II would cover this case. After checking, I found a partial solution, namely Exercise 26:

A closet contains n pairs of shoes. If 2r shoes are chosen at random (with 2r<n), what is the probability that there will be (a) no complete pair, (b) exactly one complete pair, (c) exactly two complete pairs among them?

This is not exactly a solution, but rather a problem; however, it leads to the value

p_j=\binom{n}{j}2^{2r-2j}\binom{n-j}{2r-2j}\Big/\binom{2n}{2r}

as the probability of obtaining j pairs among those 2r shoes. Which also works for an odd number t of shoes:

p_j=2^{t-2j}\binom{n}{j}\binom{n-j}{t-2j}\Big/\binom{2n}{t}

as I checked against my large simulations.

So I solved Exercise 26 in Feller volume 1 (!), but not Rasmus’ problem, since there are those orphan socks on top of the pairs. If one draws 11 socks out of m socks made of f orphans and g pairs, with f+2g=m, the number k of socks from the orphan group is a hypergeometric H(11,m,f) random variable, and the probability of observing 11 unmatched socks in total (whether from the orphan or from the paired group) is thus the marginal over all possible values of k:

\sum_{k=0}^{11} \dfrac{\binom{f}{k}\binom{2g}{11-k}}{\binom{m}{11}}\times\dfrac{2^{11-k}\binom{g}{11-k}}{\binom{2g}{11-k}}

so it could be argued that we are facing a closed-form likelihood problem. Even though it presumably took me longer to derive this formula than for Rasmus to run his exact ABC code!
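Both formulas are easy to evaluate in R, which also allows for a quick Monte Carlo check. The following is a minimal sketch: the function names pairs_prob and orphan_lik, and the values f=3 and g=9 used in the check, are hypothetical choices made for illustration. In orphan_lik the two binomial coefficients in 2g of the displayed sum cancel, which leaves a single sum divided by the number of ways of drawing t socks out of m.

```r
# Feller's formula: probability of exactly j complete pairs when t shoes
# are drawn at random from n pairs (t even or odd)
pairs_prob <- function(j, t, n) {
  2^(t - 2 * j) * choose(n, j) * choose(n - j, t - 2 * j) / choose(2 * n, t)
}

# Probability of drawing t socks without any pair, given f orphans and
# g pairs: sum over k, the number of draws coming from the orphan group
orphan_lik <- function(f, g, t = 11) {
  k <- 0:t
  sum(choose(f, k) * 2^(t - k) * choose(g, t - k)) / choose(f + 2 * g, t)
}

# Monte Carlo check for hypothetical values of f and g
set.seed(42)
f <- 3
g <- 9
socks <- c(rep(seq_len(g), each = 2), g + seq_len(f))  # pairs share a label
hits  <- replicate(1e5, !any(duplicated(sample(socks, 11))))
c(formula = orphan_lik(f, g), simulation = mean(hits))
```

Since choose() returns zero whenever the lower index is negative or exceeds the upper one, the impossible values of k drop out of the sum without any special handling.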

the anti-Bayesian moment and its passing online

Posted in Statistics, University life on March 8, 2013 by xi'an

Our rejoinder “the anti-Bayesian moment and its passing” with Andrew Gelman has now been put online on the webpage of The American Statistician. While this rejoinder is freely available, the paper that generated the discussion and this rejoinder, “‘Not Only Defended But Also Applied’: The Perceived Absurdity of Bayesian Inference”, is only available to subscribers to The American Statistician. Or through arXiv.

the anti-Bayesian moment and its passing

Posted in Books, Statistics, University life on October 30, 2012 by xi'an

Today, our reply to the discussion of our American Statistician paper “Not only defended but also applied” by Stephen Fienberg, Wes Johnson, Deborah Mayo, and Stephen Stigler was posted on arXiv. It is kind of funny that this happens the day I am visiting the Iowa State University Statistics Department, a department that was formerly a Fisherian and thus anti-Bayesian stronghold. (Not any longer, to be sure! I was also surprised to discover that, before the creation of the department, Henry Wallace came to lecture on machine calculations for statistical methods… in 1924!)

The reply to the discussion was rewritten and much broadened by Andrew after I drafted a more classical point-by-point reply to our four discussants, much to its improvement. For one thing, it reads well on its own, as the discussions are not yet available online. For another, it broadens the scope of the discussion, which suits the readership of The American Statistician well. (Some of my draft reply is recycled in this post.)

not only defended but also applied [to appear]

Posted in Books, Statistics, University life on June 12, 2012 by xi'an

Our paper with Andrew Gelman, “Not only defended but also applied”: the perceived absurdity of Bayesian inference, has been reviewed for the second time and is to appear in The American Statistician, as a discussion paper. Terrific news! This is my first discussion paper in The American Statistician (and the second in total, the first one being the re-read of Jeffreys’ Theory of Probability). [The updated version is now on arXiv.]

not only defended but also applied (rev’d)

Posted in Statistics on April 16, 2012 by xi'an

Following a very positive and encouraging review by The American Statistician of our paper with Andrew Gelman on Feller’s misrepresentation of Bayesian statistics in the otherwise superb Introduction to Probability Theory, we have submitted a revised version, now posted on arXiv. Hopefully, we will be able to publish this historic-philosophical note in The American Statistician, and maybe even get a discussion paper on the issue of misconceptions about Bayesian analysis.

the birthday problem [X’idated]

Posted in R, Statistics, University life on February 1, 2012 by xi'an

The birthday problem (i.e. looking at the distribution of the birthdates in a group of n persons, assuming [wrongly] a uniform distribution of the calendar dates of those birthdates) is always a source of puzzlement [for me]! For instance, here is a recent post on Cross Validated:

I have 360 friends on facebook, and, as expected, the distribution of their birthdays is not uniform at all. I have one day with that has 9 friends with the same birthday. So, given that some days are more likely for a birthday, I’m assuming the number of 23 is an upperbound.

The figure 9 sounded unlikely, so I ran the following computation:

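extreme=rep(0,360)
for (t in 1:10^5){
  # sorted birthdates of 360 persons over 365 days
  x=sort(sample(1:365,360,replace=TRUE))
  # gaps between first occurrences are the group sizes; 361 closes the last group
  i=max(diff(c((1:360)[!duplicated(x)],361)))
  extreme[i]=extreme[i]+1
  }
extreme=extreme/10^5
barplot(extreme,xlim=c(0,30),names.arg=1:360)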

whose output is shown in the graph above. (Actually, I must confess I first forgot the sort in the code, which led me to then believe that 9 was one of the most likely values and to post it on Cross Validated! The error was eventually picked up by one administrator. I should know better than to trust my own R code!) According to this simulation, observing 9 or more people having the same birthdate has an approximate probability of 0.00032… Indeed, fairly unlikely!

Incidentally, this question led me to uncover how to print the above code on this webpage, and to learn from the X’idated moderator whuber the use of tabulate, which avoids the above loop:

> system.time(test(10^5)) #my code above
user  system elapsed
26.230   0.028  26.411
> system.time(table(replicate(10^5, max(tabulate(sample(1:365,360,rep=TRUE))))))
user  system elapsed
5.708   0.044   5.762
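For illustration, the tabulate version can be wrapped into a small function to estimate the probability of a given maximum. This is only a sketch: the name max_shared, the seed and the reduced number of replications are choices made here, so the resulting tail estimate is much noisier than with 10^5 replications.

```r
# Largest number of persons sharing a birthdate in one simulated group
max_shared <- function(n_people = 360, n_days = 365) {
  # tabulate() counts the birthdates falling on each day of the year
  max(tabulate(sample.int(n_days, n_people, replace = TRUE), nbins = n_days))
}

set.seed(2012)
sims <- replicate(2e4, max_shared())
freq <- table(sims) / length(sims)  # estimated distribution of the maximum
freq
mean(sims >= 9)                     # estimated probability of 9 or more
```

Passing nbins to tabulate keeps the count vector at 365 entries even when the last days of the year are not drawn, which does not change the maximum but makes the counts comparable across simulations.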

[weak] information paradox

Posted in pictures, Running, Statistics, University life on December 2, 2011 by xi'an

While (still!) looking at questions on Cross Validated on Saturday morning, just before going out for a chilly run in the park, I noticed an interesting question about a light bulb problem. Once you get the story out of the way, it boils down to the fact that, when comparing two binomial probabilities, p1 and p2, based on a Bernoulli sample of size n, and when selecting the MAP probability, having either n=2k-1 or n=2k observations leads to the same (frequentist) probability of making the right choice. The details are provided in my answers here and there. It is a rather simple combinatorial proof, once you have the starting identity [W. Feller, An Introduction to Probability Theory and Its Applications, vol. 1, 1968, [II.8], eqn (8.6)]

{2k-1 \choose i-1} + {2k-1 \choose i} = {2k \choose i}

but I wonder if there exists a more statistical explanation to this weak information paradox…
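As a numerical check of the identity, and of the equal-probability claim in one simple setting, here is a short R sketch. The setting is an assumption made for illustration and may differ from the original light bulb question: the two candidate values are p and 1-p with equal prior weights, so that the MAP choice is a majority vote, and ties (only possible for even n) are broken by a fair coin. The name p_correct is hypothetical.

```r
# Pascal's rule, as in Feller's eqn (8.6), checked for k = 6
k <- 6
i <- 1:(2 * k - 1)
all.equal(choose(2 * k - 1, i - 1) + choose(2 * k - 1, i), choose(2 * k, i))

# Probability of choosing the true value p against 1 - p from n Bernoulli
# trials (assumed setup: equal prior weights, ties split by a fair coin)
p_correct <- function(n, p) {
  x <- 0:n
  # weight 1 when the majority points to p, 1/2 on a tie, 0 otherwise
  w <- ifelse(2 * x > n, 1, ifelse(2 * x == n, 0.5, 0))
  sum(w * dbinom(x, n, p))
}

# Odd and even sample sizes side by side, for a true value p = 0.7
sapply(1:4, function(k) c(odd  = p_correct(2 * k - 1, 0.7),
                          even = p_correct(2 * k, 0.7)))
```

In this symmetric setting, the two rows of the output agree for every k, meaning the extra observation in an even-sized sample brings no gain.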