That the likelihood principle does not hold…

Coming to Section III in Chapter Seven of Error and Inference, written by Deborah Mayo, I discovered that she considers that the likelihood principle does not hold (at least as a logical consequence of the combination of the sufficiency and of the conditionality principles), and thus that Allan Birnbaum was wrong… As were the dozens of people working on the likelihood principle after him, including Jim Berger and Robert Wolpert! [Their book sells for $214 on Amazon! I hope the authors get a hefty chunk of that ripper, especially when it is available for free on Project Euclid…] I had not heard of (nor seen) this argument previously, even though it has apparently created a bit of a stir around the likelihood principle page on Wikipedia. It does not seem the result is published anywhere but in the book, and I doubt it would get past a review process in a statistics journal. [Judging from a serious conversation in Zürich this morning, I may however be wrong!]

The core of Birnbaum’s proof is relatively simple: given two experiments E¹ and E² about the same parameter θ with different sampling distributions f¹ and f², such that there exists a pair of outcomes (y¹,y²) from those experiments with proportional likelihoods, i.e., as functions of θ and for some constant c>0 free of θ,

$f^1(y^1|\theta) = c f^2(y^2|\theta),$

one considers the mixture experiment E⁰ where E¹ and E² are each chosen with probability ½, the outcome being the pair (j,x) made of the index j of the selected experiment and of its observation x. Then it is possible to build a sufficient statistic T that is equal to the data (j,x), except when j=2 and x=y², in which case T(j,x)=(1,y¹). This statistic is sufficient since the distribution of (j,x) given T(j,x) is either a Dirac mass or a distribution on {(1,y¹),(2,y²)} that only depends on c. Thus it does not depend on the parameter θ. According to the weak conditionality principle, the statistical evidence, denoted by Ev(E,z) and meaning the whole range of inferences possible on θ from the outcome z of the experiment E, should satisfy

$Ev(E^0, (j,x)) = Ev(E^j,x)$

Because the sufficiency principle states that

$Ev(E^0, (j,x)) = Ev(E^0,T(j,x))$

this leads to the likelihood principle

$Ev(E^1,y^1) = Ev(E^0,(1,y^1)) = Ev(E^0,(2,y^2)) = Ev(E^2,y^2)$
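For readers who prefer numbers, here is a minimal sketch of why T is sufficient, on a textbook pair of experiments that is my own choice of illustration and not taken from the book: E¹ is a binomial experiment with 12 trials in which 3 successes are observed, and E² is a negative binomial experiment stopped at the third success, which occurs on trial 12. Both likelihoods are proportional to θ³(1-θ)⁹, with c=4. The function names and the grid of θ values are assumptions made for this example only.

```python
from math import comb

def binom_pmf(k, n, theta):
    """P(k successes in n trials) under E1 (hypothetical example)."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def negbinom_pmf(n, r, theta):
    """P(the r-th success occurs on trial n) under E2 (hypothetical example)."""
    return comb(n - 1, r - 1) * theta**r * (1 - theta)**(n - r)

def prob_first_given_T(theta, n=12, k=3):
    """P((j,x) = (1,y1) | T = (1,y1)) in the 50/50 mixture experiment E0."""
    f1 = binom_pmf(k, n, theta)      # f1(y1 | theta)
    f2 = negbinom_pmf(n, k, theta)   # f2(y2 | theta)
    return 0.5 * f1 / (0.5 * f1 + 0.5 * f2)

thetas = [0.1, 0.25, 0.5, 0.9]

# Likelihood ratio f1/f2: should be the same constant c for every theta
ratios = [binom_pmf(3, 12, t) / negbinom_pmf(12, 3, t) for t in thetas]

# Conditional distribution of the data given T: should also be free of theta
probs = [prob_first_given_T(t) for t in thetas]

for t, r, p in zip(thetas, ratios, probs):
    print(f"theta={t:4.2f}  c={r:.4f}  P((1,y1) | T)={p:.4f}")
```

Whatever the value of θ, the ratio stays at c=4 and the conditional probability of (1,y¹) given T=(1,y¹) stays at c/(1+c)=0.8, which is the θ-free distribution on {(1,y¹),(2,y²)} mentioned above.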

(See, e.g., The Bayesian Choice, pp. 18-29.) Now, Mayo argues this is wrong because

“The inference from the outcome (E^j,y^j) computed using the sampling distribution of [the mixed experiment] E⁰ is appropriately identified with an inference from outcome y^j based on the sampling distribution of E^j, which is clearly false.” (p.310)

This sounds to me like a direct rejection of the conditionality principle, so I do not understand the point. (A formal rendering in Section 5 using the logic formalism of A’s and Not-A’s reinforces my feeling that the conditionality principle is the one criticised and misunderstood.) If Mayo’s frequentist stance leads her to take the sampling distribution into account at all times, this is fine within her framework. But I do not see how this argument contributes to invalidating Birnbaum’s proof. The following and last sentence of the argument may shed some light on the reason why Mayo considers it does:

“The sampling distribution to arrive at Ev(E⁰,(j,y^j)) would be the convex combination averaged over the two ways that y^j could have occurred. This differs from the sampling distributions of both Ev(E¹,y¹) and Ev(E²,y²).” (p.310)

Indeed, and rather obviously, the sampling distribution of the evidence Ev(E^*,z^*) will differ depending on the experiment. But this is not what is stated by the likelihood principle, which is that the inference itself should be the same for y¹ and y². Not the distribution of this inference. This confusion between inference and its assessment is reproduced in the “Explicit Counterexample” section, where p-values are computed and found to differ for various conditional versions of a mixed experiment. Again, not a reason for invalidating the likelihood principle. So, in the end, I remain fully unconvinced by this demonstration that Birnbaum was wrong. (Even though, as a bystander, I agree that frequentist inference can be built conditional on ancillary statistics.)
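To make the distinction between an inference and its frequentist assessment concrete, here is a small sketch on the classical binomial versus negative binomial pair. This is not the counterexample from the book, only a standard illustration I am assuming for the purpose: 3 successes are observed in 12 trials, the likelihood is proportional to θ³(1-θ)⁹ under both stopping rules, and the hypothesis θ=½ is tested against θ<½. The function names are hypothetical.

```python
from math import comb

def p_value_binomial(k=3, n=12, theta0=0.5):
    """P(at most k successes in n trials) when the number of trials is fixed."""
    return sum(comb(n, i) * theta0**i * (1 - theta0)**(n - i)
               for i in range(k + 1))

def p_value_negative_binomial(n=12, r=3, theta0=0.5):
    """P(at least n trials are needed to reach r successes).

    This event happens exactly when the first n-1 trials contain
    at most r-1 successes.
    """
    return sum(comb(n - 1, i) * theta0**i * (1 - theta0)**(n - 1 - i)
               for i in range(r))

p_bin = p_value_binomial()
p_nb = p_value_negative_binomial()
print(f"binomial p-value:          {p_bin:.4f}")
print(f"negative binomial p-value: {p_nb:.4f}")
```

The two p-values are 299/4096 and 67/2048, so they differ although the likelihoods are proportional. This shows that p-values depend on the sampling distribution, which is a statement about p-values and not about the logical step from sufficiency and conditionality to the likelihood principle.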

This entry was posted on October 6, 2011 at 12:11 am and is filed under Statistics, University life with tags Allan Birnbaum, ancillary statistics, book review, Deborah Mayo, Error and Inference, IMS Monographs, Jim Berger, Robert Wolpert, Sufficiency principle, The Likelihood Principle, weak conditionality principle.

29 Responses to “That the likelihood principle does not hold…”

Don’t Birnbaumize that experiment my friend*–updated reblog | Error Statistics Philosophy Says:
September 10, 2014 at 6:55 am

[…] or Christian Robert ; I cannot vouchsafe for Robert, unless he has revised his first impression in his October 6, 2011 blog (as I hope he has). For in that blog post Robert […]

On the Likelihood Principle | Blog Pra falar de coisas Says:
August 5, 2013 at 7:13 pm

[…] should define evidence, but it happens (or so It seems, see here) that the term is vague even for statisticians, which is rather […]

Carlos Cinelli Says:
January 4, 2013 at 4:37 pm

Robert,

I don’t understand the logic of this example.

You say:

“Then it is possible to build a sufficient statistic T that is equal to the data (j,x), except when j=2 and x=y², in which case T(j,x)=(1,y¹).”

How is that possible?

If j=2 and x=y² that means that you did not perform experiment 1, right?

So, if you did not perform experiment 1, how can you know the value of (1,y¹)?

Maybe I’m missing something here.

- Carlos Cinelli Says:
  January 4, 2013 at 8:15 pm
  
  Never mind.
  
  I thought that superscripts 1 and 2 in (y¹,y²) were used to discriminate any results of either E¹ or E².
  
  But reading it again I realised that (y¹,y²) refers only to the pair in which the likelihoods are proportional.
  
  - xi'an Says:
    January 5, 2013 at 5:09 pm
    
    Yes, indeed (sorry for the delay, I am not connected most of the time…)
Don’t Birnbaumize that Experiment my Friend* « Error Statistics Philosophy Says:
February 4, 2012 at 2:58 am

[…] or Christian Robert ; I cannot vouchsafe for Robert, unless he has revised his first impression in his October 6, 2011 blog (as I hope he has). For in that blog post Robert says “If Mayo’s frequentist stance leads her […]

The 3 stages of the acceptance of novel truths « Error Statistics Philosophy Says:
January 1, 2012 at 2:51 am

[…] But it is time to make good on my promise to return to concerns of those (at least in the blogosphere), who were or are still at the first stage of denial (or Schopenhauer’s second stage of violent opposition). Doing so will advance our goal of drilling deeply into some fundamental, puzzling misunderstandings of frequentist error statistical (or sampling) theory. Consider Christian Robert’s October 6, 2011 post: “Coming to Section III in Chapter Seven of Error and Inference, written by Deborah Mayo, I discovered that she considers that the likelihood principle does not hold (at least as a logical consequence of the combination of the sufficiency and of the conditionality principles), thus that Allan Birnbaum was wrong…. As well as the dozens of people working on the likelihood principle after him! …I had not heard of (nor seen) this argument previously, even though it has apparently created enough of … a stir around the likelihood principle page on Wikipedia. It does not seem the result is published anywhere but in the book, and I doubt it would get past a review process in a statistics journal.” https://xianblog.wordpress.com/2011/10/06/that-the-likelihood-principle-does-not-hold/ […]


Xi'an's Og