the likelihood principle (sequel)
As mentioned in my review of Paradoxes in Scientific Inference, I was a bit confused by the book's presentation of the likelihood principle, which led me to ponder for a week or so whether there was an issue with Birnbaum's proof (or, much more likely, with my understanding of it!). After reading Birnbaum's proof again, while sitting in a quiet room at ICERM for a little while, I see no reason to doubt it. (Keep reading at your own risk!)
My confusion was caused by mixing sufficiency in the sense of Birnbaum's mixed experiment with sufficiency in the sense of our ABC model choice PNAS paper, namely that sufficient statistics are not always sufficient to select the right model. The sufficient statistic in the proof reduces the observation (2,x2) from Model 2 to the observation (1,x1) from Model 1 whenever x1 produces a likelihood proportional to the likelihood of x2; the statistic is indeed sufficient, since the distribution of (2,x2) given (1,x1) does not depend on the parameter θ, as sketched below. Of course, the statistic is (most of the time) not sufficient for deciding between Model 1 and Model 2, but this model choice issue is foreign to Birnbaum's construction.
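For the record, here is a minimal sketch of that sufficiency computation (with the usual fair-coin mixture; the notation $p_j(\cdot\mid\theta)$ and $T$ is mine, not Birnbaum's). In the mixed experiment, toss a fair coin and run $E_1$ if heads, $E_2$ if tails, recording the outcome $(j,x_j)$. Suppose $p_2(x_2\mid\theta)=c\,p_1(x_1\mid\theta)$ for all $\theta$, with $c$ free of $\theta$, and let $T$ map both $(1,x_1)$ and $(2,x_2)$ to $(1,x_1)$ while leaving every other outcome unchanged. Then

$$P\big((2,x_2)\mid T=(1,x_1),\theta\big)=\frac{\tfrac{1}{2}\,p_2(x_2\mid\theta)}{\tfrac{1}{2}\,p_1(x_1\mid\theta)+\tfrac{1}{2}\,p_2(x_2\mid\theta)}=\frac{c}{1+c},$$

which is free of $\theta$: $T$ is sufficient for $\theta$ in the mixed experiment, even though it says nothing about which of the two models holds.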
December 3, 2012 at 2:08 pm
The full Bayes Theorem should read:
P(H|D,K)=P(D|H,K)P(H|K)/P(D|K)
where D = data, H = parameters, and K = all other background knowledge.
If the likelihood principle holds in any particular instance, it will just fall out of this equation; if it doesn't hold in a given example, that will fall out of this equation as well. So I don't see why a Bayesian need pay any attention to Birnbaum's proof or to any version of the likelihood principle.
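Spelling the first claim out (a one-line check in the notation above): if two data sets $D_1$ and $D_2$ yield proportional likelihoods, $P(D_1|H,K)=c\,P(D_2|H,K)$ for all $H$ with $c$ free of $H$, then

$$P(H|D_1,K)=\frac{c\,P(D_2|H,K)\,P(H|K)}{\int c\,P(D_2|H',K)\,P(H'|K)\,dH'}=P(H|D_2,K),$$

since the constant $c$ cancels between numerator and denominator.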
It seems to be an issue that frequentists get hot and bothered about, but one that Bayesians can safely ignore, in principle and in practice.
December 1, 2012 at 5:01 pm
To see what's wrong with Birnbaum's argument (as I think Birnbaum himself realized), see:
The premises, taken together, require that the evidential import of a result known to have arisen from E′ both should and should not be influenced by an unperformed experiment E″. If you are using sampling distributions in Bayesian inference, then you should be glad the Birnbaum argument is unsound. My criticism, I think, can be extended to apply to the Bayesian formulation, although I have not done so.
December 1, 2012 at 1:18 am
X:
You could do that, but, even so, the sampling distribution can depend on information that is not included in the likelihood function.
December 2, 2012 at 11:19 pm
Yes, that is the denial of the likelihood principle, and thus it is important to see why Birnbaum's attempt fails (as he realized).
November 30, 2012 at 1:54 am
X,
This seems related to our point in chapter 6 of BDA that you need to know the sampling distribution (not just the likelihood) to do a posterior predictive check. For example, the stopping rule is relevant even if it only depends on observed data.
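A quick simulation sketch of this point (mine, not the BDA example; the Beta(1,1) prior, the particular 12-trial sequence, and the two stopping rules are illustrative assumptions): the binomial design (stop after 12 trials) and the negative binomial design (stop after the 3rd failure) give proportional likelihoods, hence the same posterior for θ, yet the posterior predictive distribution of, say, the number of trials differs sharply between the two designs.

```python
# Sketch: same posterior, different posterior predictive replications
# under two stopping rules (illustrative assumptions throughout).
import numpy as np

rng = np.random.default_rng(1)
y_obs = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0])  # 9 successes, 3 failures

# Both designs give the likelihood theta^9 (1-theta)^3 up to a constant,
# so a Beta(1,1) prior yields the same Beta(10,4) posterior in both cases.
thetas = rng.beta(1 + 9, 1 + 3, size=5000)

def rep_binomial(theta):
    """Replicate under the design 'stop after n = 12 trials'."""
    return rng.binomial(1, theta, size=12)

def rep_negbinomial(theta):
    """Replicate under the design 'stop after the 3rd failure'."""
    y, failures = [], 0
    while failures < 3:
        t = int(rng.binomial(1, theta))
        y.append(t)
        failures += 1 - t
    return np.array(y)

# Posterior predictive distribution of the number of trials:
# degenerate at 12 under the first design, dispersed under the second.
n_rep_nb = np.array([len(rep_negbinomial(t)) for t in thetas])
print("binomial design: n_rep is always", rep_binomial(thetas[0]).size)
print("neg-binomial design: n_rep mean %.1f, sd %.1f"
      % (n_rep_nb.mean(), n_rep_nb.std()))
```

So a discrepancy measure based on the number of trials (or on any statistic whose reference distribution depends on the design) calibrates differently under the two stopping rules, even though the likelihood, and hence the posterior, is identical.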
November 30, 2012 at 3:43 pm
A,
The predictive check could be restricted to a replica of the sufficient statistic, no?!