## why is the likelihood not a pdf?

The return of an old debate on X validated. Can the likelihood be a pdf?! Even though there exist cases where a [version of the] likelihood function shows such a symmetry between the sufficient statistic and the parameter, as e.g. in the Normal mean model, that they are somewhat exchangeable w.r.t. the same measure, the question is somewhat meaningless for a number of reasons that we can all link to Ronald Fisher:

1. when defining the likelihood function, Fisher (in his 1912 undergraduate memoir!) warns against integrating it w.r.t. the parameter: “the integration with respect to m is illegitimate and has no definite meaning with respect to inverse probability”. The likelihood is “is a relative probability only, suitable to compare point with point, but incapable of being interpreted as a probability distribution over a region, or of giving any estimate of absolute probability.” And again in 1922: “[the likelihood] is not a differential element, and is incapable of being integrated: it is assigned to a particular point of the range of variation, not to a particular element of it”.
2. He introduced the term “likelihood” especially to avoid the confusion: “I perceive that the word probability is wrongly used in such a connection: probability is a ratio of frequencies, and about the frequencies of such values we can know nothing whatever (…) I suggest that we may speak without confusion of the likelihood of one value of p being thrice the likelihood of another (…) likelihood is not here used loosely as a synonym of probability, but simply to express the relative frequencies with which such values of the hypothetical quantity p would in fact yield the observed sample”.
3. Another point he makes repeatedly (both in 1912 and 1922) is the lack of invariance of the probability measure obtained by attaching a dθ to the likelihood function L(θ) and normalising it into a density: while the likelihood “is entirely unchanged by any [one-to-one] transformation”, this definition of a probability distribution is not. Fisher actually distanced himself from a Bayesian “uniform prior” throughout the 1920’s.

which sums up as the urge to never neglect the dominating measure!

This site uses Akismet to reduce spam. Learn how your comment data is processed.