**L**arry Wasserman wrote a blog entry on the normalizing constant paradox, where he repeats that he does not understand my earlier point… Let me try to recap this point here, along with the various comments I made on StackExchange *(while keeping in mind all this is for intellectual fun!)*.

**T**he entry is somewhat paradoxical in that Larry acknowledges (in that post) that the analysis in his book, *All of Statistics*, is wrong. The fact that *“g(x)/c is a valid density only for one value of c”* (and hence cannot lead to a notion of likelihood on *c*) is the very reason why I stated that there can be neither statistical inference nor a prior distribution about *c*: a sample from *f* does not bring *statistical information* about *c* and there can be no *statistical* estimate of *c* based on this sample. (In case you did not notice, I insist upon *statistical*!)

**T**o me this problem is completely different from a statistical problem, at least in the modern sense: if I need to approximate the constant *c* (as I do in fact when computing Bayes factors), I can produce an arbitrarily long sample from a certain importance distribution and derive a converging (and sometimes unbiased) approximation of *c*. Once again, this is Monte Carlo integration, a numerical technique based on the Law of Large Numbers and the stabilisation of frequencies. (Call it a *frequentist* method if you wish. I completely agree that MCMC methods are inherently *frequentist* in that sense, and I see no problem with this because they are not *statistical* methods. Of course, this may be the core of the disagreement with Larry and others: they call the Law of Large Numbers statistics, and I do not. This lack of separation between the two notions also shows up in a recent general public talk on Poincaré’s mistakes by Cédric Villani! All this may just mean I am irremediably Bayesian, seeing anything motivated by frequencies as non-statistical!) But that process does not mean that *c* can take a range of values that would index a family of densities compatible with a given sample. In this Monte Carlo integration approach, the distribution of the sample is completely under control (modulo the errors induced by pseudo-random generation). This approach is therefore outside the realm of Bayesian analysis *“that puts distributions on fixed but unknown constants”*, because those unknown constants parameterise the distribution of an observed sample. Ergo, *c* is not a parameter of the sample and the sample Larry argues about (*“we have data sampled from a distribution”*) contains no information whatsoever about *c* that is not already in the function *g*. (It is not “data” in this respect, but a stochastic sequence that can be used for approximation purposes.) Which gets me back to my first argument, namely that *c* is known (and at the same time difficult or impossible to compute)!
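
To make the importance sampling remark concrete, here is a minimal sketch in Python. Everything specific in it is an assumption chosen for illustration and not part of the original discussion: the function *g* is taken to be exp(-x²/2), so that the true constant is √(2π), and the importance distribution is a standard Cauchy. The names `g`, `cauchy_pdf`, `cauchy_draw` and `estimate_c` are hypothetical.

```python
import math
import random


def g(x):
    """Unnormalised density (illustrative choice): g(x) = exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)


def cauchy_pdf(x):
    """Density of the standard Cauchy importance distribution."""
    return 1.0 / (math.pi * (1.0 + x * x))


def cauchy_draw(rng):
    """Draw from the standard Cauchy by inverting its cdf."""
    return math.tan(math.pi * (rng.random() - 0.5))


def estimate_c(n, seed=0):
    """Importance sampling approximation of c = integral of g(x) dx.

    The simulated sample comes from the Cauchy distribution, which we
    chose and control entirely; c is not a parameter of that sample.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = cauchy_draw(rng)
        total += g(x) / cauchy_pdf(x)  # importance weight g(x) / q(x)
    return total / n


if __name__ == "__main__":
    print(estimate_c(100_000), math.sqrt(2.0 * math.pi))
```

The average of the weights converges to *c* by the Law of Large Numbers, and the simulation length `n` is entirely up to the user, which is the sense in which the sample is under control.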

**L**et me also answer here the comments *“why is this any different from estimating the speed of light c?”* and *“why can’t you do this with the 100th digit of π?”* made on the earlier post or on StackExchange. Estimating the speed of light means for me (who repeatedly flunked Physics exams after leaving high school!) that we have a physical experiment that measures the speed of light (like the original one by Rœmer at the Observatoire de Paris, which I visited last week) and that the statistical analysis draws inference about *c* by using those measurements and the impact of the imprecision of the measuring instruments (as we do when analysing astronomical data). If, now, there exists a physical formula of the kind

$$c = \int_\Xi \psi(\xi)\,\varphi(\xi)\,\text{d}\xi$$

where φ is a probability density, I can imagine stochastic approximations of *c* based on this formula, but I do not consider it a statistical problem any longer. The case is thus clearer for the 100th digit of *π*: it is also a fixed number, which I can approximate by a stochastic experiment but to which I cannot attach a statistical tag. (It is 9, by the way.) Throwing darts at random as I did during my Oz tour is not a statistical procedure, but simple Monte Carlo à la Buffon…
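
The dart-throwing remark can be sketched in a few lines of Python. This is only an illustration under stated assumptions: darts are taken to land uniformly on the unit square, so that π is written as the expectation, under the uniform density, of four times the indicator of the quarter disc. It is the darts version, not Buffon's needle experiment itself, and the name `estimate_pi` is hypothetical.

```python
import random


def estimate_pi(n_darts, seed=0):
    """Approximate pi by throwing darts uniformly at the unit square.

    The proportion of darts falling inside the quarter disc
    x^2 + y^2 <= 1 converges to pi / 4 by the Law of Large Numbers.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_darts):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n_darts


if __name__ == "__main__":
    for n in (1_000, 100_000):
        print(n, estimate_pi(n))
```

The error of such an approximation shrinks at the usual Monte Carlo rate of order 1/√n, so each additional correct digit costs about a hundred times more darts.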

**O**verall, I still do not see this as a paradox for our field (and certainly not as a critique of Bayesian analysis), because there is no reason a statistical technique should be able to address any and every numerical problem. (Once again, Persi Diaconis would almost certainly differ, as he defended a Bayesian perspective on numerical analysis in the early days of MCMC…) There may be a “Bayesian” solution to this particular problem (and that would be nice) and there may be none (and that would be OK too!), but I am not even convinced I would call this solution “Bayesian”! *(Again, let us remember this is mostly for intellectual fun!)*
