Another harmonic mean approximation
Martin Weinberg posted on arXiv a revision of his paper, Computing the Bayes Factor from a Markov chain Monte Carlo Simulation of the Posterior Distribution, which is submitted to Bayesian Analysis. I have already mentioned this paper in a previous post, but I remain unconvinced of the appeal of the paper's method, given that it recovers the harmonic mean approximation to the marginal likelihood… The method is very close to John Skilling’s nested sampling, except that the simulation is run from the posterior rather than from the prior, hence the averaging of the inverse likelihoods and hence the harmonic mean connection. The difficulty with the original (Michael Newton and Adrian Raftery’s) harmonic mean estimator is attributed to “a few outlying terms with abnormally small values of” the likelihood, while, as clearly spelled out by Radford Neal, the poor behaviour of the harmonic mean estimator is nothing abnormal and is, on the contrary, easily explained.
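To make the harmonic mean connection concrete, here is a minimal sketch on a toy model of my own choosing (not an example from the paper): a single observation x ~ N(θ, 1) with prior θ ~ N(0, τ²), for which the evidence is available in closed form. The estimator relies on the identity 1/Z = E[1/L(θ)], the expectation being taken under the posterior. In this toy model the second moment of 1/L under the posterior is finite only when τ² < 1, so the estimator has infinite variance as soon as the prior is more diffuse than the likelihood, with no "abnormal" term involved. All function names and the values of x and τ² are illustrative assumptions.

```python
import numpy as np

def log_mean_exp(a):
    """Numerically stable log of the average of exp(a)."""
    m = np.max(a)
    return m + np.log(np.mean(np.exp(a - m)))

def log_likelihood(theta, x):
    """Log density of x ~ N(theta, 1)."""
    return -0.5 * np.log(2 * np.pi) - 0.5 * (x - theta) ** 2

def exact_log_evidence(x, tau2):
    """Closed form for this toy model: marginally, x ~ N(0, 1 + tau2)."""
    v = 1.0 + tau2
    return -0.5 * np.log(2 * np.pi * v) - 0.5 * x ** 2 / v

def posterior_draws(x, tau2, n, rng):
    """Exact draws from the conjugate posterior (stands in for MCMC output)."""
    shrink = tau2 / (1.0 + tau2)
    return rng.normal(shrink * x, np.sqrt(shrink), size=n)

def harmonic_mean_log_evidence(x, tau2, n, rng):
    """Harmonic mean estimate of log Z: minus the log of the average of 1/L."""
    theta = posterior_draws(x, tau2, n, rng)
    return -log_mean_exp(-log_likelihood(theta, x))

rng = np.random.default_rng(0)
x_obs = 1.0  # illustrative observation
for tau2 in (0.25, 100.0):  # tight prior (finite variance) vs vague prior
    est = harmonic_mean_log_evidence(x_obs, tau2, 100_000, rng)
    print(f"tau2={tau2}: harmonic mean {est:.3f}, exact {exact_log_evidence(x_obs, tau2):.3f}")
```

Rerunning with different seeds under the vague prior should show how unstable the estimate is compared with the tight-prior case, where the variance is finite.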
I must admit I found the paper difficult to read, partly because of the use of poor and ever-changing notation and partly because of the lack of mathematical rigour (see, e.g., eqn (11)). (And maybe also because of the current heat wave.) In addition to the switch from prior to posterior in the representation of the evidence, a novel perspective in the paper seems to be an extension of the standard harmonic mean identity, one that relates to the general expression of Gelfand and Dey (1994, Journal of the Royal Statistical Society B) when an indicator function is used as the instrumental function. There is therefore a connection with our proposal (made with Jean-Michel Marin) of considering an HPD region to exclude the tails of the likelihood, even though the set of integration is defined as “eliminating the divergent samples with “. This is essentially the numerical Lebesgue algorithm advanced as one of two innovative algorithms by Martin Weinberg. I wonder how closely related the second (volume tessellation) algorithm is to Huber and Schott’s TPA algorithm, in the sense that TPA also requires a “smaller” integral….
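As an illustration of the Gelfand and Dey identity with an indicator as instrumental function, here is a hedged sketch on a toy model of my own (x ~ N(θ, 1), θ ~ N(0, τ²), not taken from the paper). The identity states that 1/Z = E[φ(θ) / (π(θ)L(θ))] under the posterior, for any density φ; taking φ uniform over an approximate HPD region A gives an average of 1/(vol(A) π(θ)L(θ)) restricted to the draws falling in A, which is bounded. This is neither Weinberg's implementation nor our exact construction: the 50% level, the use of an interval as the one-dimensional HPD region, and all function names are assumptions made for the example.

```python
import numpy as np

def log_prior(theta, tau2):
    """Log density of theta ~ N(0, tau2)."""
    return -0.5 * np.log(2 * np.pi * tau2) - 0.5 * theta ** 2 / tau2

def log_likelihood(theta, x):
    """Log density of x ~ N(theta, 1)."""
    return -0.5 * np.log(2 * np.pi) - 0.5 * (x - theta) ** 2

def exact_log_evidence(x, tau2):
    """Closed form for this toy model: marginally, x ~ N(0, 1 + tau2)."""
    v = 1.0 + tau2
    return -0.5 * np.log(2 * np.pi * v) - 0.5 * x ** 2 / v

def hpd_log_evidence(theta, x, tau2, alpha=0.5):
    """Gelfand-Dey estimate of log Z with a uniform density on an HPD interval.

    theta holds posterior draws; the region is approximated by the range of
    the alpha fraction of draws with the highest unnormalised posterior.
    """
    log_post = log_prior(theta, tau2) + log_likelihood(theta, x)
    cutoff = np.quantile(log_post, 1.0 - alpha)
    inside = log_post >= cutoff
    # In one dimension and for a unimodal posterior, the region is an interval.
    volume = theta[inside].max() - theta[inside].min()
    # log of (1/N) * sum over draws in A of 1 / (volume * prior * likelihood)
    neg = -log_post[inside]
    m = neg.max()
    log_inv_z = m + np.log(np.sum(np.exp(neg - m))) - np.log(theta.size) - np.log(volume)
    return -log_inv_z

rng = np.random.default_rng(0)
x_obs, tau2 = 1.0, 100.0  # vague prior, where the plain harmonic mean fails
shrink = tau2 / (1.0 + tau2)
draws = rng.normal(shrink * x_obs, np.sqrt(shrink), size=100_000)  # exact posterior draws
print(f"HPD-truncated: {hpd_log_evidence(draws, x_obs, tau2):.3f}, "
      f"exact: {exact_log_evidence(x_obs, tau2):.3f}")
```

Since the integrand is bounded over the HPD region, the variance is finite whatever the prior scale, which is the whole point of excluding the tails. In higher dimensions the volume of the region is the hard part, and the interval trick used here no longer applies.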