## estimating the marginal likelihood (or an information criterion)

Tory Imai (from Kyoto University) arXived a paper last summer on what first looked like a novel approximation of the marginal likelihood, based on the variance of thermodynamic integration. The starting argument is that there exists a power 0<t⁰<1 such that the expectation of the log-likelihood, with the likelihood in the posterior raised to the power t⁰ (the t⁰-powered likelihood), is equal to the standard log-marginal

$\log m(x) = \mathbb{E}^{t^0}[ \log f(x|\theta) ]$

where the expectation is taken under the posterior corresponding to the t⁰-powered likelihood (rather than the full likelihood), by an application of the mean value theorem to the thermodynamic integral of this expectation over the powers between 0 and 1. Watanabe’s (2013) WBIC replaces the optimum t⁰ with 1/log(n), n being the sample size. The issue in terms of computational statistics is of course that the error of WBIC (against the true log m(x)) is only characterised asymptotically, as an order in n.
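To make the existence of t⁰ concrete, here is a minimal sketch on a toy model of my own choosing (not taken from the paper): a Normal sample with unit variance and a N(0, τ²) prior on the mean, for which both the power posterior and log m(x) are available in closed form. All function names, the seed, and the values of n and τ² are assumptions made for illustration, and the WBIC line follows the sign convention of this post (log-likelihood rather than Watanabe's negative log-likelihood).

```python
import math
import random


def log_marginal(x, tau2):
    """Exact log m(x) for x_i ~ N(theta, 1) with prior theta ~ N(0, tau2)."""
    n = len(x)
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    quad = ss + n * xbar ** 2 / (1.0 + n * tau2)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1.0 + n * tau2)
            - 0.5 * quad)


def expected_loglik(t, x, tau2):
    """E^t[log f(x|theta)] under the posterior proportional to prior * likelihood^t."""
    n = len(x)
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    prec = 1.0 / tau2 + t * n          # precision of the power posterior
    mean = t * n * xbar / prec         # mean of the power posterior
    var = 1.0 / prec
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * (ss + n * ((xbar - mean) ** 2 + var)))


def simpson(g, a, b, k=4000):
    """Composite Simpson rule with k (even) sub-intervals."""
    h = (b - a) / k
    total = g(a) + g(b)
    for i in range(1, k):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def find_t0(x, tau2, tol=1e-12):
    """Bisection for the power t0 at which E^t[log f] equals log m(x).

    Relies on E^t[log f] being non-decreasing in t, since its derivative
    is the variance of log f under the power posterior.
    """
    target = log_marginal(x, tau2)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_loglik(mid, x, tau2) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


random.seed(0)
n, tau2 = 50, 1.0                      # illustrative values only
x = [random.gauss(0.5, 1.0) for _ in range(n)]

log_m = log_marginal(x, tau2)
# Thermodynamic integration: log m(x) is the integral over t of E^t[log f]
thermo = simpson(lambda t: expected_loglik(t, x, tau2), 0.0, 1.0)
t0 = find_t0(x, tau2)
# WBIC-type plug-in: replace the unknown t0 with 1/log(n)
wbic = expected_loglik(1.0 / math.log(n), x, tau2)

print("exact log m(x)        :", log_m)
print("thermodynamic integral:", thermo)
print("t0, 1/log(n)          :", t0, 1.0 / math.log(n))
print("WBIC-type value       :", wbic)
```

In this toy case the bisection recovers t⁰ essentially exactly, so the gap between the last printed value and the first is purely the cost of replacing t⁰ with 1/log(n); nothing here says how that gap behaves in singular models, which is where WBIC is actually meant to be used.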

The second part of the paper is rather obscure to me, as the motivation for the real log canonical threshold is missing, even though the quantity is connected with the power likelihood and with the DIC effective dimension. It then goes on to propose a new approximation of sBIC, where s stands for singular, of Drton and Plummer (2017), which I had missed (and about which I may ask my colleague Martin later today at Warwick!). Quickly reading through the latter, however, brings explanations about the real log canonical threshold being simply the effective dimension in Schwarz’s BIC approximation to the log marginal,

$\log m(x) \approx \log f(x|\hat{\theta}_n) - \lambda \log n +(m-1)\log\log n$

(as derived by Watanabe), where m is called the multiplicity of the real log canonical threshold. Both λ and m being unknown, Drton and Plummer (2017) estimate the above approximation in a Bayesian fashion, which leads to a doubly indexed marginal approximation for a collection of models. Since this thread leads me further and further from a numerical resolution of the marginal estimation, but brings in a different perspective on mixture Bayesian estimation, I will return to this in a later post. The paper of Imai discusses a different numerical approximation to sBIC, with a potential improvement in computing sBIC. (The paper was proposed as a poster to BayesComp 2020, so I am looking forward to discussing it with the author.)
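As a sanity check on Watanabe's expansion above, here is a small sketch that evaluates it for given λ and m. In a regular model with d parameters, the real log canonical threshold is d/2 and the multiplicity is 1, so the expansion reduces to the usual Schwarz approximation. The function names and every number below are made up for illustration; in particular, the singular values of λ and m are hypothetical and not taken from either paper.

```python
import math


def singular_bic(loglik_hat, lam, m, n):
    """log f(x|theta_hat) - lam * log n + (m - 1) * log log n."""
    return loglik_hat - lam * math.log(n) + (m - 1) * math.log(math.log(n))


def schwarz_bic(loglik_hat, d, n):
    """Schwarz approximation to log m(x), on the log-marginal scale."""
    return loglik_hat - 0.5 * d * math.log(n)


# Made-up illustrative inputs: maximised log-likelihood, sample size, dimension
loglik_hat, n, d = -120.0, 200, 3

# Regular case: lambda = d/2 and m = 1 recover Schwarz's BIC
regular = singular_bic(loglik_hat, lam=d / 2, m=1, n=n)

# Hypothetical singular case: smaller lambda, multiplicity 2
singular = singular_bic(loglik_hat, lam=0.75, m=2, n=n)

print("regular (equals BIC):", regular, schwarz_bic(loglik_hat, d, n))
print("hypothetical singular:", singular)
```

With a smaller λ the penalty is lighter than the BIC one, which is the sense in which λ acts as an effective dimension; the practical difficulty, as stressed above, is that λ and m are unknown.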
