This recent arXiv posting by Martin Weinberg and co-authors was pointed out to me by friends because of its title! It indeed sounded a bit inflated, and it also reminded me of old-style papers where the title was somehow the abstract, like An Essay towards Solving a Problem in the Doctrine of Chances… So I had a look at it on my way to Gainesville. The paper starts from the earlier paper by Weinberg (2012) in Bayesian Analysis, where he uses an HPD region to determine the Bayes factor by a safe harmonic mean estimator (an idea we had already advocated with Jean-Michel Marin in the San Antonio volume and with Darren Wraith in the MaxEnt volume). An extra idea is to try to optimise [against the variance of the resulting evidence] the region over which the integration is performed: “choose a domain that results in the most accurate integral with the smallest number of samples” (p.3). The authors proceed by volume peeling, using some quadrature formula for the posterior coverage of the region, either by Riemann or Lebesgue approximations (p.5). I was fairly lost at this stage, and the third proposal, based on adaptively managing hyperrectangles (p.7), went completely over my head! The sentence “the results are clearly worse with O(∞) errors, but are still remarkably better for high dimensionality” (p.11) did not make sense either… The method may thus be remarkably simple, but the paper is not written in a way that conveys this impression!
Archive for Lebesgue integration
a remarkably simple and accurate method for computing the Bayes factor &tc.
Posted in Statistics with tags Bayes factors, harmonic mean, HPD region, Lebesgue integration, MaxEnt2009, quadrature, Riemann integration, San Antonio, Thomas Bayes on February 13, 2013 by xi'an
Frequency vs. probability
Posted in Statistics with tags E.T. Jaynes, Kolmogorov, Laplace succession rule, Lebesgue integration, measure theory, probability theory, The Bayesian Choice, urn models on May 6, 2011 by xi'an
“Probabilities obtained by maximum entropy cannot be relevant to physical predictions because they have nothing to do with frequencies.” E.T. Jaynes, PT, p.366
“A frequency is a factual property of the real world that we measure or estimate. The phrase `estimating a probability’ is just as much an incongruity as `assigning a frequency’. The fundamental, inescapable distinction between probability and frequency lies in this relativity principle: probabilities change when we change our state of knowledge, frequencies do not.” E.T. Jaynes, PT, p.292
A few days ago, I had the following email exchange with Jelle Wybe de Jong from The Netherlands:
Q. I have a question regarding your slides of your presentation of Jaynes’ Probability Theory. You used the [above second] quote: Do you agree with this statement? It seems to me that a lot of ‘Bayesians’ still refer to ‘estimating’ probabilities. Does it make sense for example for a bank to estimate a probability of default for their loan portfolio? Or does it only make sense to estimate a default frequency and summarize the uncertainty (state of knowledge) through the posterior?
Quadrature methods for evidence approximation
Posted in Statistics with tags Bayesian model choice, evidence, harmonic mean estimator, Lebesgue integration, nested sampling, Voronoi tesselation on November 13, 2009 by xi'an
Two papers written by astronomers have been recently posted on arXiv about (new) ways to approximate evidence. Since they both perceive those approximations as some advanced form of quadrature, they are close enough that a comparison makes sense.
The paper by Rutger van Haasteren uses a Voronoi tessellation to represent the evidence as
$$\int f(x)\,\mathrm{d}x \approx \sum_{i} f(x_i)\,O_i$$
when the $x_i$'s are simulated from the normalised version of $f$ and the $O_i$'s are the (volumes of the) associated Voronoi cells. This approximation converges (even when the $x_i$'s are not simulated from the right distribution) but it cannot be used in practice because of the cost of the Voronoi tessellation. Instead, Rutger van Haasteren suggests using a sort of an approximate HPD region $F_t$ and its volume, $V_t$, along with a harmonic mean within the HPD region:
$$\int f(x)\,\mathrm{d}x \approx V_t\,N \Big/ \sum_{x_i\in F_t} \frac{1}{f(x_i)}$$
where $N$ is the total number of simulations. So in the end this solution is actually the one proposed in our paper with Darren Wraith, as described in this earlier post! It is thus nice to see an application of this idea in a realistic situation, with performances that compare with nested sampling in its MultiNest version of Feroz, Hobson and Bridges. (This is especially valuable when considering that nested sampling is often presented as the only solution to approximating evidence.)
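To make the harmonic mean within an HPD region more concrete, here is a minimal Python sketch on a toy problem. It is not the implementation of either paper: the region is taken to be a Euclidean ball centred at the highest-density draw and containing a chosen fraction of the sample (an assumption made for simplicity, where the papers use more refined approximate HPD regions), and the names `log_ball_volume`, `truncated_harmonic_mean_log_evidence`, `toy_log_density` and the `coverage` parameter are all hypothetical. The estimate is the volume of the region times the total number of draws, divided by the sum of the inverse unnormalised density values over the draws falling inside the region.

```python
import math

import numpy as np


def log_ball_volume(radius, dim):
    """Log-volume of a Euclidean ball of given radius in dimension dim."""
    return (dim / 2) * math.log(math.pi) - math.lgamma(dim / 2 + 1) + dim * math.log(radius)


def truncated_harmonic_mean_log_evidence(samples, log_f, coverage=0.5):
    """Log-evidence estimate from draws of the normalised version of f.

    samples  : (n, dim) array of posterior draws
    log_f    : function mapping an (n, dim) array to the n log-values of the
               unnormalised density f
    coverage : fraction of draws kept inside the ball (illustrative choice)
    """
    samples = np.atleast_2d(np.asarray(samples, dtype=float))
    n, dim = samples.shape
    log_vals = log_f(samples)

    # Crude stand-in for an HPD region: a ball around the highest-density draw
    centre = samples[np.argmax(log_vals)]
    dist = np.linalg.norm(samples - centre, axis=1)
    radius = np.quantile(dist, coverage)
    inside = dist <= radius

    # log of sum_{x_i in region} 1 / f(x_i), computed stably (log-sum-exp)
    neg = -log_vals[inside]
    shift = neg.max()
    log_sum_inv = shift + math.log(np.exp(neg - shift).sum())

    # evidence ~ volume * n / sum of inverse density values inside the region
    return log_ball_volume(radius, dim) + math.log(n) - log_sum_inv


def toy_log_density(x):
    """Unnormalised standard Gaussian, whose evidence is (2*pi)**(dim/2)."""
    return -0.5 * np.sum(x ** 2, axis=1)
```

On this toy target the draws can be produced directly, e.g. with `np.random.default_rng(0).standard_normal((20000, 2))`, and the output compared with the exact value `math.log(2 * math.pi)`. Restricting the sum to a region where the density is bounded away from zero is what keeps the inverse density values, and hence the variance, under control.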
The second paper by Martin Weinberg also adopts a quadrature perspective, while integrating in the Lebesgue sense rather than in the Riemann sense. This perspective applies to nested sampling, even though John Skilling does not justify nested sampling that way, and Martin Weinberg further shows that the (infamous) harmonic mean estimator is a Lebesgue-quadrature approximation as well. The solution proposed in the paper is a different kind of truncation on the functional values, which relates more to nested sampling and on which I hope to report more thoroughly later.
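One way to read the Lebesgue-quadrature remark is through the standard layer-cake identity for a non-negative function, which slices along the values of the integrand rather than along the parameter. The notation is mine and not necessarily the paper's: $L$ is the likelihood, $\pi$ a proper prior, $Z$ the evidence, and $P_\pi$ and $P_{\text{post}}$ denote probabilities under the prior and the posterior, respectively.

```latex
% Evidence as an integral over likelihood levels (the view behind nested sampling)
Z = \int L(\theta)\,\pi(\theta)\,\mathrm{d}\theta
  = \int_0^\infty P_\pi\bigl(L(\theta) > \lambda\bigr)\,\mathrm{d}\lambda

% Harmonic mean identity, written in the same way under the posterior
\frac{1}{Z} = \mathbb{E}_{\text{post}}\!\left[\frac{1}{L(\theta)}\right]
  = \int_0^\infty P_{\text{post}}\!\left(\frac{1}{L(\theta)} > y\right)\mathrm{d}y
```

In the second line the large values of $y$ correspond to low-likelihood regions that posterior draws rarely visit, which is a common explanation for the poor behaviour of the plain harmonic mean estimator and for the appeal of truncating on the functional values.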