This week, I happened to re-read John Storey’ 2003 “The positive discovery rate: a Bayesian interpretation and the q-value”, because I wanted to check a connection with our testing by mixture [still in limbo] paper. I however failed to find what I was looking for because I could not find any Bayesian flavour in the paper apart from an FRD expressed as a “posterior probability” of the null, in the sense that the setting was one of opposing two simple hypotheses. When there is an unknown parameter common to the multiple hypotheses being tested, a prior distribution on the parameter makes these multiple hypotheses connected. What makes the connection puzzling is the assumption that the observed statistics defining the significance region are independent (Theorem 1). And it seems to depend on the choice of the significance region, which should be induced by the Bayesian modelling, not the opposite. (This alternative explanation does not help either, maybe because it is on baseball… Or maybe because the sentence “If a player’s [posterior mean] is above .3, it’s more likely than not that their true average is as well” does not seem to appear naturally from a Bayesian formulation.) [Disclaimer: I am not hinting at anything wrong or objectionable in Storey’s paper, just being puzzled by the Bayesian tag!]
Archive for FDRs
a Bayesian interpretation of FDRs?
Posted in Statistics with tags baseball data, empirical Bayes methods, false discovery rate, FDRs, ferry harbour, FNR, hypothesis testing, multiple tests, Seattle, shrinkage estimation, Washington State on April 12, 2018 by xi'anrobust Bayesian FDR control with Bayes factors [a reply]
Posted in Statistics, University life with tags Bayes factors, design of experiments, Dirichlet process, empirical Bayes methods, FDRs on January 17, 2014 by xi'an(Following my earlier discussion of his paper, Xiaoquan Wen sent me this detailed reply.)
I think it is appropriate to start my response to your comments by introducing a little bit of the background information on my research interest and the project itself: I consider myself as an applied statistician, not a theorist, and I am interested in developing theoretically sound and computationally efficient methods to solve practical problems. The FDR project originated from a practical application in genomics involving hypothesis testing. The details of this particular application can be found in this published paper, and the simulations in the manuscript are also designed for a similar context. In this application, the null model is trivially defined, however there exist finitely many alternative scenarios for each test. We proposed a Bayesian solution that handles this complex setting quite nicely: in brief, we chose to model each possible alternative scenario parametrically, and by taking advantage of Bayesian model averaging, Bayes factor naturally ended up as our test statistic. We had no problem in demonstrating the resulting Bayes factor is much more powerful than the existing approaches, even accounting for the prior (mis-)modeling for Bayes factors. However, in this genomics application, there are potentially tens of thousands of tests need to be simultaneously performed, and FDR control becomes necessary and challenging. Continue reading