Archive for PNAS

how many academics does it take to change… a p-value threshold?

Posted in Books, pictures, Running, Statistics, Travel on August 22, 2017 by xi'an

“…a critical mass of researchers now endorse this change.”

The answer to the lightbulb question seems to be 72: Andrew sent me a short paper recently PsyarXived and to appear in Nature Human Behaviour, following on the .005 not .05 tune we criticised in PNAS a while ago. (Actually a very short paper once the names and affiliations of all authors are taken away.) With indeed 72 authors, many of them my Bayesian friends! I figure the mass signature is aimed at convincing users of p-values that there is a consensus among statisticians. Or a “critical mass” as stated in the note. The following week, Nature ran an entry on this proposal. (With a survey on whether the p-value threshold should change!)

The argument therein [and hence my reservations] is about the same as in Val Johnson’s original PNAS paper, namely that .005 should become the reference cutoff when using p-values for discovering new effects. The tone of the note is mostly Bayesian in that it defends the Bayes factor as a better alternative I would call the b-value. And produces graphs that relate p-values to some minimax Bayes factors, in the simplest possible case of testing for the nullity of a normal mean. Which I do not think is particularly convincing when considering more realistic settings with (many) nuisance parameters and possible latent variables, where numerical answers diverge between p-values and [an infinity of] b-values. And of course the unsolved issue of scaling the Bayes factor. (This without embarking anew upon a full-fledged criticism of the Bayes factor.) As usual, I am also skeptical of mentions of power, since I never truly understood the point of power, which depends on the alternative model, increasingly so with the complexity of this alternative. As argued in our letter to PNAS, the central issue that this proposal fails to address is the urgency in abandoning the notion [indoctrinated in generations of students] that a single quantity and a single bound are the answers to testing issues. Changing the bound sounds like suggesting to repaint a building on the verge of collapse.
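Not having the note’s graphs at hand, here is a back-of-the-envelope R illustration [mine, not the authors’] of the kind of correspondence involved in the normal mean case, using the crudest possible upper bound on the Bayes factor, namely the likelihood ratio maximised over the alternative; the grid of p-values is obviously my own pick.

## crude upper bound on B10 for a point-null normal mean: for any prior under
## the alternative, B10 cannot exceed the likelihood ratio at the MLE, exp(z^2/2)
p <- c(0.05, 0.01, 0.005, 0.001)
z <- qnorm(1 - p / 2)        # two-sided z statistic matching each p-value
round(exp(z^2 / 2), 1)       # roughly 6.8, 27.6, 51.4, 224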

contemporary issues in hypothesis testing

Posted in Statistics on September 26, 2016 by xi'an

This week [at Warwick], among other things, I attended the CRiSM workshop on hypothesis testing, giving the same talk as at ISBA last June. There was a most interesting and unusual talk by Nick Chater (from Warwick) about the psychological aspects of hypothesis testing, namely the unnatural features of an hypothesis in everyday life, i.e., how far this formalism stands from human psychological functioning. Or what we know about it. And then my Warwick colleague Tom Nichols explained how his recent work on permutation tests for fMRI data, published in PNAS, which tested hypotheses that should be null on real data and found a high rate of false positives, got the medical imaging community all up in arms due to over-simplified reports in the media questioning the validity of 15 years of research on fMRI and the related 40,000 papers! For instance, some of the headlines questioned the entire research in the area. Or turned a software bug missing the boundary effects into a major flaw. (See this podcast on Not So Standard Deviations for a thoughtful discussion of the issue.) One conclusion of this story is to be wary of assertions when submitting a hot story to journals with a substantial non-scientific readership! The afternoon talks were equally exciting, with Andrew explaining to us live from New York why he hates hypothesis testing and prefers model building. With the birthday model as an example. And David Draper gave an encompassing talk about the distinctions between inference and decision, proposing a Jaynes information criterion and illustrating it on Mendel’s historical [and massaged!] pea dataset. The next morning, Jim Berger gave an overview of the frequentist properties of the Bayes factor, with in particular a novel [to me] upper bound on the Bayes factor associated with a p-value (Sellke, Bayarri and Berger, 2001)

B¹⁰(p) ≤ 1/(-e p log p)

with the specificity that B¹⁰(p) is not testing the original hypothesis [problem] but a substitute where the null is the hypothesis that p is uniformly distributed, versus a non-parametric alternative that p is more concentrated near zero. This reminded me of our PNAS paper on the impact of summary statistics upon Bayes factors. And of some forgotten reference studying Bayesian inference based solely on the p-value… It is too bad I had to rush back to Paris, as this made me miss the last talks of this fantastic workshop centred on maybe the most important aspect of statistics!
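For the record, here is a one-line R evaluation of this bound at a few conventional significance levels [my own illustration, the levels are my pick]:

## Sellke, Bayarri & Berger (2001) upper bound on B10 as a function of the
## p-value, valid for p < 1/e
bound <- function(p) 1 / (-exp(1) * p * log(p))
round(bound(c(0.05, 0.01, 0.005)), 2)   # roughly 2.46, 7.99, 13.89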

ABC model choice via random forests accepted!

Posted in Books, pictures, Statistics, University life on October 21, 2015 by xi'an

“This revision represents a very nice response to the earlier round of reviews, including a significant extension in which the posterior probability of the selected model is now estimated (whereas previously this was not included). The extension is a very nice one, and I am happy to see it included.” Anonymous

Great news [at least for us], our paper on ABC model choice has been accepted by Bioinformatics! With the pleasant comment above from one anonymous referee. This occurs after quite a prolonged gestation, which actually contributed to a shift in our understanding and our implementation of the method. I am still a wee bit unhappy at the rejection by PNAS, but it paradoxically led to a more elaborate article. So all is well that ends well! Except that the story is not finished, as we are still exploring the multiple usages of random forests in ABC.

ABC model choice via random forests [and no fire]

Posted in Books, pictures, R, Statistics, University life on September 4, 2015 by xi'an

While my arXiv newspage today had a puzzling entry about modelling UFO sightings in France, it also broadcast our revision of Reliable ABC model choice via random forests, a version that we resubmitted today to Bioinformatics after a quite thorough upgrade, the most dramatic part being the realisation that we could also approximate the posterior probability of the selected model via another random forest. (With no connection with the recent post on forest fires!) As discussed a little while ago on the ‘Og. And also in conjunction with our creating the abcrf R package for running ABC model choice out of a reference table. While it has been an excruciatingly slow process (the initial version of the arXived document dates from June 2014, the PNAS submission was rejected for not being Bayesian enough, and the latest revision took the whole summer), the slow maturation of our thoughts on the model choice issues led us to modify the role of random forests in the ABC approach to model choice, in that we revised our earlier assessment that they could only be trusted for selecting the most likely model, by realising this summer that the corresponding posterior probability could be expressed through a posterior expected loss and estimated by a secondary forest. As first considered in Stoehr et al. (2014). (In retrospect, this brings an answer to one of the earlier referee’s comments.) Next goal is to incorporate those changes in DIYABC (and wait for the next version of the software to appear). Another innovation, due to Arnaud: we added a practical implementation section in the format of an FAQ for issues related to the calibration of the algorithms.
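For readers curious about the mechanics, here is a bare-bones R sketch of the two-forest idea [a generic illustration with the randomForest package, not the abcrf or DIYABC implementation; ref.table is a placeholder data frame whose first column, model, holds the simulated model indices as a factor and whose remaining columns are summary statistics, while observed.stats is a one-row data frame of the observed statistics]:

library(randomForest)

## first forest: classify the model index from the summary statistics of the
## ABC reference table (simulated model indices + corresponding statistics)
rf.choice <- randomForest(model ~ ., data = ref.table, ntree = 500)
selected  <- predict(rf.choice, newdata = observed.stats)

## out-of-bag misclassification indicator for each simulation in the table
oob.error <- as.numeric(predict(rf.choice) != ref.table$model)

## second forest: regress this error indicator on the same statistics; its
## prediction at the observed statistics estimates the posterior probability
## of selecting the wrong model, so one minus that prediction approximates
## the posterior probability of the selected model
rf.post   <- randomForest(x = ref.table[, -1], y = oob.error, ntree = 500)
post.prob <- 1 - predict(rf.post, newdata = observed.stats)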

likelihood-free model choice

Posted in Books, pictures, Statistics, University life, Wines on March 27, 2015 by xi'an

Jean-Michel Marin, Pierre Pudlo and I just arXived a short review on ABC model choice, a first version of a chapter for the forthcoming Handbook of Approximate Bayesian Computation edited by Scott Sisson, Yanan Fan, and Mark Beaumont. Except for a new analysis of a Human evolution scenario, this survey mostly argues for the proposal made in our recent paper on the use of random forests and warns about the lack of reliable approximations to posterior probabilities. (A paper that was rejected by PNAS and that is about to be resubmitted. Hopefully with a more positive outcome.) The conclusion of the survey is that

The presumably most pessimistic conclusion of this study is that the connections between (i) the true posterior probability of a model, (ii) the ABC version of this probability, and (iii) the random forest version of the above, are at best very loose. This leaves open queries for acceptable approximations of (i), since the posterior predictive error is instead an error assessment for the ABC RF model choice procedure. While a Bayesian quantity that can be computed at little extra cost, it does not necessarily compete with the posterior probability of a model.

reflecting my hope that we can eventually come up with a proper approximation to the “true” posterior probability…

nested sampling for systems biology

Posted in Books, Statistics, University life on January 14, 2015 by xi'an

In conjunction with the recent PNAS paper on massive model choice, Rob Johnson†, Paul Kirk and Michael Stumpf published in Bioinformatics an implementation of nested sampling that is designed for biological applications, called SYSBIONS. Hence the NS for nested sampling! The C software is available on-line. (I had planned to post this news next to my earlier comments but it went under the radar…)

parallelising MCMC algorithms

Posted in Books, Statistics, University life on December 23, 2014 by xi'an

This paper, A general construction for parallelizing Metropolis-Hastings algorithms, written by Ben Calderhead, was first presented at MCMSki last January and has now appeared in PNAS. It is somewhat related to the recycling idea of Tjelmeland (2004, unpublished) and hence to our 1996 Rao-Blackwellisation paper with George. Although there is no recycling herein.

At each iteration of Ben’s algorithm, N proposed values are generated conditional on the “current” value of the Markov chain, which actually consists of (N+1) components and from which one component is drawn at random to serve as a seed for the next proposal distribution and the simulation of N other values. In short, this is a data-augmentation scheme with the index I on the one side and the N modified components on the other side. The neat trick in the proposal [and the reason for the jump in efficiency] is that the stationary distribution of the auxiliary variable can be determined and hence used (N+1) times in updating the vector of (N+1) components. (Note that picking the index at random means computing all (N+1) possible transitions from one component to the N others. Or even all (N+1)! if the proposals differ. Hence a potential increase in the computing cost, even though what costs the most is usually the likelihood computation, dispatched on the parallel processors.) While there are (N+1) terms involved at each step, the genuine Markov chain is truly a single chain and the N other proposed values are not recycled. Even though they could be [for Monte Carlo integration purposes], as shown e.g. in our paper with Pierre Jacob and Murray Smith. Something that took a few iterations for me to understand is why Ben rephrases the original Metropolis-Hastings algorithm as a finite state space Markov chain on the set of indices {1,…,N+1} (Proposition 1). Conditionally on the values of the (N+1)-vector, the stationary distribution of that sub-chain is no longer uniform. Hence, picking (N+1) indices from this stationary distribution helps in selecting the most appropriate images, which explains why the rejection rate decreases.
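To make the mechanics concrete, here is a bare-bones R sketch of a single iteration [my own toy rendering of the construction, not Ben’s code, for a one-dimensional target and a symmetric Gaussian random-walk kernel; log.target, N and sigma are arbitrary choices]:

one.step <- function(x.seed, log.target, N = 10, sigma = 1) {
  ## the N proposals drawn conditional on the current seed, plus the seed itself
  x <- c(x.seed, rnorm(N, mean = x.seed, sd = sigma))
  ## stationary weight of index i: pi(x_i) times the joint proposal density of
  ## the other N points given x_i, up to a normalising constant
  logw <- sapply(seq_along(x), function(i)
    log.target(x[i]) + sum(dnorm(x[-i], mean = x[i], sd = sigma, log = TRUE)))
  w <- exp(logw - max(logw))
  w <- w / sum(w)
  ## draw indices from this stationary distribution: the corresponding points
  ## are the iteration's output and one of them seeds the next iteration
  idx <- sample(seq_along(x), N, replace = TRUE, prob = w)
  list(samples = x[idx], next.seed = x[sample(seq_along(x), 1, prob = w)])
}

## toy usage with a standard normal target
out <- one.step(0.5, function(x) dnorm(x, log = TRUE))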

The paper indeed evaluates the impact of increasing the number of proposals in terms of effective sample size (ESS), acceptance rate, and mean squared jump distance, based on two examples. As often in parallel implementations, the paper suggests an “N-fold increase in computational speed” even though this is simply the effect of running the same algorithm on a single processor and on N parallel processors. If the comparison is between a single-proposal Metropolis-Hastings algorithm on a single processor and an N-fold proposal on N processors, I would say the latter is slower because of the selection of the index I, which forces the computation of all pairs of reverse moves. Nonetheless, since this is an almost free bonus resulting from using N processors, when compared with more complex coupled chains, it sounds worth investigating and comparing with those more complex parallel schemes.