Archive for Neyman-Pearson

read paper [in Bristol]

Posted in Books, pictures, Statistics, Travel, University life on January 29, 2016 by xi’an

Clifton & Durdham Downs, Bristol, Sept. 25, 2012

I went to give a seminar in Bristol last Friday and I chose to present the testing-with-mixture paper. As we are busy working on the revision, I was eagerly looking for comments and criticisms that could strengthen this new version. As it happened, the (Bristol) Bayesian Cake (Reading) Club had chosen our paper for discussion, two weeks in a row!, hence the title!, and I got invited to join the group the morning prior to the seminar! This was, of course, most enjoyable and relaxed, including a home-made cake!, but also quite helpful in assessing our arguments in the paper. One point of contention, or at least of discussion, was the common parametrisation between the components of the mixture. Although all parametrisations are equivalent from a single-component point of view, I can [almost] see why using a mixture with the same parameter value across all components may impose some unsuspected constraint on that parameter, even when the parameter corresponds to the same moment in both components. This still sounds like a minor counterpoint in that the weight should converge to either zero or one and hence eventually favour the posterior on the parameter corresponding to the “true” model.

Another point raised during the discussion was the behaviour of the method under misspecification, i.e., in an M-open framework: when neither model is correct, does the weight still converge to the boundary associated with the closest model (as I believe), or does a convexity argument produce a non-zero weight as its limit (as hinted by one example in the paper)? I had thought very little about this and hence had just as little to argue, though this does not sound to me like the primary reason for conducting tests. Especially in a Bayesian framework. If one is uncertain about both models to be compared, one should have an alternative at the ready! Or use a non-parametric version, which is a direction we need to explore further before deciding it is coherent and convergent!

A third point of discussion was my argument that mixtures allow us to rely on the same parameter and hence the same prior, whether proper or not, while Bayes factors are less clearly open to this interpretation. This was not uniformly accepted!

Thinking afresh about this approach also led me to broaden my perspective on the use of the posterior distribution of the weight(s) α: while previously I had taken those weights mostly as a proxy to the posterior probabilities, to be calibrated by pseudo-data experiments, as for instance in Figure 9, I now perceive them primarily as the portion of the data in agreement with the corresponding model [or hypothesis] and, more importantly, as a way of staying away from a Neyman-Pearson-like decision. Or error evaluation. Usually, when asked about the interpretation of the output, my answer is to compare the behaviour of the posterior on the weight(s) with the posterior associated with a sample from each model, which does sound somewhat similar to posterior predictives if the samples are simulated from the associated predictives. But the issue was not raised during the visit to Bristol, which possibly reflects on how unfrequentist the audience [the Statistics group] was, as it apparently accepted with no further ado the use of a posterior distribution as a soft assessment of the comparative fits of the different models. If not necessarily agreeing on the need to conduct hypothesis testing (especially in the case of the Pima Indian dataset!).
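As a minimal illustration of this reading of the weight’s posterior, here is a sketch of mine (not code from the paper), with both component densities fully specified for simplicity, whereas the paper estimates the component parameters jointly with α:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 100)          # data actually generated from model M1

# Encompassing mixture for comparing M1: N(0,1) and M2: N(1,1),
#   alpha * f1(x) + (1 - alpha) * f2(x),
# with both components fixed here (the paper estimates their parameters too).
f1 = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
f2 = np.exp(-0.5 * (x - 1)**2) / np.sqrt(2 * np.pi)

# Grid posterior on the weight alpha under a Beta(1/2,1/2) prior
alphas = np.linspace(0.001, 0.999, 999)
logpost = np.array([np.sum(np.log(a * f1 + (1 - a) * f2)) for a in alphas])
logpost += -0.5 * np.log(alphas) - 0.5 * np.log(1 - alphas)
post = np.exp(logpost - logpost.max())
post /= np.trapz(post, alphas)

# Mass piling up near alpha = 1 reads as the portion of the data agreeing
# with M1, a soft assessment rather than an accept/reject decision.
print("posterior mean of alpha:", np.trapz(alphas * post, alphas))
```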

uniformly most powerful Bayesian tests???

Posted in Books, Statistics, University life on September 30, 2013 by xi’an

“The difficulty in constructing a Bayesian hypothesis test arises from the requirement to specify an alternative hypothesis.”

Valen Johnson published (and arXived) a paper in the Annals of Statistics on uniformly most powerful Bayesian tests. This is in line with earlier writings of Valen on the topic and is good-quality mathematical statistics, but I cannot really buy the arguments contained in the paper as being compatible with (my view of) Bayesian tests. A “uniformly most powerful Bayesian test” (acronymed as UMPBT) is defined as

“UMPBTs provide a new form of default, nonsubjective Bayesian tests in which the alternative hypothesis is determined so as to maximize the probability that a Bayes factor exceeds a specified threshold”

which means selecting the prior under the alternative so that the frequentist probability of the Bayes factor exceeding the threshold is maximal for all values of the parameter. This does not sound very Bayesian to me, for several reasons: it averages over all possible values of the observations x; it compares the probabilities for all values of the parameter θ rather than integrating against a prior or posterior; it selects the prior under the alternative with the sole purpose of favouring the alternative, meaning its further use when the null is rejected is not considered at all; and it caters to non-Bayesian theories, i.e., it tries to sell Bayesian tools as supplementing p-values and argues the method is objective because the solution satisfies a frequentist coverage property. (At best, this maximisation of the rejection probability reminds me of minimaxity, except there is no clear and generic notion of minimaxity in hypothesis testing.)
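To see what the construction amounts to in the simplest setting, here is a numerical sketch of mine (not taken from the paper) for testing μ=0 against μ=μ₁ on a normal sample with known variance: the Bayes factor exceeds the threshold γ exactly when the sample mean exceeds a cutoff depending on μ₁, and the UMPBT alternative is the μ₁ minimising that cutoff, hence maximising the rejection probability whatever the true μ.

```python
import numpy as np

# H0: mu = 0 vs H1: mu = mu1, with X1..Xn ~ N(mu, sigma^2), sigma known.
# BF10 = exp{ n (mu1 * xbar - mu1^2 / 2) / sigma^2 } > gamma  iff
#   xbar > t(mu1) = sigma^2 log(gamma) / (n * mu1) + mu1 / 2.
# t is convex on mu1 > 0 with minimum at mu1* = sigma * sqrt(2 log(gamma) / n);
# the same mu1 minimises the cutoff for every true mu, whence "uniformly".
n, sigma, gamma = 25, 1.0, 10.0

def cutoff(mu1):
    return sigma**2 * np.log(gamma) / (n * mu1) + mu1 / 2

mu1_star = sigma * np.sqrt(2 * np.log(gamma) / n)
grid = np.linspace(0.05, 2.0, 400)
assert np.all(cutoff(grid) >= cutoff(mu1_star))   # numerical sanity check

print("UMPBT alternative mu1*:", mu1_star)        # about 0.429 here
print("rejection cutoff on xbar:", cutoff(mu1_star))
```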


reading classics (#9)

Posted in Books, Statistics, University life on February 24, 2013 by xi’an

In today’s classics seminar, my student Bassoum Abou presented the 1981 paper written by Charles Stein for the Annals of Statistics, Estimating the mean of a normal distribution, recapitulating the advances he made on Stein estimators, minimaxity, and his unbiased estimator of risk. Unfortunately, this student missed a lot about the paper and did not introduce the necessary background… So I am unsure how much the class got from this great paper… Here are his slides (watch out for typos!)

Historically, this paper is important as it is one of the very few papers published by Charles Stein in a major statistics journal, the other publications appearing in conference proceedings. It contains the derivation of the unbiased estimator of the loss, along with comparisons with the posterior expected loss.
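For readers not familiar with it, here is a small sketch of mine of Stein’s unbiased risk estimate in the James–Stein case (an illustration, not the paper’s own derivation): for X ~ N_p(θ, I_p) and δ(x) = (1 − (p−2)/‖x‖²)x, Stein’s identity gives the unbiased risk estimate p − (p−2)²/‖x‖², which the code compares with a Monte Carlo evaluation of the actual risk.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n_rep = 10, 100_000
theta = np.full(p, 0.5)                       # arbitrary true mean vector

x = rng.normal(theta, 1.0, size=(n_rep, p))   # X ~ N_p(theta, I_p)
sq = np.sum(x**2, axis=1)                     # ||x||^2
delta = (1 - (p - 2) / sq)[:, None] * x       # James-Stein estimator

# Stein's unbiased estimate of the risk E||delta(X) - theta||^2:
sure = p - (p - 2)**2 / sq

mc_risk = np.mean(np.sum((delta - theta)**2, axis=1))
print("Monte Carlo risk:", mc_risk)           # both agree, and sit well
print("average SURE:    ", np.mean(sure))     # below p = 10, the MLE risk
```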

reading classics (#8)

Posted in Books, Statistics, University life on February 1, 2013 by xi’an

In today’s classics seminar, my student Dong Wei presented the historical paper by Neyman and Pearson on efficient tests: “On the problem of the most efficient tests of statistical hypotheses”, published in the Philosophical Transactions of the Royal Society, Series A. She had a very hard time with the paper… It is not an easy paper, to be sure, and it gets into convoluted and murky waters when it comes to composite hypothesis testing. Once again, it would have been nice to broaden the view on testing by including some of the references given in Dong Wei’s slides.

Listening to this talk, while having neglected to read the original paper for many years (!), I was reflecting on the way tests, Type I & II errors, and critical regions were introduced, without leaving any space for a critical (!!) analysis of the pertinence of those concepts. This is an interesting paper also because it shows the limitations of such a notion of efficiency: apart from the simplest cases, it is close to impossible to achieve, because there is no most powerful procedure (without restricting the range of those procedures). I also noticed from the slides that Neyman and Pearson did not seem to use a Lagrange multiplier to achieve the optimal critical region. (Dong Wei also inverted the comparison of the sufficient and insufficient statistics for the test on the variance, as the one based on the sufficient statistic is more powerful.) In any case, I think I will not keep the paper in my list for next year, maybe replacing it with the Karlin-Rubin (1956) UMP paper…
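To illustrate the optimality at the heart of the paper, here is a quick Monte Carlo sketch of mine under two simple normal hypotheses (my choice of setting, not the paper’s): among tests of the same size, the likelihood-ratio critical region is most powerful, as comparing it with an arbitrary rival region of equal size makes visible.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, n_sim = 0.05, 1_000_000

# Two simple hypotheses on one observation: H0: X ~ N(0,1) vs H1: X ~ N(1,1).
# The likelihood ratio is monotone increasing in x, so the Neyman-Pearson
# critical region of size alpha is {x > c}.
x0 = rng.normal(0.0, 1.0, n_sim)              # draws under H0
x1 = rng.normal(1.0, 1.0, n_sim)              # draws under H1

c_lr = np.quantile(x0, 1 - alpha)             # one-sided cutoff, size alpha
c_2s = np.quantile(np.abs(x0), 1 - alpha)     # rival two-sided region, same size

print("sizes:", np.mean(x0 > c_lr), np.mean(np.abs(x0) > c_2s))      # both ~0.05
print("power, likelihood-ratio region:", np.mean(x1 > c_lr))         # ~0.26
print("power, rival region:          ", np.mean(np.abs(x1) > c_2s))  # ~0.17
```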

Error and Inference [arXived]

Posted in Books, Statistics, University life on November 29, 2011 by xi’an

Following my never-ending series of posts on the book Error and Inference, edited by Deborah Mayo and Aris Spanos (and kindly sent to me by Deborah), I decided to edit those posts into a (slightly) more coherent document, now posted on arXiv, and to submit it as a book review to SIAM Review, even though I did not have high expectations that it would fit the purpose of the journal: the review was rejected between the submission to arXiv and the publication of this post!