## prayers and chi-square

Posted in Books, Kids, Statistics, University life with tags , , , , , , on November 25, 2014 by xi'an

One study I spotted in Richard Dawkins’ The God delusion this summer by the lake is a study of the (im)possible impact of prayer over patient’s recovery. As a coincidence, my daughter got this problem in her statistics class of last week (my translation):

1802 patients in 6 US hospitals have been divided into three groups. Members in group A was told that unspecified religious communities would pray for them nominally, while patients in groups B and C did not know if anyone prayed for them. Those in group B had communities praying for them while those in group C did not. After 14 days of prayer, the conditions of the patients were as follows:

• out of 604 patients in group A, the condition of 249 had significantly worsened;
• out of 601 patients in group B, the condition of 289 had significantly worsened;
• out of 597 patients in group C, the condition of 293 had significantly worsened.

Use a chi-square procedure to test the homogeneity between the three groups, a significant impact of prayers, and a placebo effect of prayer.

This may sound a wee bit weird for a school test, but she is in medical school after all so it is a good way to enforce rational thinking while learning about the chi-square test! (Answers: [even though the data is too sparse to clearly support a decision, esp. when using the chi-square test!] homogeneity and placebo effect are acceptable assumptions at level 5%, while the prayer effect is not [if barely].)

## Number of components in a mixture

Posted in Books, R, Statistics, University life with tags , , , , , on August 6, 2011 by xi'an

I got a paper (unavailable online) to referee about testing for the order (i.e. the number of components) of a normal mixture. Although this is an easily spelled problem, namely estimate k in

$\sum_{i=1}^k p_{ik} \mathcal{N}(\mu_{ik},\sigma^2_{ik}),$

I came to the conclusion that it is a kind of ill-posed problem. Without a clear definition of what a component is, i.e. without a well-articulated prior distribution, I remain unconvinced that k can be at all estimated. Indeed, how can we distinguish between a k component mixture and a (k+1) component mixture with an extremely small (in the sense of the component weight) additional component? Solutions ending up with a convenient chi-square test thus sound unrealistic to me… I am not implying the maths are wrong in any way, simply that the meaning of the test and the nature of the null hypothesis are unclear from a practical and methodological perspective. In the case of normal (but also Laplace) mixtures, the difficulty is compounded by the fact that the likelihood function is unbounded, thus wide open to over-fitting (at least in a non-Bayesian setting). Since Ghosh and Sen (1985), authors have come up with various penalisation functions, but I remain openly atheistic about the approach! (I do not know whether or not this is related with the summer season, but I have received an unusual number of papers to referee lately, e.g., handling three papers last Friday, one on Saturday, and yet another one on Monday morning. Interestingly, about half of them are from  non-statistical journals!)