Archive for psychology

and it only gets worse…

Posted in Kids, pictures on July 14, 2017 by xi'an

“Medicaid pays for most of the 1.4 million people in nursing homes (…) With more than 70 million people enrolled in Medicaid, the program certainly faces long-term financial challenges. Certainly, nursing homes would be part of those cuts, not only in reimbursement rates but in reductions in eligibility for nursing home care.” NYT, June 14, 2017

“…the architects of the Trump contraceptive reversal, Ms. Talento, a White House domestic policy aide, and Mr. Bowman, a top lawyer at the Department of Health and Human Services, have the experience and know-how that others in the administration lack. As a lawyer at the Alliance Defending Freedom, Mr. Bowman assailed the contraceptive coverage mandate on behalf of colleges, universities and nonprofit groups that had religious objections to the rule. Ms. Talento, a former aide to Senator Thom Tillis, Republican of North Carolina, spent years warning about the health risks of birth control pills.” NYT, July 11, 2017

“Mr. Trump’s revised executive order, issued in March, limited travel from six mostly Muslim countries for 90 days and suspended the nation’s refugee program for 120 days. The time was needed, the order said, to address gaps in the government’s screening and vetting procedures. Two federal appeals courts have blocked critical parts of the order. The administration had asked that the lower-court ruling be stayed while the case moves forward. The court granted part of that request in its unsigned opinion.” NYT, June 24, 2017

“The proposed legislation, which Planned Parenthood labels “the worst bill for women’s health in a generation,” would strip the organization of federal funding for one year and bar any federal tax credits from being used to help buy private health plans that cover abortions.” NYT, June 23, 2017

and it only gets worse…

Posted in Kids, pictures on May 30, 2017 by xi'an

“Four years after Texas gave up millions of dollars in federal Medicaid funds so it could ban Planned Parenthood from participating in a family planning program for low-income women, the state is asking the Trump administration for the money back. If the administration agrees to restore the funding for Texas, it could effectively give states the green light to ban Planned Parenthood from Medicaid family planning programs with no financial consequences.” NYT, May 16, 2017

“According to budget documents obtained by the Washington Post, the Trump administration plans to end the Public Service Loan Forgiveness program, which more than 400,000 people are counting on as part of their financial future. Signed into law in 2007 by George W Bush (…), the program offers those whose jobs benefit society – government and non-profit employees – the chance to have their student loans forgiven after 10 years of on-time, income-based payments.” The Guardian, May 20, 2017

“The analogy is pervasive among his critics: Donald Trump is like a child. Making him the president was like making a 4-year-old the leader of the free world. But the analogy is profoundly wrong, and it’s unfair to children. The scientific developmental research of the past 30 years shows that Mr. Trump is utterly unlike a 4-year-old. Four-year-olds care deeply about the truth (…) are insatiably curious (…) can pay attention (…) have a strong moral sense.” NYT, May 21, 2017

“If any president tries to impede an investigation — any president, no matter who it is — by interfering with the F.B.I., yes, that would be problematic,” Senator Marco Rubio, on the Senate Intelligence Committee, said on CNN. “It would be not just problematic. It would be, obviously, a potential obstruction of justice that people have to make a decision on.”  NYT, May 21, 2017

“A bill to dismantle the Affordable Care Act that narrowly passed the House this month would leave 14 million more people uninsured next year than under President Barack Obama’s health law — and 23 million more in 2026. Some of the nation’s sickest would pay much more for health care.” NYT, May 25, 2017

estimation versus testing [again!]

Posted in Books, Statistics, University life on March 30, 2017 by xi'an

The following text is a review I wrote of the paper “Parameter estimation and Bayes factors” by J. Rouder, J. Haaf, and J. Vandekerckhove. (The journal to which it was submitted gave me the option to sign my review.)

The opposition between estimation and testing as a matter of prior modelling rather than inferential goals is quite unusual in the Bayesian literature. In particular, if one follows Bayesian decision theory as in Berger (1985) there is no such opposition, but rather the use of different loss functions for different inference purposes, while the Bayesian model remains single and unitarian.

Following Jeffreys (1939), it sounds more congenial to the Bayesian spirit to return the posterior probability of an hypothesis H⁰ as an answer to the question whether this hypothesis holds or does not hold. This however proves impossible when the “null” hypothesis H⁰ has prior mass equal to zero (or is not measurable under the prior). In such a case the mathematical answer is a probability of zero, which may not satisfy the experimenter who asked the question. More fundamentally, the said prior proves inadequate to answer the question and hence to incorporate the information contained in this very question. This is how Jeffreys (1939) justifies the move from the original (and deficient) prior to one that puts some weight on the null (hypothesis) space. It is often argued that the move is unnatural and that the null space does not make sense, but this only applies when believing very strongly in the model itself. When considering the issue from a modelling perspective, accepting the null H⁰ means using a new model to represent the data, and hence testing becomes a model choice problem, namely whether one should use a complex or a simplified model to represent the generation of the data. This is somehow the “unification” advanced in the current paper, albeit it appears originally in Jeffreys (1939) [and then in numerous others] rather than in the relatively recent Mitchell & Beauchamp (1988), who may have launched the spike & slab denomination.

I have trouble with the analogy drawn in the paper between the spike & slab estimate and the Stein effect. While the posterior mean derived from the spike & slab posterior is indeed a quantity drawn towards zero by the Dirac mass at zero, it is rarely the point in using a spike & slab prior, since this point estimate does not lead to a conclusion about the hypothesis: for one thing it is never exactly zero (if zero corresponds to the null). For another thing, the construction of the spike & slab prior is both artificial and dependent on the weights given to the spike and to the slab, respectively, to borrow expressions from the paper. This approach thus leads to model averaging rather than hypothesis testing or model choice and therefore fails to answer the (possibly absurd) question as to which model to choose. Or refuse to choose. But there are cases when a decision must be made, like continuing a clinical trial or putting a new product on the market. Or not.
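To make the shrinkage point concrete, here is a minimal sketch (in Python, with hypothetical parameter values of my own choosing) of a spike & slab posterior for a normal mean: a Dirac mass at zero with weight w plus a N(0, τ²) slab, observed through a single x ~ N(θ, 1). The posterior mean is indeed drawn towards zero by the spike, but it is never exactly zero for x ≠ 0, and it depends on both w and τ:

```python
from math import exp, pi, sqrt

def normal_pdf(x, sd):
    # density of a centred normal with standard deviation sd
    return exp(-0.5 * (x / sd) ** 2) / (sd * sqrt(2 * pi))

def spike_slab_posterior(x, w=0.5, tau=1.0):
    """Posterior under theta ~ w * delta_0 + (1 - w) * N(0, tau^2), x ~ N(theta, 1)."""
    m_spike = normal_pdf(x, 1.0)                # marginal of x under the spike (theta = 0)
    m_slab = normal_pdf(x, sqrt(1.0 + tau**2))  # marginal of x under the slab
    p0 = w * m_spike / (w * m_spike + (1 - w) * m_slab)  # posterior weight of the null
    # model-averaged posterior mean: 0 with weight p0, conjugate shrinkage mean otherwise
    post_mean = (1 - p0) * x * tau**2 / (1 + tau**2)
    return p0, post_mean

p0, post_mean = spike_slab_posterior(1.5)
print(p0, post_mean)  # the point estimate is shrunk towards zero, never exactly zero
```

The returned pair makes the model-averaging nature of the construction explicit: the point estimate mixes the two models rather than choosing between them.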

In conclusion, the paper surprisingly bypasses the decision-making aspect of testing and hence ends up with an inconclusive setting, staying midstream between Bayes factors and credible intervals, and failing to provide a tool for decision making. The paper also fails to acknowledge the strong dependence of the Bayes factor on the tail behaviour of the prior(s), which cannot be [completely] corrected by a finite sample, hence its relativity and the unreasonableness of a fixed scale like Jeffreys’ (1939).

contemporary issues in hypothesis testing

Posted in Statistics on September 26, 2016 by xi'an

This week [at Warwick], among other things, I attended the CRiSM workshop on hypothesis testing, giving the same talk as at ISBA last June. There was a most interesting and unusual talk by Nick Chater (from Warwick) about the psychological aspects of hypothesis testing, namely about the unnatural features of an hypothesis in everyday life, i.e., how far this formalism stands from human psychological functioning. Or what we know about it. And then my Warwick colleague Tom Nichols explained how his recent work on permutation tests for fMRIs, published in PNAS — testing hypotheses on real data where the null should hold and finding a high rate of false positives — got the medical imaging community all up in arms, due to over-simplified reports in the media questioning the validity of 15 years of research on fMRI and the related 40,000 papers! For instance, some of the headlines questioned the entire research in the area, or transformed a software bug missing the boundary effects into a major flaw. (See this podcast on Not So Standard Deviations for a thoughtful discussion of the issue.) One conclusion of this story is to be wary of assertions when submitting a hot story to journals with a substantial non-scientific readership! The afternoon talks were equally exciting, with Andrew explaining to us live from New York why he hates hypothesis testing and prefers model building, with the birthday model as an example. And David Draper gave an encompassing talk about the distinctions between inference and decision, proposing a Jaynes information criterion and illustrating it on Mendel‘s historical [and massaged!] pea dataset. The next morning, Jim Berger gave an overview of the frequentist properties of the Bayes factor, with in particular a novel [to me] upper bound on the Bayes factor associated with a p-value (Sellke, Bayarri and Berger, 2001)

B¹⁰(p) ≤ 1/(-e p log p)

with the specificity that B¹⁰(p) is not testing the original hypothesis [problem] but a substitute where the null is the hypothesis that p is uniformly distributed, versus a non-parametric alternative that p is more concentrated near zero. This reminded me of our PNAS paper on the impact of summary statistics upon Bayes factors. And of some forgotten reference studying Bayesian inference based solely on the p-value… It is too bad I had to rush back to Paris, as this made me miss the last talks of this fantastic workshop centred on maybe the most important aspect of statistics!
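The bound is trivial to evaluate numerically; the following sketch (the function name is mine) computes it for a few conventional p-values:

```python
from math import e, log

def bayes_factor_bound(p):
    """Sellke, Bayarri & Berger (2001) upper bound on B10 as a function of the p-value.

    Only meaningful for p < 1/e, where the bound exceeds 1."""
    assert 0 < p < 1 / e
    return 1.0 / (-e * p * log(p))

for p in (0.05, 0.01, 0.005):
    print(p, bayes_factor_bound(p))
# a p-value of 0.05 caps the Bayes factor around 2.5, hardly decisive evidence
```

The striking consequence, as Berger stressed, is how weak the evidence carried by a marginally significant p-value can be once translated onto the Bayes factor scale.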

measuring honesty, with p=.006…

Posted in Books, Kids, pictures, Statistics on April 19, 2016 by xi'an

Simon Gächter and Jonathan Schulz recently published a paper in Nature attempting to link intrinsic (individual) honesty with a measure of corruption in the subject’s home country, based on more than 2,500 subjects in 23 countries. [I am now reading Nature on a regular basis, thanks to a subscription in our lab coffee room!] Now, I may sound naïvely surprised at the methodological contents of the paper and at its publication in Nature, but I never read psychology papers, only Andrew’s rants at ’em!!!

“The results are consistent with theories of the cultural co-evolution of institutions and values, and show that weak institutions and cultural legacies that generate rule violations not only have direct adverse economic consequences, but might also impair individual intrinsic honesty that is crucial for the smooth functioning of society.”

The experiment behind this article and its rather deep claims is however quite modest: the authors asked people to roll a die twice without monitoring and rewarded them according to the reported result of the first roll. Being dishonest here means reporting a false result towards a larger monetary gain. This sounds rather artificial and difficult to relate to dishonest behaviours in realistic situations, as I do not see much appeal in cheating for 50 cents or so, even though the experiment accounted for differences in wealth backgrounds by adapting the stakes to the hourly wage in each country (“from $0.7 dollar in Vietnam to $4.2 in the Netherlands“). Furthermore, the subjects of this experiment were undergraduate students in economics departments: depending on the country, this may create a huge bias in terms of social background, as I do not think access to universities is the same in Germany and in Guatemala, say.

“Our expanded scope of societies therefore provides important support and qualifications for the generalizability of these theories—people benchmark their justifiable dishonesty with the extent of dishonesty they see in their societal environment.”

The statistical analysis behind this “important support” is not earth-shattering either. The main argument is based on the empirical cdfs of the gain repartitions per country (in the above graph), with tests that the overall empirical cdf for low-corruption countries is higher than the corresponding one for high-corruption countries. The comparison of the cumulated or pooled cdfs across countries within each group is disputable, in that there is no reason the different countries share the same “honesty” cdf. The groups themselves are built on a rough measure of “prevalence of rule violations”. It is also rather surprising that for both groups the percentage of zero-gain answers is “significantly” larger than the expected value of 2.8% under the assumption of “justified dishonesty”. In any case, there is no compelling argument as to why students not reporting the value of the first roll would naturally opt for the larger of the two rolls. Hence a certain bemusement at this paper appearing in Nature and even deserving an introductory coverage in the first part of the journal…
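For the record, the 2.8% figure is simply the probability of rolling two sixes, 1/36: in the paper’s design reporting a six pays nothing, so under “justified dishonesty” (reporting whichever of the two rolls pays more) a zero-gain report occurs only when both rolls are sixes. A quick simulation (a sketch, with my own coding of the payoffs) confirms it:

```python
import random

random.seed(42)

def payoff(roll):
    # in the Gächter & Schulz design, reporting 1-5 pays that many units, a 6 pays nothing
    return roll if roll < 6 else 0

def justified_dishonesty_gain():
    # roll the die twice, report whichever roll pays more
    r1, r2 = random.randint(1, 6), random.randint(1, 6)
    return max(payoff(r1), payoff(r2))

trials = 100_000
zero_gains = sum(justified_dishonesty_gain() == 0 for _ in range(trials))
print(zero_gains / trials)  # close to 1/36 ≈ 0.028
```

Any observed zero-gain frequency well above 1/36 thus signals that at least some subjects are not maximally (if “justifiably”) dishonest.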

Le Monde and the replication crisis

Posted in Books, Kids, Statistics on September 17, 2015 by xi'an

A rather poor coverage of the latest article in Science on the replication crisis in psychology appeared in Le Monde’s Science & Médecine weekly pages (and was mentioned a few days ago on Andrew’s blog, with the terrific if unrelated poster for Blade Runner…):

L’étude repose également sur le rôle d’un critère très critiqué, la “valeur p”, qui est un indicateur statistique estimant la probabilité que l’effet soit bien significatif. [“The study also relies on a much-criticised criterion, the ‘p-value’, a statistical indicator estimating the probability that the effect is indeed significant.”]

As you may guess from the above (pardon my French!), the author of this summary of the Science article (a) has never heard of a p-value (which translates as niveau de signification in French statistics books) and (b) confuses the probability of exceeding the observed quantity under the null with the probability of the alternative. The remainder of the paper is more classical, pointing out the need for preregistered protocols in experimental sciences, even though it mostly states the evidence, like the decrease in significant effects for preregistered protocols. Apart from this mostly useless entry, there are rather interesting snapshots in the issue: Stephen Hawking’s views on how information could escape a black hole, an IBM software for predicting schizophrenia, Parkinson’s disease as a result of hyperactive neurons, diseased Formica fusca ants taking some harmful drugs to heal, …

reis naar Amsterdam

Posted in Books, Kids, pictures, Running, Statistics, Travel, University life, Wines on April 16, 2015 by xi'an

On Monday, I went to Amsterdam to give a seminar at the University of Amsterdam, in the department of psychology, and to visit Eric-Jan Wagenmakers and his group there. And I had a fantastic time! I talked about our mixture proposal for Bayesian testing and model choice without getting hostile or adverse reactions from the audience, quite the opposite, as we later discussed this new notion for several hours in the café across the street. I also had the opportunity to meet with Peter Grünwald [who authored a book on the minimum description length principle], who pointed out a minor inconsistency of the common parameter approach, namely that the Jeffreys prior on the first model does not have to coincide with the Jeffreys prior on the second model. (The Jeffreys prior for the mixture being unavailable.) He also wondered about a more conservative property of the approach, compared with the Bayes factor, in the sense that the non-null parameter could get closer to the null parameter while still being identifiable.

Among the many persons I met in the department, Maarten Marsman talked to me about his thesis research, Plausible values in statistical inference, which involved handling the Ising model [a non-sparse Ising model with O(p²) parameters] by an auxiliary representation due to Marc Kac, getting rid of the normalising (partition) constant along the way. (Warning, some approximations involved!) He also showed me a simple probit example of the Gibbs sampler getting stuck as the sample size n grows, simply because the uniform conditional distribution on the parameter concentrates faster (in 1/n) than the posterior (in 1/√n). This does not come as a complete surprise, as data augmentation operates in an n-dimensional space and hence requires more time to get around. As a side remark [still worth printing!], Maarten dedicated his thesis “To my favourite random variables, Siem en Fem, and to my normalizing constant, Esther”, from which I hope you can spot the influence of at least two of my book dedications! As I left Amsterdam on Tuesday, I had time for an enjoyable dinner with E-J’s group, an equally enjoyable early morning run [with perfect skies for sunrise pictures!], and more discussions in the department, including a presentation of the new (delicious?!) Bayesian software developed there, JASP, which aims at non-specialists [i.e., researchers unable to code in R, BUGS, or, God forbid!, STAN], and about the consequences of mixture testing in some psychological experiments. Once again, a fantastic time discussing Bayesian statistics and their applications, with a group of dedicated and enthusiastic Bayesians!
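The rate mismatch Maarten described can be mimicked with a toy Gaussian Gibbs sampler (my own construction, not his probit example): take a bivariate normal whose conditional scale is 1/n while the marginal scale is 1/√n, which forces the correlation to satisfy 1 − ρ² = 1/n. Since autocorrelation is scale-free, the sketch below works with standardised coordinates; the lag-one autocorrelation of the resulting chain is ρ² = 1 − 1/n, so mixing degrades linearly with n:

```python
import random

def gibbs_lag1_autocorr(n, iters=20_000, seed=0):
    """Gibbs sampler on a standardised bivariate normal with correlation rho such
    that 1 - rho^2 = 1/n, mimicking a conditional in 1/n against a marginal in 1/sqrt(n)."""
    rng = random.Random(seed)
    rho = (1 - 1 / n) ** 0.5
    cond_sd = (1 - rho**2) ** 0.5  # conditional sd of each standardised coordinate
    x, y = 0.0, 0.0
    xs = []
    for _ in range(iters):
        x = rng.gauss(rho * y, cond_sd)  # x | y
        y = rng.gauss(rho * x, cond_sd)  # y | x
        xs.append(x)
    xs = xs[1000:]  # burn-in
    m = sum(xs) / len(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs[:-1], xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den

print(gibbs_lag1_autocorr(10))    # around 0.9
print(gibbs_lag1_autocorr(1000))  # close to 1: the chain barely moves
```

With n = 1000 the chain is essentially frozen over any reasonable number of iterations, which is exactly the “stuck” behaviour of the probit example, if in a much simplified setting.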