Archive for Significance

p-values and decision-making [reposted]

Posted in Statistics, University life, Books with tags , , , , , , , , , , on August 30, 2017 by xi'an

In a letter to Significance about a review of Robert Matthews’s book, Chancing it, Nicholas Longford recalls a few basic facts about p-values and decision-making earlier made by Dennis Lindley in Making Decisions. Here are some excerpts, worth repeating in the light of the 0.005 proposal:

“A statement of significance based on a p-value is a verdict that is oblivious to consequences. In my view, this disqualifies hypothesis testing, and p-values with it, from making rational decisions. Of course, the p-value could be supplemented by considerations of these consequences, although this is rarely done in a transparent manner. However, the two-step procedure of calculating the p-value and then incorporating the consequences is unlikely to match in its integrity the single-stage procedure in which we compare the expected losses associated with the two contemplated options.”

“At present, [Lindley’s] decision-theoretical approach is difficult to implement in practice. This is not because of any computational complexity or some problematic assumptions, but because of our collective reluctance to inquire about the consequences – about our clients’ priorities, remits and value judgements. Instead, we promote a culture of “objective” analysis, epitomised by the 5% threshold in significance testing. It corresponds to a particular balance of consequences, which may or may not mirror our clients’ perspective.”

“The p-value and statistical significance are at best half-baked products in the process of making decisions, and a distraction at worst, because the ultimate conclusion of a statistical analysis should be a proposal for what to do next in our clients’ or our own research, business, production or some other agenda. Let’s reflect and admit how frequently we abuse hypothesis testing by adopting (sometimes by stealth) the null hypothesis when we fail to reject it, and therefore do so without any evidence to support it. How frequently we report, or are party to reporting, the results of hypothesis tests selectively. The problem is not with our failing to adhere to the convoluted strictures of a popular method, but with the method itself. In the 1950s, it was a great statistical invention, and its popularisation later on a great scientific success. Alas, decades later, it is rather out of date, like the steam engine. It is poorly suited to the demands of modern science, business, and society in general, in which the budget and pocketbook are important factors.”

ABC at sea and at war

Posted in Books, pictures, Statistics, Travel with tags , , , , , , , , , , , on July 18, 2017 by xi'an

While preparing crêpes at home yesterday night, I browsed through the  most recent issue of Significance and among many goodies, I spotted an article by McKay and co-authors discussing the simulation of a British vs. German naval battle from the First World War I had never heard of, the Battle of the Dogger Bank. The article was illustrated by a few historical pictures, but I quickly came across a more statistical description of the problem, which was not about creating wargames and alternate realities but rather inferring about the likelihood of the actual income, i.e., whether or not the naval battle outcome [which could be seen as a British victory, ending up with 0 to 1 sunk boat] was either a lucky strike or to be expected. And the method behind solving this question was indeed both Bayesian and ABC-esque! I did not read the longer paper by McKay et al. (hard to do while flipping crêpes!) but the description in Significance was clear enough to understand that the six summary statistics used in this ABC implementation were the number of shots, hits, and lost turrets for both sides. (The answer to the original question is that indeed the British fleet was lucky to keep all its boats afloat. But it is also unlikely another score would have changed the outcome of WWI.) [As I found in this other history paper, ABC seems quite popular in historical inference! And there is another completely unrelated arXived paper with main title The Fog of War…]

latest issue of Significance

Posted in Statistics with tags , , , , on March 20, 2017 by xi'an

The latest issue of Significance is bursting with exciting articles and it is a shame I do not receive it any longer (not that I stopped subscribing to the RSS or the ASA, but it simply does not get delivered to my address!). For instance, a tribune by Tom Nicolls (from whom I borrowed this issue for the weekend!) on his recent assessment of false positive in brain imaging [I covered in a blog entry a few months ago] when checking the cluster inference and the returned p-values. And the British equivalent of Gelman et al. book cover on the seasonality of births in England and Wales, albeit witout a processing of the raw data and without mention being made of the Gelmanesque analysis: the only major gap in the frequency is around Christmas and New Year, while there is a big jump around September (also there in the New York data).

birdfeedA neat graph on the visits to four feeders by five species of birds. A strange figure in Perils of Perception that [which?!] French people believe 31% of the population is Muslim and that they are lacking behind many other countries in terms of statistical literacy. And a rather shallow call to Popper to running decision-making in business statistics.

To predict and serve?

Posted in Books, pictures, Statistics with tags , , , , , , , , , , , on October 25, 2016 by xi'an

Kristian Lum and William Isaac published a paper in Significance last week [with the above title] about predictive policing systems used in the USA and presumably in other countries to predict future crimes [and therefore prevent them]. This sounds like a good idea for a science fiction plot, à la Philip K Dick [in his short story, The Minority Report], but that it is used in real life definitely sounds frightening, especially when the civil rights of the targeted individuals are impacted. (Although some politicians in different democratic countries increasingly show increasing contempt for keeping everyone’ rights equal…) I also feel terrified by the social determinism behind the very concept of predicting crime from socio-economic data (and possibly genetic characteristics in a near future, bringing us back to the dark days of physiognomy!)

“…crimes that occur in locations frequented by police are more likely to appear in the database simply because that is where the police are patrolling.”

Kristian and William examine in this paper one statistical aspect of the police forces relying on crime prediction software, namely the bias in the data exploited by the software and in the resulting policing. (While the accountability of the police actions when induced by such software is not explored, this is obviously related to the Nature editorial of last week, “Algorithm and blues“, which [in short] calls for watchdogs on AIs and decision algorithms.) When the data is gathered from police and justice records, any bias in checks, arrests, and condemnations will be reproduced in the data and hence will repeat the bias in targeting potential criminals. As aptly put by the authors, the resulting machine learning algorithm will be “predicting future policing, not future crime.” Worse, by having no reservation about over-fitting [the more predicted crimes the better], it will increase the bias in the same direction. In the Oakland drug-user example analysed in the article, the police concentrates almost uniquely on a few grid squares of the city, resulting into the above self-predicting fallacy. However, I do not see much hope in using other surveys and datasets towards eliminating this bias, as they also carry their own shortcomings. Even without biases, predicting crimes at the individual level just seems a bad idea, for statistical and ethical reasons.

a Simpson paradox of sorts

Posted in Books, Kids, pictures, R with tags , , , , , , , , , on May 6, 2016 by xi'an

The riddle from The Riddler this week is about finding an undirected graph with N nodes and no isolated node such that the number of nodes with more connections than the average of their neighbours is maximal. A representation of a connected graph is through a matrix X of zeros and ones, on which one can spot the nodes satisfying the above condition as the positive entries of the vector (X1)^2-(X^21), if 1 denotes the vector of ones. I thus wrote an R code aiming at optimising this target

targe <- function(F){

by mere simulated annealing:

rate <- function(N){ 
# generate matrix F
# 1. no single 
for (i in 2:(N-1)){ 
if (sum(F[,i])==0) 
if (sum(F[,N])==0) 
# 2. more connections 
#simulated annealing
for (t in 1:T){
  #1. local proposal
  while (min(prop%*%rep(1,N))==0){
  if (log(runif(1))*temp<target-targo){ 
#2. global proposal 
  prop=F prop[lower.tri(prop)]=F[lower.tri(prop)]+
  if (log(runif(1))*temp<target-targo){

Eward SimpsonThis code returns quite consistently (modulo the simulated annealing uncertainty, which grows with N) the answer N-2 as the number of entries above average! Which is rather surprising in a Simpson-like manner since all entries but two are above average. (Incidentally, I found out that Edward Simpson recently wrote a paper in Significance about the Simpson-Yule paradox and him being a member of the Bletchley Park Enigma team. I must have missed out the connection with the Simpson paradox when reading the paper in the first place…)

exoplanets at 99.999…%

Posted in Books, pictures, Statistics, University life with tags , , , , , on January 22, 2016 by xi'an

The latest Significance has a short article providing some coverage of the growing trend in the discovery of exoplanets, including new techniques used to detect those expoplanets from their impact on the associated stars. This [presumably] comes from the recent book Cosmos: The Infographics Book of Space [a side comment: new books seem to provide material for many articles in Significance these days!] and the above graph is also from the book, not the ultimate infographic representation in my opinion given that a simple superposition of lines could do as well. Or better.

¨A common approach to ruling out these sorts of false positives involves running sophisticated numerical algorithms, called Monte Carlo simulations, to explore a wide range of blend scenarios (…) A new planet discovery needs to have a confidence of (…) a one in a million chance that the result is in error.”

The above sentence is obviously of interest, first because the detection of false positives by Monte Carlo hints at a rough version of ABC to assess the likelihood of the observed phenomenon under the null [no detail provided] and second because the probability statement in the end is quite unclear as of its foundations… Reminding me of the Higgs boson controversy. The very last sentence of the article is however brilliant, albeit maybe unintentionaly so:

“To date, 1900 confirmed discoveries have been made. We have certainly come a long way from 1989.”

Yes, 89 down, strictly speaking!

the latest Significance: Astrostats, black swans, and pregnant drivers [and zombies]

Posted in Books, Kids, pictures, Statistics, Travel, University life with tags , , , , , , , , , , on February 4, 2015 by xi'an

Reading Significance is always an enjoyable moment, when I can find time to skim through the articles (before my wife gets hold of it!). This time, I lost my copy between my office and home, and borrowed it from Tom Nichols at Warwick with four mornings to read it during breakfast. This December issue is definitely interesting, as it contains several introduction articles on astro- and cosmo-statistics! One thing I had not noticed before is how a large fraction of the papers is written by authors of books, giving a quick entry or interview about their book. For instance, I found out that Roberto Trotta had written a general public book called the Edge of the Sky (All You Need to Know About the All-There-Is) which exposes the fundamentals of cosmology through the 1000 most common words in the English Language.. So Universe is replaced with All-There-Is! I can understand and to some extent applaud the intention, but it nonetheless makes for a painful read, judging from the excerpt, when researcher and telescope are not part of the accepted vocabulary. Reading the corresponding article in Significance let me a bit bemused at the reason provided for the existence of a multiverse, i.e., of multiple replicas of our universe, all with different conditions: multiplying the universes makes our more likely, while it sounds almost impossible on its own! This sounds like a very frequentist argument… and I am not even certain it would convince a frequentist. The other articles in this special astrostatistics section were of a more statistical nature, from estimating the number of galaxies to the chances of a big asteroid impact. Even though I found the graphical representation of the meteorite impacts in the past century because of the impact drawing in the background. However, when I checked the link to Carlo Zapponi’s website, I found the picture was a still of a neat animation of meteorites falling since the first report.

Continue reading