## Archive for Significance

## a statistic with consequences

Posted in pictures, Statistics with tags aluminum, Australia, green-washing, marketing, plastic waste, Royal Statistical Society, Significance, Sylvia Richardson on July 18, 2019 by xi'an

**I**n the latest Significance, there was a flyer with some member updates, an important one being that Sylvia Richardson had been elected the next president of the Royal Statistical Society. Congratulations to my friend Sylvia! Another item was that the publication of the 2018 RSS Statistic of the Year has led an Australian water company to switch from plastic to aluminum. Hmm, what about switching to nothing and supporting a use-your-own bottle approach? While it is correct that aluminum cans can be made of 100% recycled aluminum, this water company does not appear to make any concerted effort to ensure its cans are made of recycled aluminum or to increase the recycling rate for aluminum in Australia towards the rates achieved by Brazil (92%) or Japan (86%). (Another shocking statistic that could have been added to the 90.5% non-recycled plastic waste [in the World?] is that a water bottle consumes the equivalent of one-fourth of *its contents in oil* to produce.) Another US water company still promotes water bottles as “one of the most effective and inert carbon capture & sequestration methods”..! There is no boundary for green-washing.

## impossible estimation

Posted in Books, Statistics with tags algorithmic policing, anthropology, Congo, DRC, excess deaths, France, Kivu, Le Monde, Significance, societal statistics, Syria, Syrian civil war on January 29, 2018 by xi'an

**O**utside its Sciences & Médecine section that I most often read, Le Monde published last weekend a tribune by the anthropologist Michel Naepels [who kindly replied to my email on his column] on the impossibility of evaluating the number of deaths in Congo due to the political instability (a weak and undemocratic state fighting armed rebel groups), for lack of a reliable sample. With a huge gap between two estimations of this number, from 200,000 to 5.4 million excess deaths. In the latter, IRC states that “only 0.4 percent of all deaths across DR Congo were attributed directly to violence”. Still, diverging estimates do not mean numbers are impossible to produce, just that more elaborate methods like those developed by Rebecca Steorts for Syrian deaths must be investigated. Which requires more means than those available to the local States (assuming they are interested in the answer) or to NGOs. This also raises the question of whether or not “excess deaths” has an absolute meaning, since it refers to a hypothetical state of the world that has not taken place.

On the same page, another article written by geographers cast doubt on predictive policing software, not used in France, if not as clearly as the Significance article by Kristian Lum and William Isaac last year.

## p-values and decision-making [reposted]

Posted in Books, Statistics, University life with tags 0.005, 0.05, books, decision theory, Dennis Lindley, hypothesis testing, Nicholas T. Longford, p-values, Robert Matthews, Significance, statistical significance on August 30, 2017 by xi'an

***I**n a letter to Significance about a review of Robert Matthews’s book, Chancing it, Nicholas Longford recalls a few basic facts about p-values and decision-making earlier made by Dennis Lindley in Making Decisions. Here are some excerpts, worth repeating in the light of the 0.005 proposal:*

“A statement of significance based on a p-value is a verdict that is oblivious to consequences. In my view, this disqualifies hypothesis testing, and p-values with it, from making rational decisions. Of course, the p-value could be supplemented by considerations of these consequences, although this is rarely done in a transparent manner. However, the two-step procedure of calculating the p-value and then incorporating the consequences is unlikely to match in its integrity the single-stage procedure in which we compare the expected losses associated with the two contemplated options.”

“At present, [Lindley’s] decision-theoretical approach is difficult to implement in practice. This is not because of any computational complexity or some problematic assumptions, but because of our collective reluctance to inquire about the consequences – about our clients’ priorities, remits and value judgements. Instead, we promote a culture of “objective” analysis, epitomised by the 5% threshold in significance testing. It corresponds to a particular balance of consequences, which may or may not mirror our clients’ perspective.”

“The p-value and statistical significance are at best half-baked products in the process of making decisions, and a distraction at worst, because the ultimate conclusion of a statistical analysis should be a proposal for what to do next in our clients’ or our own research, business, production or some other agenda. Let’s reflect and admit how frequently we abuse hypothesis testing by adopting (sometimes by stealth) the null hypothesis when we fail to reject it, and therefore do so without any evidence to support it. How frequently we report, or are party to reporting, the results of hypothesis tests selectively. The problem is not with our failing to adhere to the convoluted strictures of a popular method, but with the method itself. In the 1950s, it was a great statistical invention, and its popularisation later on a great scientific success. Alas, decades later, it is rather out of date, like the steam engine. It is poorly suited to the demands of modern science, business, and society in general, in which the budget and pocketbook are important factors.”
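As a rough illustration of the “single-stage procedure” mentioned in the excerpts, here is a minimal R sketch that picks the option with the smaller expected loss. It assumes that a probability `p0` for the null hypothesis is available, which the letter does not specify, and the function name `choose_action` and both loss tables are made up for the example.

```r
# pick the option with the smallest expected loss
# loss[action, state], with states ordered as (H0 true, H0 false)
choose_action <- function(p0, loss){
  exp_loss <- loss %*% c(p0, 1 - p0)     # expected loss of each option
  rownames(loss)[which.min(exp_loss)]
}

opts <- list(c("keep", "reject"), c("H0", "H1"))
# hypothetical symmetric consequences: either error costs 1
loss_sym  <- matrix(c(0, 1,
                      1, 0), nrow = 2, byrow = TRUE, dimnames = opts)
# hypothetical asymmetric consequences: wrongly keeping H0 costs 10
loss_asym <- matrix(c(0, 10,
                      1, 0), nrow = 2, byrow = TRUE, dimnames = opts)

choose_action(0.8, loss_sym)    # "keep":   0.2 against 0.8
choose_action(0.8, loss_asym)   # "reject": 2.0 against 0.8
```

With the same evidence (`p0 = 0.8`), the two loss tables lead to opposite choices, which is one way to read the remark that a fixed threshold “corresponds to a particular balance of consequences”.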

## ABC at sea and at war

Posted in Books, pictures, Statistics, Travel with tags ABC, Approximate Bayesian computation, Battle of the Dogger Bank, counterfactuals, crêpes, first World War, history, Jutland, naval battle, Significance, The Fog of War, wargame on July 18, 2017 by xi'an

**W**hile preparing crêpes at home last night, I browsed through the most recent issue of Significance and, among many goodies, I spotted an article by McKay and co-authors discussing the simulation of a British vs. German naval battle from the First World War I had never heard of, the Battle of the Dogger Bank. The article was illustrated by a few historical pictures, but I quickly came across a more statistical description of the problem, which was not about creating wargames and alternate realities but rather about inferring the likelihood of the actual outcome, i.e., whether or not the naval battle outcome [which could be seen as a British victory, ending up with 0 to 1 sunk boat] was either a lucky strike or to be expected. And the method behind solving this question was indeed both Bayesian and ABC-esque! I did not read the longer paper by McKay et al. (hard to do while flipping crêpes!) but the description in Significance was clear enough to understand that the six summary statistics used in this ABC implementation were the number of shots, hits, and lost turrets for both sides. (The answer to the original question is that indeed the British fleet was lucky to keep all its boats afloat. But it is also unlikely another score would have changed the outcome of WWI.) [As I found in this other history paper, ABC seems quite popular in historical inference! And there is another completely unrelated arXived paper with main title The Fog of War…]
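For readers unfamiliar with ABC, the generic rejection version can be sketched in a few lines of R. This is a toy illustration only, not the model of McKay et al.: it uses a single summary statistic (a number of hits) instead of their six, a uniform prior on a hit probability, and made-up numbers. The function `abc_reject` and its arguments are hypothetical names.

```r
# ABC rejection sampler for a single hit probability (toy illustration)
abc_reject <- function(obs_hits, n_shots, n_sim = 1e4, tol = 1){
  p   <- runif(n_sim)                             # draws from a uniform prior
  sim <- rbinom(n_sim, size = n_shots, prob = p)  # simulated summary: number of hits
  p[abs(sim - obs_hits) <= tol]                   # keep draws whose summary is within tol
}

set.seed(42)
post <- abc_reject(obs_hits = 6, n_shots = 100)   # made-up numbers, not battle data
quantile(post, c(0.025, 0.5, 0.975))              # approximate posterior summary
```

The retained values of `p` form an approximate posterior sample. Shrinking `tol` tightens the approximation at the price of fewer accepted draws.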

## latest issue of Significance

Posted in Statistics with tags Bayesian data analysis, birthrate, Karl Popper, Royal Statistical Society, Significance on March 20, 2017 by xi'an

**T**he latest issue of Significance is bursting with exciting articles and it is a shame I do not receive it any longer (not that I stopped subscribing to the RSS or the ASA, but it simply does not get delivered to my address!). For instance, a tribune by Tom Nichols (from whom I borrowed this issue for the weekend!) on his recent assessment of false positives in brain imaging [which I covered in a blog entry a few months ago] when checking the cluster inference and the returned p-values. And the British equivalent of the Gelman et al. book cover on the seasonality of births in England and Wales, albeit without a processing of the raw data and without mention being made of the Gelmanesque analysis: the only major gap in the frequency is around Christmas and New Year, while there is a big jump around September (also there in the New York data).

A neat graph on the visits to four feeders by five species of birds. A strange figure in Perils of Perception that [which?!] French people believe 31% of the population is Muslim and that they are lagging behind many other countries in terms of statistical literacy. And a rather shallow call on Popper for running decision-making in business statistics.

## a Simpson paradox of sorts

Posted in Books, Kids, pictures, R with tags Bletchley Park, Edward Simpson, Enigma code machine, graph, mathematical puzzle, Significance, Simpson's paradox, simulated annealing, The Riddler, Yule on May 6, 2016 by xi'an

**T**he riddle from The Riddler this week is about finding an undirected graph with N nodes and no isolated node such that the number of nodes with more connections than the average of their neighbours is maximal. A representation of a connected graph is through a matrix X of zeros and ones, on which one can spot the nodes satisfying the above condition as the positive entries of the vector (X**1**)^2-(X^2**1**), if **1** denotes the vector of ones. I thus wrote some R code aiming at optimising this target

    targe <- function(F){
      N=nrow(F) # size taken from the matrix rather than from a global N
      sum(F%*%F%*%rep(1,N)/(F%*%rep(1,N))^2<1)}

by mere simulated annealing:

    rate <- function(N){
      # generate matrix F
      # 1. no single
      F=matrix(0,N,N)
      F[sample(2:N,1),1]=1
      F[1,]=F[,1]
      for (i in 2:(N-1)){
        if (sum(F[,i])==0)
          F[sample((i+1):N,1),i]=1
        F[i,]=F[,i]}
      if (sum(F[,N])==0)
        F[sample(1:(N-1),1),N]=1
      F[N,]=F[,N]
      # 2. more connections
      F[lower.tri(F)]=F[lower.tri(F)]+
        sample(0:1,N*(N-1)/2,rep=TRUE,prob=c(N,1))
      F[F>1]=1
      F[upper.tri(F)]=t(F)[upper.tri(t(F))]
      #simulated annealing
      T=1e4
      temp=N
      targo=targe(F)
      for (t in 1:T){
        #1. local proposal
        nod=sample(1:N,2)
        prop=F
        prop[nod[1],nod[2]]=prop[nod[2],nod[1]]=
          1-prop[nod[1],nod[2]]
        while (min(prop%*%rep(1,N))==0){
          nod=sample(1:N,2)
          prop=F
          prop[nod[1],nod[2]]=prop[nod[2],nod[1]]=
            1-prop[nod[1],nod[2]]}
        target=targe(prop)
        if (log(runif(1))*temp<target-targo){
          F=prop;targo=target}
        #2. global proposal
        prop=F
        prop[lower.tri(prop)]=F[lower.tri(prop)]+
          sample(c(0,1),N*(N-1)/2,rep=TRUE,prob=c(N,1))
        prop[prop>1]=1
        prop[upper.tri(prop)]=t(prop)[upper.tri(t(prop))]
        target=targe(prop)
        if (log(runif(1))*temp<target-targo){
          F=prop;targo=target}
        temp=temp*.999
      }
      return(F)}

This code returns quite consistently (modulo the simulated annealing uncertainty, which grows with N) the answer N-2 as the number of entries above average! Which is rather surprising in a Simpson-like manner since all entries but two are above average. (Incidentally, I found out that Edward Simpson recently wrote a paper in Significance about the Simpson-Yule paradox and about his being a member of the Bletchley Park Enigma team. I must have missed the connection with the Simpson paradox when reading the paper in the first place…)
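The N-2 value can be checked by hand on a small graph. The sketch below counts the positive entries of (X**1**)^2-(X^2**1**) for a complete graph with one edge removed. This test graph and the function names `count_above` and `almost_complete` are assumptions chosen for illustration, not the output of the annealing run above.

```r
# number of nodes whose degree exceeds the average degree of their neighbours
count_above <- function(X){
  one <- rep(1, nrow(X))
  deg <- as.vector(X %*% one)        # degrees, X1
  nbr <- as.vector(X %*% X %*% one)  # sums of neighbours' degrees, X^2 1
  sum(deg^2 - nbr > 0)
}

# hypothetical test case: complete graph on N nodes minus the edge {1,2}
almost_complete <- function(N){
  X <- matrix(1, N, N)
  diag(X) <- 0
  X[1, 2] <- X[2, 1] <- 0
  X
}

count_above(almost_complete(6))  # 4, i.e. N-2
```

In this graph, nodes 1 and 2 have degree N-2 while all their neighbours have degree N-1, so they fall below average. Every other node has degree N-1 and two neighbours with degree N-2, so it sits above average.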