Archive for awful graphs

terrible graph, again

Posted in Statistics on September 3, 2018 by xi'an

graph of the day & AI4good versus AI4bad

Posted in Books, pictures, Statistics on July 15, 2018 by xi'an

Apart from the above graph from Nature, rendering in a most appalling and meaningless way the uncertainty about the number of active genes in the human genome, I read a couple of articles in this issue of Nature relating to the biases and dangers of societal algorithms. One of them sounded very close to the editorial in the New York Times on which Kristian Lum commented on this blog, with the attached snippet on what is fair and unfair (or not).

The second article was more surprising as it defended the use of algorithms for more democracy. Nothing less. Written by Wendy Tam Cho, professor of political science, law, statistics, and mathematics at UIUC, it argued that the software she develops to construct electoral maps produces fair maps. Which sounds over-rosy imho, as aiming to account for all social, ethnic, income, &c., groups, i.e., most of the axes that define a human, is meaningless, if only because the structure of these groups is not frozen in time. To state that “computers are impervious to the lure of power” is borderline ridiculous, as computers and algorithms are [so far] driven by humans. This is not to say that gerrymandering should not be fought by technological means, especially and obviously by open source algorithms, as existing proposals (discussed here) demonstrate, but to entertain the notion of a perfectly representative redistricting is not only illusory, but also far from democratic, as it shies away from the one person, one vote principle at the basis of democracy. And the paper leaves us in the dark as to who will decide which groups or which characteristics need to be represented in the votes. Of course, this is the impression obtained by reading a one-page editorial in Nature [in an overcrowded and sweltering commuter train] rather than the relevant literature. Nonetheless, I remain puzzled as to why this editorial was ever published. (Speaking of democracy, the issue also contains warning reports about Hungary’s ultra-right government taking over the Hungarian Academy of Sciences.)

another terrible graph [about Neanderthalia]

Posted in Kids, pictures, Statistics on March 3, 2018 by xi'an

Another terrible graph that misses by several orders of magnitude the terrible singularity of the United States in terms of homicides by firearms. And that does not include either the other [dove]tail, like the United Kingdom (0.7 per million) and Japan (0.1 per million).

bad graphics and poor statistics

Posted in Statistics on February 21, 2018 by xi'an

Reading through The Guardian website, I came across this terrible graphic comparing US airlines in 2016 on killing the pets they carry. Beyond the gross imprecision resulting from resorting to a (gross) dead dog scale to report integers, the impression of Hawaiian Airlines having a beef with pets is just misleading: there were three animal deaths with this company that year. And nine with United Airlines (including the late giant rabbit). The law of small numbers in action! Computing a basic p-value (!) based on a Poisson approximation (the most pet friendly distribution) does not even exclude Hawaiian Airlines. Without even considering the possibility that, among the half-million plus pets travelling on US airlines in 2016, some would have died anyway and simply happened to do so during a flight. (As a comparison, there are “between 114 and 360 medical” in-flight [human] deaths per year. For what it’s worth.) The scariest part of The Guardian article [beyond the reliance on terrible graphs!] is the call to end pets travelling as cargo, meaning they would join their owners in the cabin. As if stag and hen [parties] were not enough of a travelling nuisance..!
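For the curious, the kind of Poisson check alluded to above can be sketched as follows. This is only a minimal sketch, not the computation behind the post: the function name `poisson_upper_tail` is mine, and the expected counts are hypothetical placeholders, since the number of pets carried by each airline and the overall death total are not reproduced here. Only the observed count of three deaths comes from the text.

```python
import math

def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Sum P(X = i) for i = 0..k-1, then take the complement.
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below

# Observed deaths for the airline (from the post).
observed_deaths = 3

# HYPOTHETICAL expected counts: in practice this would be
# (pets carried by the airline) * (industry-wide death rate),
# neither of which is given in the post.
for expected in (0.5, 1.0, 2.0):
    p_value = poisson_upper_tail(observed_deaths, expected)
    print(f"expected = {expected:.1f}  ->  P(X >= {observed_deaths}) = {p_value:.4f}")
```

The loop is there to show how strongly the conclusion depends on the expected count, hence on how many pets the airline actually carried, which is exactly the information the dead dog scale hides.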

another terrible graph

Posted in Books, pictures, Statistics on January 18, 2015 by xi'an

Le Monde illustrated an article about discrimination against women with this graph, which gives the number of men per 100 women for each continent. This is a fairly poor graph, fit for one of Tufte’s counterexamples: the bars are truncated at 85, make little sense as they do not convey the time dimension, and are dwarfed by the legend on the left, which does not use the same colours. They also miss the population dimension, which makes the title inappropriate, since the graph does not show why there are more men than women on the planet, even if Asia’s large share of the world’s population hints at the result.
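To see how much a truncated axis distorts the comparison, here is a small sketch. The two sex ratios are made-up values for illustration only (they are not read off the Le Monde graph), and the helper name `drawn_ratio` is mine; only the truncation point of 85 comes from the description above.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the bar lengths for values a and b when the axis starts at `baseline`."""
    if b <= baseline:
        raise ValueError("b must exceed the baseline for its bar to have positive length")
    return (a - baseline) / (b - baseline)

# HYPOTHETICAL values: 105 and 100 men per 100 women.
high, low = 105.0, 100.0

true_ratio = drawn_ratio(high, low)          # axis starting at 0
seen_ratio = drawn_ratio(high, low, 85.0)    # bars truncated at 85, as in the graph

print(f"ratio of the values:     {true_ratio:.3f}")
print(f"ratio of the drawn bars: {seen_ratio:.3f}")
```

With these illustrative values, a 5% difference between the two figures is drawn as a 33% difference between the two bars.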

A dubious statistic

Posted in Books, R, Statistics on June 1, 2011 by xi'an

Following a link on R-bloggers, I ended up on this page (with a completely useless graph that only contained two pieces of information, 5% in 1900 and 55% in 2000). The author (Ralph Keeney) reports that “A remarkable 55 percent of deaths for people age 15 to 64 can be attributed to decisions with readily available alternatives.” This sounded to me like a highly dubious finding… So I looked at the paper itself, reading that

“A personal decision is a situation where an individual can make a choice among two or more alternatives. This assumes that the individual recognizes that he or she has a choice and has control of this choice. Readily available alternatives are alternatives that the decision maker would have known about and could have chosen without investing much time or money.” Ralph Keeney

This categorisation of deaths is highly debatable, in that choice is not always that available! So I do not see how the author can assert which percentage of the individuals truly have control of the choice… (For instance, can people refuse to do dangerous jobs when they desperately need a job? or when the dangerousness is an abstract concept as, say, for a Fukushima worker? Is obesity a sheer matter of will?) Furthermore, the jump from 5% to 55% is also highly shaky: “Clearly, one should not put much credibility in this 22% for 1950 or the corresponding 5% for 1900”. In the end, it seems that the whole issue of the paper is about the amount of information: “in 1900 the knowledge about and ability to avoid many of the causes of death would seem to be much lower than in 2000”. So life has not been getting more dangerous or people sillier, simply information about the causes of death has become more widespread. I am thus surprised at the low level of academic input contained in the paper (look at the “life-saving decisions”!), which may actually explain the echo it found on the blogosphere. (This post also appeared on the Statistics Forum.)

Stacking up chores, haphazardly…

Posted in Statistics on August 2, 2009 by xi'an

A fairly awful “statistical” graph appeared today in the Business section of the Sunday New York Times. It slices the daily chores of a population along the day by percentages, which means that erratic dynamics for one activity (sport, say) impact all activities stacked on top of it. This is one problem. Another one is that we naturally react to surface representation more than to slices, so the global information on how the day is spent is hard to read. Since activities are piled up, comparing two populations with respect to one single activity is almost impossible, except for sleep, which is the bottom activity. Not to mention the side effects of how the questions were asked of the participants, how non-response was processed, and so on… And the biases resulting from a raw interpretation like “On an average weekday, the unemployed sleep an hour more than their employed peers”, which does not account for the differences between the categories, like the fact that people may be unemployed because they are sick, and other side effects.
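The first problem is easy to reproduce. In the sketch below, the activity names and percentages are invented for illustration (they are not the New York Times data), and the helper name `band_edges` is mine: the share of work is held constant, yet the band drawn for it moves because sport, stacked underneath, fluctuates.

```python
from itertools import accumulate

def band_edges(shares):
    """Return the (lower, upper) edges of each stacked band, bottom to top."""
    tops = list(accumulate(shares))
    bottoms = [0.0] + tops[:-1]
    return list(zip(bottoms, tops))

activities = ["sleep", "sport", "work"]   # stacking order, bottom first

# HYPOTHETICAL shares (in percent) at two times of day; work stays at 30.
shares_by_hour = {
    "10:00": [10.0, 2.0, 30.0],
    "11:00": [10.0, 12.0, 30.0],
}

for hour, shares in shares_by_hour.items():
    for name, (lower, upper) in zip(activities, band_edges(shares)):
        width = upper - lower
        print(f"{hour} {name:5s} band from {lower:5.1f} to {upper:5.1f} (width {width:4.1f})")
```

Only sleep, sitting at the bottom, keeps a fixed baseline, which is why it is the one activity that can be compared at a glance.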