the latest Significance: Astrostats, black swans, and pregnant drivers [and zombies]

Reading Significance is always an enjoyable moment, when I can find time to skim through the articles (before my wife gets hold of it!). This time, I lost my copy between my office and home, and borrowed one from Tom Nichols at Warwick, with four mornings to read it during breakfast. This December issue is definitely interesting, as it contains several introductory articles on astro- and cosmo-statistics! One thing I had not noticed before is how large a fraction of the papers is written by authors of books, giving a quick entry or interview about their book. For instance, I found out that Roberto Trotta had written a book for the general public called The Edge of the Sky (All You Need to Know About the All-There-Is), which presents the fundamentals of cosmology using only the 1000 most common words in the English language. So Universe is replaced with All-There-Is! I can understand and to some extent applaud the intention, but it nonetheless makes for a painful read, judging from the excerpt, when researcher and telescope are not part of the accepted vocabulary. Reading the corresponding article in Significance left me a bit bemused at the reason provided for the existence of a multiverse, i.e., of multiple replicas of our universe, all with different conditions: multiplying the universes makes ours more likely, while it sounds almost impossible on its own! This sounds like a very frequentist argument… and I am not even certain it would convince a frequentist. The other articles in this special astrostatistics section were of a more statistical nature, from estimating the number of galaxies to the chances of a big asteroid impact. I found the graphical representation of the meteorite impacts in the past century hard to read because of the impact drawing in the background. However, when I checked the link to Carlo Zapponi’s website, I found the picture was a still of a neat animation of meteorites falling since the first report.

“Taleb himself, once described as a philosopher, now self-identifies as a statistician. And, intrinsically, anti-fragility and statistical thinking are interrelated.” T. Bendell

Two rather superfluous [in my opinion] articles dealt with a regression of zombie Google entries associated with each U.S. state (written by Daniel Zelterman, in connection with his chapter in the book Mathematical Modelling of Zombies, where I discovered the unexpected name of Mark Girolami [as a writer, not as a zombie cyclist!]) and with something about X’mas crackers that I have not read further than the title. Yet another entry related to a book was Tony Bendell’s discussion of his recent book on Building Anti-Fragile Organisations, written in the wake of Taleb’s book, Antifragile. (Reviewed by Larry Wasserman on the now defunct Normal Deviate.)

And I have not mentioned pregnant drivers yet: one entry was by two Canadian epidemiologists who studied the accident rate of pregnant women and concluded that the risk increases during pregnancy. I did not read the original paper so cannot make an informed comment, but I still wonder about the possible impact of a higher tendency for pregnant women to be sent to hospital after a minor car accident. There could also be other confounding factors, like an increased mileage during pregnancy (certainly when compared with immediately after). And, since the study covers only women who completed their pregnancy and were still alive one year later, it excludes those who had severe or fatal crashes before starting a pregnancy or during their pregnancy. Another possible caveat is that, due to the rather limited length of the study, the years of observation may have an impact on the observed rise. The data is taken from Ontario, where winter may be rather fierce (!), and corrections for both seasonality and the overall number of crashes should have been considered.

3 Responses to “the latest Significance: Astrostats, black swans, and pregnant drivers [and zombies]”

  1. As for the accident rates of pregnant women, I originally worked with the first author, Donald A. Redelmeier, trying to design a study on cell phones and driving (199?). I suggested randomly dialing drivers on their cell phones and comparing their accident rates to those randomly not dialed.

    But seriously, he decided on self controls (comparing earlier periods to the period the accident occurred in) and recruited Rob Tibshirani to do the statistical analysis and be the co-author. I (privately) reviewed the work for Donald (as he was a fellow of my director) and he is aware of the challenges but I think he was always a bit too positive about the approach.

    I am not going to read the current work; it seems to be an n’th kick at the same type of design. It won’t be naive, though not fully convincing either, and it is unlikely that others will be able to improve on it. One has to accept unquantified inference risks if one doesn’t/can’t randomise.

  2. “This sounds like a very frequentist argument… and I am not even certain it would convince a frequentist.”

    Regardless of whether it sounds frequentist or would convince a frequentist, the frequentist philosophy of statistics certainly spawned it.

    At heart frequentism is the claim that the product rule P(A & B) = P(A|B)P(B) only applies for some well defined A’s and B’s but not others. It applies (supposedly) to something they call “random variables”. Without that restriction, you immediately get full Bayes whether you like it or not.

    There is nothing in the mathematics to suggest, let alone require, such a restriction. Moreover, Frequentists have never given an objective operational definition of “random variable” which can be used to decide what’s an RV and what isn’t.

    Next week’s stock prices are conventionally treated as “random variables” and frequentists will assign probabilities to them. The outcome of the Presidential election in 2016 is somehow not a random variable and so frequentists won’t assign a probability to it like Nate Silver successfully did in the 2012 election.

    Neither next week’s stock prices nor the outcome of the 2016 election are repeatable. Frequentists just seem to have an undefinable intuition, which changes over time and differs from one to the next, about which variables are RV’s.

    The advantage of the multiverse for frequentists is that many formerly “non-repeatable” variables suddenly become “random variables”, and frequentists can magically assign probabilities to them. The variables haven’t physically changed in any way. All that’s happened is that frequentists invented a wild fiction which instantly makes them comfortable using the product rule.

    And all of this rigmarole, together with the other millions of man-hours wasted denying the applicability of the product rule, is done for no other purpose than to save Frequentists the trouble of thinking hard about what P(A) and P(B) might mean in general.

    What a sad, stupid way to do science.

