Archive for FiveThirtyEight

distracting redistricting?

Posted in Books, Statistics on August 26, 2021 by xi'an

“We at FiveThirtyEight will be tracking the whole redistricting process, from proposed maps to final maps, so watch this space for updates!”

FiveThirtyEight is keeping a tracker on the “redistricting” of U.S. states, namely the decennial redrawing of electoral districts. This is still an early stage, when no map has yet been validated by a state legislature, and hence I cannot tell whether or not FiveThirtyEight will be analysing gerrymandering in a statistical manner, that is, figuring out how extreme a given map is within the collection of all possible electoral maps. The States being the States, the rules vary widely between them, from the legislators themselves setting the boundaries (while sometimes being very open about their intention to favour their own side) to independent commissions being in charge. I did not spot any clear involvement of statisticians in the process.

“The application of differential privacy will bring significant harm to Alabama (…) The Census Bureau has not shown that other disclosure avoidance methods would not satisfy the privacy requirements” Case No. 3:21-cv-00211

While looking at this highly informative webpage maintained by Doug Spencer of the University of Colorado Law School, I came across this federal court challenge by the State of Alabama against the Census Bureau for using differential privacy! A statistical version of “shoot the messenger”?! The legal argument of the State is “the Fifth Amendment, alleging that differential privacy is a violation of the one-person, one-vote principle and will result in the dilution of their votes.” I wonder, however, what the genuine (political) reason for this challenge is!

top of the top

Posted in Statistics on August 19, 2021 by xi'an

An easy-peasy riddle from The Riddler about the probability that a random variable is the largest among ten iid variates, conditional on the event that this random variable is larger than the upper decile. This is easily written as

10\int_{q_{90}}^\infty F^9(x) f(x)\,\text d x

if F and f are the cdf and pdf, respectively. Since 10F⁹f is the derivative of F¹⁰ and F(q₉₀)=.9, this is equal to 1-.9¹⁰, approximately 1-e⁻¹, no matter what the (continuous) F is….
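As a sanity check of the closed form, here is a minimal Monte Carlo sketch in R. The choice of Exp(1) variates, the seed, and the number of simulations are my own assumptions for illustration, since the result should not depend on F.

```r
# Monte Carlo check of P(X1 is the largest of ten | X1 > upper decile).
# Assumption: Exp(1) variates are an arbitrary choice; any continuous F should do.
set.seed(101)
n_sim <- 1e5
x <- matrix(rexp(10 * n_sim), ncol = 10)   # each row is one sample of ten iid variates
q90 <- qexp(0.9)                           # upper decile of the Exp(1) distribution
above <- x[, 1] > q90                      # conditioning event on the first variate
is_max <- x[, 1] == apply(x, 1, max)       # first variate is the largest of the ten
estimate <- mean(is_max[above])            # conditional frequency
exact <- 1 - 0.9^10                        # closed form, about 0.651
print(c(estimate = estimate, exact = exact))
```

Replacing rexp and qexp with another pair such as rnorm and qnorm should leave the estimate essentially unchanged, up to Monte Carlo error.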

multinomial but unique

Posted in Kids, R, Statistics on July 16, 2021 by xi'an

A quick riddle from the Riddler, where the multinomial M(n₁,n₂,100-n₁-n₂) probability of getting three different labels out of three possible ones in three draws is 20%, which induces a single possible value for (n₁,n₂) up to a permutation.

Since this probability is n₁n₂(100-n₁-n₂)/161,700, where 161,700 is the number of ways of choosing 3 items out of 100, there indeed happens to be only one decomposition of 32,340 as 21 x 35 x 44. The number of possible values for the probability is actually 796, with potentially large gaps between successive values of n₁n₂(100-n₁-n₂) as shown by the above picture.
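The uniqueness claim can be checked by brute force. The following is a minimal R sketch, where the variable names (grid, target, sols) are mine and where I assume that each of the three counts is at least one.

```r
# Enumerate ordered triples n1 <= n2 <= n3 of positive counts summing to 100
# (assumption: every label has at least one item).
total <- 100
target <- choose(total, 3) / 5             # 20% of 161,700, i.e. 32,340
grid <- expand.grid(n1 = 1:(total - 2), n2 = 1:(total - 2))
grid$n3 <- total - grid$n1 - grid$n2
grid <- grid[grid$n1 <= grid$n2 & grid$n2 <= grid$n3, ]
grid$prod <- grid$n1 * grid$n2 * grid$n3   # numerator of the probability
sols <- grid[grid$prod == target, ]        # triples giving exactly 20%
print(sols)
# number of distinct attainable numerators under the assumption above
print(length(unique(grid$prod)))
```

The only row returned in sols should be the triple (21, 35, 44), while the last line gives the number of distinct attainable values, which may differ from the figure quoted above depending on whether empty labels are allowed.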

one-way random walks

Posted in Kids, R, Statistics on May 2, 2021 by xi'an

A rather puzzling riddle from The Riddler on a 3×3 directed grid and the probability of getting from the North-West to the South-East node by following the arrows. Puzzling because, while the solution could be reasonably computed with R code like

for(i in 1:2^12){
  for(j in 1:12)sol=max(sol,

where paz is the list of the 12 possible paths from North-West to South-East (excluding loops!), leading to a probability of 1135/2¹², I could not find a logical argument to reach this number. The paths of length 4, 6, 8 are valid in 2⁸, 2⁶, 2⁴ of the cases, respectively and logically!, but this does not help as the corresponding events are dependent.
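For the record, here is a self-contained R sketch of the exhaustive enumeration. It is not the code from the post (which relied on the list paz of paths) but an alternative that checks reachability directly. It assumes that each of the 12 edges of the grid is oriented independently and uniformly, and the node numbering (1 to 9, row by row) and the function name reaches are mine.

```r
# Nodes are numbered 1..9 row by row; node 1 is North-West, node 9 is South-East.
edges <- rbind(c(1, 2), c(2, 3), c(4, 5), c(5, 6), c(7, 8), c(8, 9),  # horizontal
               c(1, 4), c(2, 5), c(3, 6), c(4, 7), c(5, 8), c(6, 9))  # vertical
n_edges <- nrow(edges)

# Is node 9 reachable from node 1 under the orientation encoded by `code`?
reaches <- function(code) {
  bits <- (code %/% 2^(0:(n_edges - 1))) %% 2        # one bit per edge
  from <- ifelse(bits == 1, edges[, 1], edges[, 2])  # tail of each arrow
  to   <- ifelse(bits == 1, edges[, 2], edges[, 1])  # head of each arrow
  reached <- 1
  repeat {  # grow the set of reachable nodes until it stabilises
    grown <- unique(c(reached, to[from %in% reached]))
    if (length(grown) == length(reached)) break
    reached <- grown
  }
  9 %in% reached
}

# Count the orientations (out of 2^12) that connect North-West to South-East
n_connected <- sum(vapply(0:(2^n_edges - 1), reaches, logical(1)))
print(c(count = n_connected, probability = n_connected / 2^n_edges))
```

If the enumeration is right, the count should agree with the 1135 out of 2¹² orientations reported above. Checking reachability rather than looping over paths avoids having to list the 12 loop-free paths by hand.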

Take thrice thy money, bid me tear the bond

Posted in Books, Kids on April 21, 2021 by xi'an

A rather fun riddle for which the pen&paper approach proved easier than coding in R, for once. It goes as follows: starting with one Euro, one sequentially predicts a sequence of four random bits, betting an arbitrary fraction of one’s money at each round. When winning, the bet is doubled, otherwise, it is lost. Under the information that the same bit cannot occur thrice in a row, what is the optimal minimax gain?

Three simplifications: (i) each bet is a fraction ε of the current fortune of the player, which is itself a product of factors (1±ε) from the previous bets, (ii) since the outcome is 0 or 1, the sign of ε can encode the predicted bit, so this fraction can be chosen in (-1,1), and (iii) the optimal choice is ε=1 when the outcome is known, i.e., when both previous bits are identical. The final fortune of the player is thus of the form

(1\pm\epsilon)(1\pm\epsilon')(1\pm\epsilon'')(1\pm\epsilon''')

if the outcome is alternating (e.g., 0101 or 0100), while it is of the form

2(1\pm\epsilon)(1\pm\epsilon')(1\pm\epsilon'')

if there are two identical successive bits in the first three results (e.g., 1101 or 0110). When choosing each of the fractions ε, the minimum final gain must be maximised. This implies that ε=0 for the bet on the final bit when the outcome is uncertain (and ε=1 otherwise). In case of an alternating début, like 01, the minimal gain is

(1\pm\epsilon)(1\pm\epsilon')\min\{1+\epsilon'',\,2(1-\epsilon'')\}

which is maximised by ε’’=1/3, taking the objective value 4(1±ε)(1±ε’)/3. This leads to the gain after the first bit being

(1\pm\epsilon)\min\{4(1+\epsilon')/3,\,2(1-\epsilon')\}

which is maximised by ε’=1/5, for the objective value 8(1±ε)/5. By symmetry, the optimal choice is ε=0, which ends up with a final fortune of 8/5, that is, a minimax gain of 3/5. [The quote is from Shakespeare, in The Merchant of Venice.]
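Although pen&paper was enough here, the backward induction can be sketched in a few lines of R. The function name minimax_value is mine, and the code relies on a shortcut that is not spelled out above: when the next bit is uncertain and the continuation values are a and b, maximising min{(1+ε)a, (1-ε)b} equalises both terms, at ε=(b-a)/(a+b), for a value of 2ab/(a+b).

```r
# Minimax final fortune (starting from 1) given the bits observed so far.
# Assumption: an uncertain bet is resolved by equalising both outcomes,
# which yields the value 2ab/(a+b) for continuation values a and b.
minimax_value <- function(history = integer(0), n_bits = 4) {
  k <- length(history)
  if (k == n_bits) return(1)                  # no bet left
  if (k >= 2 && history[k] == history[k - 1]) {
    # same bit twice: the next bit is known, bet everything and double
    return(2 * minimax_value(c(history, 1 - history[k]), n_bits))
  }
  a <- minimax_value(c(history, 0), n_bits)   # continuation value if next bit is 0
  b <- minimax_value(c(history, 1), n_bits)   # continuation value if next bit is 1
  2 * a * b / (a + b)                         # value of the equalising bet
}

value <- minimax_value()
print(c(final_fortune = value, gain = value - 1))
```

Calling minimax_value(c(0, 1)) returns the value of the game after an alternating début (4/3 per unit of fortune), and changing n_bits gives the analogous values for longer sequences.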