Archive for FiveThirtyEight

a one-chance meeting [puzzle]

Posted in Books, Kids, pictures, R with tags , , , , , , on March 6, 2018 by xi'an

This afternoon, I took a quick look at the current Riddler puzzle, which sums up as, given three points A, B, C, arbitrarily moving on a plane with a one-shot view of their respective locations, find a moving rule to bring the three together at the same point at the same time. And could not spot the difficulty.

The solution seems indeed obvious when expressed as above rather than in the tell-tale format of the puzzle. Since every triangle has a circumscribed circle, and all points on that circle are obviously at the same distance of the centre O, the three points have to aim at the centre O. Assuming they all move at the same velocity, they will reach O together…

The question gets a wee bit more interesting when the number of points with the same one-time one-shot view option grows beyond 3, as these points will almost surely not all lie on a single circumscribed circle. While getting them together can be done by (a) finding the largest circle going through 3  points and containing all others [in case there is no such circle, adding an artificial point solves the issue!], triplet on which one can repeat the above instructions to reach O, and (b) bringing all points inside the circle to meet with one of the three points [the closest] on its straight-line way to O, by finding a point on that line at equal distance from both, a subsidiary question is whether or not this is the fastest way. Presumably not.  (Again I may be missing one item of the puzzle.)

When experimenting with a short R code, I quickly figured out that the circumscribed circles associated with all triplets do not necessarily contain all points. The resolution of this difficulty is however straightforward as it suffices to add an artificial point by considering all circumcentres and their distances to the farthest point, minimising over these distances and adding the extra point at random over the circumference. As in the example below.Incidentally, it took me much longer to write this post than to solve the puzzle, as I was trying to use the R function, which supposedly turns a centre and a radius into the corresponding circle, but somehow misses its target by adapting the curve to the area being displayed. I am still uncertain of what the function means. I hence ended up writing a plain R function for this purpose:


cyclic riddle [not in NYC!]

Posted in Kids, R, Running, Travel with tags , , , , , , , , , , on December 29, 2017 by xi'an

In the riddle of this week on fivethirtyeight, the question is to find the average number of rounds when playing the following game: P=6 players sitting in a circle each have B=3 coins and with probability 3⁻¹ they give one coin to their right or left side neighbour, or dump the coin to the centre. The game ends when all coins are in the centre. Coding this experiment in R is rather easy

while (max(situz)>0){
 for (i in (1:P)[unz<1/3]) 
 for (i in (1:P)[unz>2/3])

resulting in an average of 15.58, but I cannot figure out (quickly enough) an analytic approach to the problem. (The fit above is to a Gamma(â-1,ĝ) distribution.)

In a completely unrelated aparté (aside), I read earlier this week that New York City had prohibited the use of electric bikes. I was unsure of what this meant (prohibited on sidewalks? expressways? metro carriages?) so rechecked and found that electric bikes are simply not allowed for use anywhere in New York City. Because of the potential for “reckless driving”. The target is apparently the high number of delivery e-bikes, but this ban sounds so absurd that I cannot understand it passed. Especially when De Blasio has committed the city to the Paris climate agreement after Trump moronically dumped it… Banning the cars would sound much more in tune with this commitment! (A further aparté is that I strongly dislike e-bikes, running on nuclear power plants,  especially when they pass me on sharp hills!, but they are clearly a lesser evil when compared with mopeds and cars.)

a quincunx on NBC

Posted in Books, Kids, pictures, Statistics with tags , , , , , , , , , , on December 3, 2017 by xi'an

Through Five-Thirty-Eight, I became aware of a TV game call The Wall [so appropriate for Trumpian times!] that is essentially based on Galton’s quincunx! A huge [15m!] high version of Galton’s quincunx, with seven possible starting positions instead of one, which kills the whole point of the apparatus which is to demonstrate by simulation the proximity of the Binomial distribution to the limiting Normal (density) curve.

But the TV game has obvious no interest in the CLT, or in the Beta binomial posterior, only in a visible sequence of binary events that turn out increasing or decreasing the money “earned” by the player, the highest sums being unsurprisingly less likely. The only decision made by the player is to pick one of the seven starting points (meaning the outcome should behave like a weighted sum of seven Normals with drifted means depending on the probabilities of choosing these starting points). I found one blog entry analysing an “idiot” strategy of playing the game, but not the entire game. (Except for this entry on the older Plinko.) And Five-Thirty-Eight surprisingly does not get into the optimal strategies to play this game (maybe because there is none!). Five-Thirty-Eight also reproduces the apocryphal quote of Laplace not requiring this [God] hypothesis.

[Note: When looking for a picture of the Quincunx, I also found this desktop version! Which “allows you to visualize the order embedded in the chaos of randomness”, nothing less. And has even obtain a patent for this “visual aid that demonstrates [sic] a random walk and generates [re-sic] a bell curve distribution”…]

easy riddle

Posted in Books, Kids, R with tags , , , , , on July 12, 2017 by xi'an

From the current Riddler, a problem that only requires a few lines of code and a few seconds of reasoning. Or not.

N households each stole the earnings from one of the (N-1) other households, one at a time. What is the probability that a given household is not burglarised? And what are the expected final earnings of each household in the list, assuming they all start with $1?

The first question is close to Feller’s enveloppe problem in that


is close to exp(-1) for N large. The second question can easily be solved by an R code like

for (v in 1:M){
 while (min(abs(vole-(1:N)))==0)
 for (t in 1:N){

which returns a pretty regular exponential-like curve, although I cannot figure the exact curve beyond the third burglary. The published solution gives the curve

{\frac{N-2}{N-1}}^{999}\times 2+{\frac{1}{N-1}}^{t-1}\times{\frac{N-1}{N}}^{N-t}\times\frac{N}{N-1}

corresponding to the probability of never being robbed (and getting on average an extra unit from the robbery) and of being robbed only before robbing someone else (with average wealth N/(N-1)).

continental divide

Posted in Books, Kids, pictures, R with tags , , , , on May 19, 2017 by xi'an

While the Riddler puzzle this week was anticlimactic,  as it meant filling all digits in the above division towards a null remainder, it came as an interesting illustration of how different division is taught in the US versus France: when I saw the picture above, I had to go and check an American primary school on-line introduction to division, since the way I was taught in France is something like that

with the solution being that 12128316 = 124 x 97809… Solved by a dumb R exploration of all constraints:

for (y in 111:143)
for (z4 in 8:9)
for (oz in 0:999){
  if ((digz[2]==0)&(x>=1e7)&(x<1e8)){ 
   if ((digz[5]*y>=1e3)&(digz[4]*y<1e4) &(r1>9)&(r1<100)){ 
    if ((7*y>=1e2)&(7*y<1e3)&(r2>=1e2)&(r2<1e3)){     
     if ((digz[3]*y>=1e2)&(digz[3]*y<1e3)&(r3>9)&(r3<1e2)){
       if (r4<y) solz=rbind(solz,c(y,z,x))

Looking for a computer-free resolution, the constraints on z exhibited by the picture are that (a) the second digit is 0 and the fourth digit is 7.  Moreover, the first and fifth digits are larger than 7 since y times these digits is a four-digit number. Better, since the second subtraction from a three-digit number by 7y returns a three-digit number and the third subtraction from a four-digit number by ny returns a two-digit number, n is larger than 7 but less than the first and fifth digits. Ergo, z is necessarily 97809! Furthermore, 8y<10³ and 9y≥10³, which means 111<y<125. Plus the constraint that 1000-8y≤99 implies y≥112. Nothing gained there! This leaves 12 values of y to study, unless there is another restriction I missed…

“In short, the French presidential election is a mess”

Posted in Statistics with tags , , , , , , on April 23, 2017 by xi'an

Harry Enten (and not Nate Silver as reported by Le Monde) published yesterday a post on Five-Thirty-Eight about the unpredictability of the French elections. Which essentially states the obvious, namely that the four major candidates all stand a chance to make it to the runoff. (The post classifies Macron as a former left-wing socialist, which shows a glaring misunderstanding of the candidate or a massive divergence of what left-wing means between France and the USA.) The tribune states both that the polls could exhibit a bigger mistake than in the previous elections and that Le Pen score is unlikely to be underestimated, because voters are no longer shy to acknowledge they vote for a fascist candidate. One argument for the error in the polls is attributed to pollsters “herding” their results, i.e., shrinking the raw figures towards the global average taken over previous polls. A [rather reasonable] correction dismissed by Le Monde and French pollsters. While Enten argues that the variability of the percentages over fifty polls is too small to be plausible, assuming a Normal distribution that may not hold because French pollsters use quotas to build their polling population. In any case, this analysis, while cautious and reasonably so!, does not elaborate on the largest question mark, the elephant in the room, namely the percentage of abstentions today and their distribution among the political spectrum, which may eventually make the difference tonight. Indeed, “the bottom line is that we don’t know what’s going to happen on Sunday.” And it is definitely frightening!

optimultiplication [a riddle]

Posted in Books, Kids, R, Statistics with tags , , , , , on April 14, 2017 by xi'an

The riddle of this week is about an optimisation of positioning the four digits of a multiplication of two numbers with two digits each and is open to a coding resolution:

Four digits are drawn without replacement from {0,1,…,9}, one at a time. What is the optimal strategy to position those four digits, two digits per row, as they are drawn, toward minimising the average product?

Although the problem can be solved algebraically by computing E[X⁴|x¹,..] and E[X⁴X³|x¹,..]  I wrote three R codes to “optimise” the location of the first three digits: the first digit ends up as a unit if it is 5 or more and a multiple of ten otherwise, on the first row. For the second draw, it is slightly more variable: with this R code,

for (t in 1:N)
if (i<5){ #10*i

the resulting digit again ends up as a unit if it is 5 (except when x¹=7,8,9, where it is 4) or more and a multiple of ten otherwise, but on the second row. Except when x¹=0, x²=1,2,3,4, when they end up on the first row together, 0 obviously in front.

For the third and last open draw, there is only one remaining random draw, which mean that the decision only depends on x¹,x²,x³ and E[X⁴|x¹,x²,x³]=(45-x¹-x²-x³)/7. Attaching x³ to x² or x¹ will then vary monotonically in x³, depending on whether x¹>x² or x¹<x²:

if ((i<1)&(j<5)){ pos=c(10*comean+k,comean+10*k)}
if ((i<5)&(j>4)){ pos=c(100*i*comean+k*j,j*comean+100*i*k)}
if ((i>0)&(i<5)&(j<5)){ pos=c(i*comean+k*j,j*comean+i*k)} 
if ((i<7)&(i>4)&(j<5)){ pos=c(i*comean+100*k*j,j*comean+100*i*k)} 
if ((i<7)&(i>4)&(j>4)){ pos=c(i*comean+k*j,j*comean+i*k)}
if ((i>6)&(j<4)){ pos=c(i*comean+100*k*j,j*comean+100*i*k)} 
if ((i>6)&(j>3)){ pos=c(i*comean+k*j,j*comean+i*k)}

Running this R code for all combinations of x¹,x² shows that, except for the cases x¹≥5 and x²=0, for which x³ invariably remains in front of x¹, there are always values of x³ associated with each position.