## down with Galton (and Pearson and Fisher…)

Posted in Books, Statistics, University life on July 22, 2019 by xi'an

In the last issue of Significance, which I read in Warwick prior to the conference, there is a most interesting article on Galton’s eugenics, his heritage at University College London (UCL), and the overall trouble with honouring prominent figures of the past with memorials like named buildings or lectures… The starting point of this debate is a protest from some UCL students and faculty about UCL having a lecture room named after the late Francis Galton, who was a professor there and who further donated most of his fortune to the university at his death, towards creating a professorship in eugenics. The protests are about Galton’s involvement in the eugenics movement of the late 19th and early 20th century, as well as his professing racist opinions.

My first reaction after reading about these protests was why not?! Named places or lectures, as well as statues and other memorials, have a limited utility, especially when the named person is long dead, and they certainly do not contribute to making a scientific theory [associated with the said individual] more appealing or more valid. And since “humans are [only] humans”, to quote Stephen Stigler speaking in this article, it is unrealistic to expect great scientists to be perfect, all the more so if one multiplies the codes for ethical or acceptable behaviours across ages and cultures. It is also more rational to use amphitheater MS.02 and lecture room AC.18 rather than associate them with one name chosen out of many alumni’s or former professors’.

Predictably, another reaction of mine was why bother?!, as removing Galton’s name from the items it is attached to is highly unlikely to change current views on eugenics or racism. On the contrary, it seems to detract from opposing the present versions of these ideologies, such as some recent proposals linking genes and some form of academic success. Another of my (multiple) reactions was that, as stated in the article, these views of Galton’s reflected the views and prejudices of the time, when the notions of races and of inequalities between races (as well as genders and social classes) were almost universally accepted, including in scientific publications like the Proceedings of the Royal Society and Nature. Karl Pearson thus launched the Annals of Eugenics in 1925 (after he started Biometrika) with the very purpose of establishing a scientific basis for eugenics. (An editorship that Ronald Fisher would later take over, along with his views on the differences between races, believing that “human groups differ profoundly in their innate capacity for intellectual and emotional development”.) Starting from these prejudiced views, Galton set up a scientific and statistical approach to support them, by accumulating data and possibly modifying some of these views. But without much empathy for the consequences, as shown in this terrible quote I found when looking for more material:

“I should feel but little compassion if I saw all the Damaras in the hand of a slave-owner, for they could hardly become more wretched than they are now…”

As it happens, my first exposure to Galton was in my first probability course at ENSAE, when a terrific professor was peppering his lectures with historical anecdotes and used to mention Galton’s data-gathering trip to Namibia, literally measuring local inhabitants in support of his physiognomical views, also reflected in his attempt to superpose photographs to achieve the “ideal” thief…

## a quincunx on NBC

Posted in Books, Kids, pictures, Statistics on December 3, 2017 by xi'an

Through Five-Thirty-Eight, I became aware of a TV game called The Wall [so appropriate for Trumpian times!] that is essentially based on Galton’s quincunx! A huge [15m high!] version of Galton’s quincunx, with seven possible starting positions instead of one, which kills the whole point of the apparatus, which is to demonstrate by simulation the proximity of the Binomial distribution to the limiting Normal (density) curve.

But the TV game obviously has no interest in the CLT, or in the Beta binomial posterior, only in a visible sequence of binary events that turn out increasing or decreasing the money “earned” by the player, the highest sums being unsurprisingly less likely. The only decision made by the player is to pick one of the seven starting points (meaning the outcome should behave like a mixture of seven Normals with shifted means, weighted by the probabilities of choosing these starting points). I found one blog entry analysing an “idiot” strategy of playing the game, but not the entire game. (Except for this entry on the older Plinko.) And Five-Thirty-Eight surprisingly does not get into the optimal strategies to play this game (maybe because there is none!). Five-Thirty-Eight also reproduces the apocryphal quote of Laplace not requiring this [God] hypothesis.
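As a rough illustration of both points, here is a minimal R sketch of my own making, not taken from the show or from Five-Thirty-Eight: the number of peg rows, the seven starting offsets, and the uniform choice among them are all arbitrary assumptions. With a single starting point, the final slot is a Binomial count that the Normal curve approximates well, while mixing over seven starting points spreads the outcome much further.

```r
set.seed(1)
n_rows  <- 50      # assumed number of peg rows (not the actual height of The Wall)
n_balls <- 10^4    # number of simulated balls

# single starting point: final slot = number of right bounces ~ Binomial(n_rows, 1/2)
single <- rbinom(n_balls, size = n_rows, prob = 0.5)
z <- (single - n_rows / 2) / sqrt(n_rows / 4)   # standardised slot

# seven starting points, picked uniformly here (an assumption): mixture with shifted means
starts <- sample(seq(-15, 15, by = 5), n_balls, replace = TRUE)
mixed  <- starts + rbinom(n_balls, size = n_rows, prob = 0.5)

# empirical versus Normal probability, and spread with one versus seven starting points
c(P_single = mean(z <= 1), P_normal = pnorm(1),
  sd_single = sd(single), sd_mixed = sd(mixed))
```

The first two entries should be close to one another, while the last two show how much the choice of the starting point inflates the dispersion.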

[Note: When looking for a picture of the Quincunx, I also found this desktop version! Which “allows you to visualize the order embedded in the chaos of randomness”, nothing less. And has even obtained a patent for this “visual aid that demonstrates [sic] a random walk and generates [re-sic] a bell curve distribution”…]

## another Galton-Watson riddle

Posted in Statistics on February 3, 2017 by xi'an

The riddle on the Riddler this week is definitely a classic, since it rephrases the standard Galton-Watson branching process (which should have been called Bienaymé’s process, as he established the relation before Watson, while the jack-of-all-trades Francis Galton only posed the question):

At the beginning, there is a single microorganism. Each day, every member of this species either splits into two copies of itself or dies. If the probability of multiplication is p, what are the chances that this species goes extinct?

As is easily seen from the probability generating function, the species goes extinct with probability one if p≤½. Actually, I always found it intriguing [intuitively] that the value ½ is included in the extinction range!
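To spell out the argument as a short sketch, write q for the extinction probability (my notation, not the Riddler's). Conditioning on the fate of the initial microorganism, which either dies or leaves two independent copies, q must satisfy

$q = (1-p) + p\,q^2$

that is,

$(q-1)\,\{p\,q-(1-p)\}=0$

with roots q=1 and q=(1-p)/p. The extinction probability is the smallest root in [0,1],

$q = \min\{1,(1-p)/p\}$

which is equal to one as soon as p≤½. At p=½ the two roots coincide, which corresponds to the critical case where each individual leaves on average exactly one descendant.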

## a Galton-Watson riddle

Posted in R, Travel on December 30, 2016 by xi'an

The Riddler of this week has an extinction riddle which summarises as follows:

One observes a population of N individuals, each with a probability of 10⁻⁴ of killing the observer each day. From one day to the next, the population decreases by one individual with probability

K√N 10⁻⁴

What is the value of K that leaves the observer alive with probability ½?

Given the sequence of population sizes N,N¹,N²,…, the probability to remain alive is

$(1-10^{-4})^{N+N^1+\ldots}$

where the sum stops with the (sure) extinction of the population. The expectation of this quantity is the probability generating function of the sum, taken at x=1-10⁻⁴. Hence the problem relates to a Galton-Watson extinction problem. However, given the nature of the extinction process, I do not see a way to determine the distribution of the sum, except by simulation. Which returns K=27 for the specific value of N=9.

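```r
N <- 9        # initial population size
K <- 3 * N    # starting guess for K
M <- 10^4     # number of simulated trajectories
vals <- rep(0, M)
targ <- 0
ite <- 1
# adjust K until the estimated survival probability is within .01 of 1/2
while (abs(targ - .5) > .01) {
  for (t in 1:M) {
    gen <- vals[t] <- N
    while (gen > 0) {
      # the population loses one individual with probability K sqrt(gen) 10^-4
      gen <- gen - (runif(1) < K * sqrt(gen) / 10^4)
      vals[t] <- vals[t] + gen   # accumulate N + N^1 + N^2 + ...
    }
  }
  # Monte Carlo estimate of E[(1 - 10^-4)^(N + N^1 + ...)]
  targ <- mean(exp(vals * log(.9999)))
  print(c(as.integer(ite), K, targ))
  # survival increases with K: raise K when targ < 1/2, lower it otherwise
  if (targ < .5) {
    K <- K * (1 + ite) / ite
  } else {
    K <- K * ite / (1 + ite)
  }
  ite <- ite + 1
}
```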


The solution proposed on The Riddler is more elegant in that the fixed point equation is

$\prod_{J=1}^9 \frac{K \cdot \sqrt{J}}{K \cdot \sqrt{J} + J}=\frac{1}{2}$

with a solution around K=27.
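One possible reading of this equation (my own sketch, not taken from the Riddler's write-up): when J individuals remain, each day either the population loses one member, with probability K√J 10⁻⁴, or the observer is killed, with probability about J 10⁻⁴, so the population loss comes first with probability close to K√J/(K√J+J). Under that reading, the root can be found numerically with `uniroot`; the function name `surv` and the bracketing interval below are arbitrary choices of mine.

```r
# approximate survival probability of the observer as a function of K, for N = 9
surv <- function(K, N = 9) {
  J <- 1:N
  prod(K * sqrt(J) / (K * sqrt(J) + J))
}

# solve surv(K) = 1/2 over an (arbitrary) bracketing interval
sol <- uniroot(function(K) surv(K) - 0.5, interval = c(1, 100), tol = 1e-8)
sol$root
```

The root should land close to the K=27 found by simulation, at a negligible computing cost.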

## Tractable Fully Bayesian inference via convex optimization and optimal transport theory

Posted in Books, Statistics, University life on October 6, 2015 by xi'an

“Recently, El Moselhy et al. proposed a method to construct a map that pushed forward the prior measure to the posterior measure, casting Bayesian inference as an optimal transport problem. Namely, the constructed map transforms a random variable distributed according to the prior into another random variable distributed according to the posterior. This approach is conceptually different from previous methods, including sampling and approximation methods.”

Yesterday, Kim et al. arXived a paper with the above title, linking transport theory with Bayesian inference. Rather strangely, they motivate the transport theory with Galton’s quincunx, when the apparatus is a discrete version of the inverse cdf transform… Of course, in higher dimensions, there is no longer a straightforward transform and the paper shows (or recalls) that there exists a unique solution with positive Jacobian for log-concave posteriors, as for instance with log-concave priors and likelihoods. This solution remains however a virtual notion in practice and an approximation is constructed via a (finite) functional polynomial basis, by minimising an empirical version of the Kullback-Leibler divergence.
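In dimension one, the transport map from prior to posterior is simply the prior cdf composed with the inverse posterior cdf. A minimal R sketch, on a conjugate Normal toy example of my own choosing (not taken from the paper), where both cdfs are available in closed form:

```r
set.seed(1)
# toy setting (assumed): prior N(0,1), one observation x ~ N(theta,1),
# hence a N(x/2, 1/2) posterior
x <- 1.3
prior_draws <- rnorm(10^4)

# monotone transport map T = F_post^{-1} o F_prior
transport <- function(theta) qnorm(pnorm(theta), mean = x / 2, sd = sqrt(1 / 2))

post_draws <- transport(prior_draws)
# to be compared with the exact posterior mean x/2 and variance 1/2
c(mean = mean(post_draws), var = var(post_draws))
```

The whole difficulty addressed by the paper is that no such closed-form composition exists in higher dimensions or for non-standard posteriors.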

I am somewhat uncertain as to how and why one would apply such a transform to simulations from the prior (which thus has to be proper). Producing simulations from the posterior certainly is a traditional way to approximate Bayesian inference and this is thus one approach to this simulation. However, the discussion of the advantage of this approach over, say, MCMC, is quite limited. There is no comparison with alternative simulation or non-simulation methods, nor any discussion of the computing time for the transport function derivation or of the impact of the dimension of the parameter space on this computing time. In connection with recent discussions on probabilistic numerics and super-optimal convergence rates, and given that it relies on simulations, I doubt optimal transport can do better than O(1/√n) rates. One side remark about deriving posterior credible regions from (HPD) prior credible regions: there is no reason the resulting region is optimal in volume (HPD), given that the transform is non-linear.

## the quincunx [book review]

Posted in Books, Kids, Statistics on July 1, 2013 by xi'an

“How then may we become free? Only by harmonising ourselves with the randomness of life through the untrammelled operation of the market.”

This is a 1989 book that I read about that time and had not re-read till last month… The Quincunx is a parody of several of Charles Dickens’ novels, written by another Charles, Charles Palliser, far into the 20th Century. The name is obviously what attracted me first to this book, since it reminded me of Francis Galton’s amazing mechanical simulation device. Of course, there is nothing in the book that relates to Galton and his quincunx!

“Your employer has been speculating in bills with the company’s capital and, as you’ll conclude in the present panic, he has lost heavily. There’s no choice now but to declare the company bankrupt. And when that happens, the creditors will put you in Marshalsea.”

As I am a big fan of Dickens, I went through The Quincunx as an exercise in Dickensiana, trying to spot characters and settings from the many books written by Dickens. I found connections with Great Expectations (for the John-Henrietta couple and the fantastic features in the thieves’ den, but also encounters with poverty and crime), Bleak House (for the judicial intricacies), Little Dorrit (for the jail system and the expectation of inheritance), Our Mutual Friend (for the roles of the Thames, of money, of forced weddings), Martin Chuzzlewit (again for complex inheritance stories), Oliver Twist (for the gangs of thieves, usury, the private “schools” and London underworld), David Copperfield (for the somewhat idiotic mother and the fall into poverty), The Mystery of Edwin Drood (for the murder, of course!). And I certainly missed others. (Some literary critics wrote that Palliser managed to write all Dickens at once.)

“I added to the mixture a badly bent George II guinea which was the finest of all the charms.”

However, despite the perfect imitation in style, with its array of grotesque characters and unbelievable accidents, using Dickens’ irony and tongue-in-cheek circumlocutions, with maybe an excess of deliberate misspellings, Palliser delivers a much bleaker picture of Dickens’ era than Dickens himself. This was the worst of times, if any, where some multifaceted unbridled capitalism makes use of the working class through low wages, savage usury, and overpriced (!) slums, forcing women into prostitution, men into cemetery desecration and sewage exploration. There is no redemption at any point in Palliser’s world and the reader is left with the impression that the central character John Huffam (it would be hard to call him the hero of The Quincunx) is about to fall into the same spiral of debt and legal swindles as his whole family tree. A masterpiece. (Even though I do not buy the postmodern thread.)

## genetics

Posted in Books, Kids, Travel, University life on April 9, 2012 by xi'an

Today, I was reading in the science leaflet of Le Monde about a new magnitude in sequencing cancerous tumors (wrong link, I know…). This made me wonder whether the sequence of (hundreds of) mutations leading from a normal cell to a cancerous one could be reconstructed in the way a genealogy is. (This reminds me of another exciting genetic article I read in the Eurostar back from London on Thursday, in the Economist, about the colonization of Madagascar by 30 women from the Malay archipelago: “The island was one of the last places on Earth to be settled, receiving its earliest migrants in the middle of the first millennium AD…”)

As a double coincidence, I was reading La Recherche yesterday in the métro to Dauphine, whose central theme this month is heredity beyond genetics. (Double because this also connected with the meeting in London.) The keyword is epigenetics, namely the activation or inactivation of a gene and the hereditary transmission of this character without a genetic mutation. This is quite interesting as it implies the heritability of some acquired traits, i.e. forces one to reconsider the nature versus nurture debate. (This phrase is another contribution due to Galton!) It also implies that a much faster rate of species differentiation due to environmental changes (than the purely genetic one) is possible, which may sound promising in the light of the fast climate changes we are currently facing. However, what I do not understand is why the journal included a paper on the consequences of epigenetics on the Darwinian theory of evolution and… intelligent design. Indeed, I do not see why the inclusion of different vectors in the hereditary process would contradict Darwin’s notion of natural selection. Or even why considering a scientific modification or replacement of the current Darwinian theory of evolution would be an issue. Charles Darwin wrote his book in 1859, prior to the start of genetics, and the immense advances made since then led to modifications and adjustments from his original views. Without involving any irrational belief in the process.