## rehabilitation of the communards

Posted in Books, Kids, pictures on December 4, 2016 by xi'an

Today the French National Assembly voted for the rehabilitation of all victims of the repression of the Commune de Paris (March 18 to May 28, 1871), which saw between 10,000 and 20,000 Parisian men and women summarily executed by troops of the provisional government of Adolphe Thiers during the Bloody Week.

## threats on abortion rights

Posted in Kids, pictures on December 3, 2016 by xi'an

After a surrealistic debate last week between the two primary candidates about their comparative proximity to the Catholic pope, and the reminder by the winner, François Fillon, that he did not consider abortion to be a fundamental right, the French Catholic church hierarchy has been trying to block a new law that criminalises anti-abortion websites that masquerade as official websites while providing misleading information… With the usual newspeak deviance of arguing for a freedom of speech that is not under threat.

## MAP as Bayes estimators

Posted in Books, Kids, Statistics on November 30, 2016 by xi'an

Robert Bassett and Julio Deride just arXived a paper discussing the position of MAPs within Bayesian decision theory. A point I have discussed extensively on the ‘Og!

“…we provide a counterexample to the commonly accepted notion of MAP estimators as a limit of Bayes estimators having 0-1 loss.”
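
For reference, the notion in question is usually stated along the following lines (standard textbook notation, not taken from the paper): under the family of 0-1 losses

$L_\epsilon(\theta,\delta)=\mathbb{I}_{\|\theta-\delta\|>\epsilon}$

the Bayes estimator maximises the posterior mass of a ball of radius ε,

$\delta_\epsilon(x)=\arg\max_\delta \int_{\|\theta-\delta\|\le\epsilon} \pi(\theta|x)\,\text{d}\theta$

and the commonly accepted claim is that $\delta_\epsilon(x)$ converges to $\arg\max_\theta \pi(\theta|x)$ as ε goes to zero.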

The authors mention The Bayesian Choice as stating this property without further precautions, and I fully admit to having been careless in this regard! The difficulty is that the limit of the maximisers is not necessarily the maximiser of the limit. The paper includes an example to this effect, with a prior as above, associated with a sampling distribution that does not depend on the parameter. The sufficient conditions proposed therein are that the posterior density is almost surely proper or quasiconcave.

This is a neat mathematical characterisation that cleans up this “folk theorem” about MAP estimators. And for which the authors are to be congratulated! However, I am not very excited by the limiting property, whether it holds or not, as I have difficulties conceiving the use of a sequence of losses in a mildly realistic case. I rather prefer the alternate characterisation of MAP estimators by Burger and Lucka as proper Bayes estimators under another type of loss function, albeit a rather artificial one.

## simulation by hand

Posted in Books, Kids, pictures, Statistics, Travel on November 28, 2016 by xi'an

A rather weird question on X validated this week was about devising a manual way to simulate (a few) normal variates. By manual I presume the author of the question means without resorting to a computer or any other business machine. Now, I do not know of any real phenomenon that is exactly and provably Normal. As analysed in a great philosophy of science paper by Aidan Lyon, the standard explanations for a real phenomenon to be Normal are almost invariably false, even those invoking the Central Limit Theorem. Hence I cannot think of a mechanical device that would directly return generations from a Normal distribution with known parameters. However, since it is possible to simulate by hand Uniform U(0,1) variates [up to a given precision] using a chronometer or a wheel, calls to versions of the Box-Muller algorithm that do not rely on logarithmic or trigonometric functions are feasible, for instance by generating two Exponential variates, x and y, until 2y>(1-x)², x being the output [up to a random sign]. And generating Exponential variates is easy provided a radioactive material with known half-life is available, along with a Geiger counter. Or, if not, by calling von Neumann’s exponential generator. As detailed in Devroye’s simulation book.
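
The rejection step described above can be sketched in R as follows. This is only an illustration under stated assumptions: `rexp()` stands in for the Geiger counter (or for von Neumann's generator), the function name `normal_by_exp` is made up for the occasion, and the coin toss for the sign is an addition of the sketch, since the accepted x is non-negative and distributed as the absolute value of a standard Normal variate.

```r
# Hypothetical helper: one N(0,1) variate built from Exponential variates
# rexp() stands in for a physical source of Exp(1) variates (an assumption)
normal_by_exp <- function() {
  repeat {
    x <- rexp(1)
    y <- rexp(1)
    # accept x when 2y > (1-x)^2, which happens with probability exp(-(1-x)^2/2)
    if (2 * y > (1 - x)^2) break
  }
  # the accepted x is half-Normal; a fair coin toss supplies the sign
  if (runif(1) < 0.5) -x else x
}

z <- replicate(10^4, normal_by_exp())
c(mean = mean(z), sd = sd(z))  # should be close to 0 and 1
```

About three out of four pairs (x,y) are accepted, which keeps the manual effort reasonable.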

After proposing this solution, I received a comment from the author of the question pointing towards a simpler solution based, e.g., on the Central Limit Theorem. Presumably for simple iid random variables such as coin tosses or dice experiments. While I used the CLT for simulating Normal variables in my very early days [just after programming on punched cards!], I do not think this is a very good or efficient method, as the tails converge very slowly to normality. By comparison, using the same number of coin tosses to create a sufficient number of binary digits of a Uniform variate produces a computer-precision exact Uniform variate, which can be exploited in Box-Muller-like algorithms to return exact Normal variates… Even by hand if necessary. [For some reason, this question attracted a lot of traffic and an encyclopaedic answer on X validated, despite being borderline to the point of being proposed for closure.]
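
The comparison can be sketched in R as below, with `sample(0:1, ...)` playing the role of the coin. All function names are hypothetical, the choice of 53 tosses (the number of binary digits in a double precision mantissa) is an assumption of the sketch, and the Normal variate is produced by the textbook Box-Muller transform, logarithm and cosine included, rather than by one of the log-free versions.

```r
# Hypothetical helpers, for illustration only
# Uniform variate from fair coin tosses used as binary digits
coin_uniform <- function(nbits = 53) {
  bits <- sample(0:1, nbits, replace = TRUE)
  sum(bits / 2^(1:nbits))
}
# CLT-based pseudo-Normal variate from the same number of tosses
coin_clt <- function(ntoss = 53) {
  heads <- sum(sample(0:1, ntoss, replace = TRUE))
  (heads - ntoss / 2) / sqrt(ntoss / 4)
}
# textbook Box-Muller transform fed with two coin-based Uniforms
coin_normal <- function(nbits = 53) {
  u1 <- coin_uniform(nbits)
  u2 <- coin_uniform(nbits)
  sqrt(-2 * log(1 - u1)) * cos(2 * pi * u2)  # 1 - u1 lies in (0,1]
}

zclt <- replicate(10^4, coin_clt())
zbm <- replicate(10^4, coin_normal())
length(unique(zclt))  # at most 54 distinct values
max(abs(zclt))        # can never exceed sqrt(53)
```

The CLT version is discrete with bounded support, hence has no tails to speak of, while the Box-Muller output is only limited by the precision of the Uniform variates.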

## orange day

Posted in Kids, pictures on November 25, 2016 by xi'an

A day when I can wear orange and take time justifying it. A day when, among many other reminders of the urgency, a French woman who killed her abusive husband had to stay in jail because the court considers she does not exhibit enough culpability (!), when two Moroccan teenage girls are on trial for kissing, when India’s minister for women states that her country’s issue with rape is “exaggerated”, when Turkey’s government just barely withdrew a bill that would have granted amnesty to some men convicted of child sex assault if they married their victims…

## sampling by exhaustion

Posted in Books, Kids, R, Statistics on November 25, 2016 by xi'an

The riddle set by The Riddler of last week sums up as follows:

Within a population of size N, each individual independently selects another individual. All individuals selected at least once are removed and the process iterates until one or zero individuals are left. What is the probability that zero individuals are left?

While I cannot see a clean analytical solution to this problem, it reminds me of an envelope-versus-letter (matches) problem I saw in graduate school. Indeed, the expected number of removed (or selected) individuals is given by

$N\left\{1-\left(\frac{N-2}{N-1}\right)^{N-1}\right\}$

which is equivalent to (1-e⁻¹)N for N large, meaning that the population decreases by an average factor of e⁻¹ at each round. And that it takes on average approximately log(N) iterations to reach a single individual. A simulation of the first probabilities of ending with one individual led me to the above curve, which wiggles in an almost periodic way around the probability ½, equal to the average of those probabilities. Using the R code

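    rad=function(N){#next population size
      ut=sample(rep(2:N,2),1) #individual 1 picks among 2:N
      for (i in 2:N)#sampling
        ut=c(ut,sample(rep((1:N)[-i],2),1)) #rep() avoids sample() on a single number
      return(N-length(unique(ut)))}
    #M (largest population size) and T (number of replications) must be set beforehand
    sal=rep(0,M);sal[1]=1 #sal[n]: probability of ending with one individual
    for (N in 3:M){
      prop=0
      for (t in 1:T){#one single step
        i=rad(N) #population size after one round
        if (i>0) prop=prop+sal[i]}
      sal[N]=prop/T}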


which exploits the previously computed probabilities. The variability is most interesting, if unexpected, but looking back at Feller’s sections and exercises on the classical occupancy problem, I could not find a connection with this problem. If it exists. Still, if N is large enough, the exclusion of one index from the selection becomes negligible and the probability of moving from n to m individuals should be approximately [Feller, eqn (2.4), p.102]

$p_n(m)={n\choose m}\sum_{v=0}^{n-m} (-1)^v {n-m\choose v} \left(1-\frac{m+v}{n}\right)^n$

This formula approximates quite well the exact probability, but as in a previous post about the birthday problem, it proves quite delicate to compute. As already noticed by Feller.
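
As a rough check of the approximation, the following R sketch evaluates the occupancy formula for a small population and compares it with simulated frequencies of one round of the riddle. The function names `feller_p` and `survivors`, the population size n=20 and the 10⁴ replications are all choices made for the illustration, not part of the original experiment.

```r
# Hypothetical helper names; n is the current population size
# occupancy probability of m empty cells out of n (with n balls),
# read here as the approximate probability of moving from n to m
feller_p <- function(n, m) {
  v <- 0:(n - m)
  choose(n, m) * sum((-1)^v * choose(n - m, v) * (1 - (m + v) / n)^n)
}
# one round of the riddle, each individual picking one of the n-1 others
survivors <- function(n) {
  picks <- vapply(seq_len(n),
                  function(i) seq_len(n)[-i][sample.int(n - 1, 1)],
                  integer(1))
  n - length(unique(picks))
}
n <- 20
approx <- sapply(0:n, function(m) feller_p(n, m))
freq <- tabulate(replicate(10^4, survivors(n)) + 1, nbins = n + 1) / 10^4
round(rbind(approx, freq)[, 5:11], 3)  # columns for m = 4, ..., 10
```

For larger n the alternating sum mixes huge terms of opposite signs, so double precision arithmetic quickly loses accuracy, which is presumably one face of the delicacy mentioned above.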

## Monty Python generator

Posted in Books, Kids, pictures, R, Statistics, University life on November 23, 2016 by xi'an

By some piece of luck I came across a paper by the late George Marsaglia, brilliant contributor to the field of simulation, and Wai Wan Tang, entitled The Monty Python method for generating random variables. As shown by the illustration below, the concept is to flip the piece H outside the rectangle back inside the rectangle, exploiting the remaining area above the density. The fantastic part is actually that “since areas of the rectangle add to 1, the slim in-between area is exactly the tail area”! So the tiny bit between G and the flipped H is the remaining tail.

In the case of a Gamma Ga(a,1) variate, the authors express this variate as the transform of another variate with a nearly symmetric density, on which the Monty Python method applies. The transform is

$q(x)=(a-1/3)(1 + x/\sqrt{16a})^3$

with x>-√(16a). The second nice trick is that the density of x is provided for free by the Gamma Ga(a,1) density and the transform, thanks to the change of variable formula. One lingering question is obviously how to handle the tail part. This is handled separately in the paper, with a rather involved algorithm, but since the area of the tail is tiny, a mere 1.2% in the case of the Gaussian density, this instance occurs rarely. Very clever if highly specialised! (The case of a<1 has to be processed by the indirect route of multiplying a Ga(a+1,1) variate by a uniform variate to the power 1/a.)
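
Both side remarks can be checked numerically in R. The sketch below takes the transform exactly as quoted above, derives the density of x by the change of variable formula, and verifies that it integrates to one, before testing the a<1 device. Function names, the shape values a=2 and a=0.5, and the upper integration bound of 30 (beyond which the mass is negligible for a=2) are assumptions of the illustration, and the Monty Python step itself is not implemented.

```r
# Hypothetical names throughout; a is the Gamma shape parameter
q_transform <- function(x, a) (a - 1/3) * (1 + x / sqrt(16 * a))^3
q_deriv <- function(x, a) 3 * (a - 1/3) * (1 + x / sqrt(16 * a))^2 / sqrt(16 * a)
# density of x: Gamma density evaluated at q(x), times the Jacobian q'(x)
x_density <- function(x, a) dgamma(q_transform(x, a), shape = a) * q_deriv(x, a)
a <- 2
integrate(x_density, lower = -sqrt(16 * a), upper = 30, a = a)$value  # close to 1

# shape below one: Ga(a,1) obtained as Ga(a+1,1) times U^(1/a)
rgamma_small <- function(n, a) rgamma(n, shape = a + 1) * runif(n)^(1 / a)
g <- rgamma_small(10^5, 0.5)
c(mean = mean(g), var = var(g))  # both should be close to a = 0.5
```

Here `rgamma()` stands in for whichever Ga(a+1,1) generator is available, the Monty Python one included.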

I also found out that there exists a Monte Python software, which is an unrelated Monte Carlo code in Python [hence the name] for cosmological inference. Including nested sampling, unsurprisingly.