## box plot [xkcd]

Posted in Statistics on March 8, 2017 by xi'an

## principal components [xkcd repost]

Posted in Kids on January 13, 2017 by xi'an

## 2017

Posted in Books, Kids, pictures on January 1, 2017 by xi'an

I find this xkcd entry very much in tune with my own feelings and misgivings about 2017. I like the notion that 2016 is sending us into the future without things (and people) it would have been better to keep. Like reaching out instead of building barriers, whether about staying in the EU or uniting all Americans under one’s presidency, rather than tweeting scorn, exclusion, and righteousness. Like keeping hospitals standing and operating, instead of flattening them, in Syria, Iraq, Yemen and Afghanistan. Like preserving women’s access to contraception and abortion, instead of [old men] ruling over their bodies and rights. No, 2017 does not look too promising.

## going to war [a riddle]

Posted in Books, Kids, Statistics on December 16, 2016 by xi'an

On the Riddler this week, a seemingly obvious riddle:

A game consists of Alice and Bob, each with a \$1 bill, each receiving a U(0,1) strength, unknown to the other, and each deciding whether or not to bet on this strength being larger than the opponent’s. If no player bets, they both keep their \$1 bill. Else, the winner leaves with both bills. Find the optimal strategy.

As often when “optimality” is mentioned, the riddle is unclear because, when looking at the problem from a decision-theoretic perspective, the loss function of each player is not defined in the question. But the St. Petersburg paradox shows that the type of loss clearly matters and that the utility of money is anything but linear for large values, as explained by Daniel Bernoulli in 1738 (and later analysed by Laplace in his Essai Philosophique). Let us assume therefore that both players live in circumstances where losing or winning \$1 makes little difference, hence where the utility is linear. A loss function attached to the experiment for Alice [and a corresponding utility function for Bob] could then be a function of (a,b), the result of both Uniform draws, and of the decisions δ¹ and δ² of both players, namely

$L(a,b,\delta^1,\delta^2)=\begin{cases}0&\text{if }\delta^1=\delta^2=0\\\mathbb{I}(a<b)-\mathbb{I}(a>b)&\text{else}\\\end{cases}$

Considering this loss function, Alice aims at minimising the expected loss by her choice of δ¹, equal to zero or one, an expected loss that hence depends on the unknown and simultaneous decision of Bob. If for instance Alice assumes Bob takes the decision to compete when observing an outcome b larger than a certain bound α, her decision is based on the comparison of (when B is Uniform (0,1))

$\mathbb{P}(a<B,B>\alpha)-\mathbb{P}(a>B,B>\alpha)=2(1-a\vee\alpha)-(1-\alpha)$

(if δ¹=0) and of 1-2a (if δ¹=1). Comparing both expected losses leads to Alice competing (δ¹=1) when a>α/2.
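As a sanity check on this comparison, here is a minimal Python sketch, assuming the linear loss above (+1 when Alice loses her bill, -1 when she wins Bob's) and a Bob who bets exactly when b>α. The function names are mine and purely illustrative; the simulation is only there to confirm the two closed-form expected losses.

```python
import random

def loss_if_pass(a, alpha):
    """Alice's expected loss when she does not bet and Bob bets iff B > alpha."""
    return 2 * (1 - max(a, alpha)) - (1 - alpha)

def loss_if_bet(a):
    """Alice's expected loss when she bets: P(a<B) - P(a>B) = 1 - 2a."""
    return 1 - 2 * a

def alice_should_bet(a, alpha):
    """True when betting has a strictly smaller expected loss than passing."""
    return loss_if_bet(a) < loss_if_pass(a, alpha)

def simulate_loss(a, alpha, alice_bets, n=100_000, seed=0):
    """Monte Carlo estimate of Alice's expected loss for a fixed strength a."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        b = rng.random()
        # The game is played as soon as one of the two players bets.
        if alice_bets or b > alpha:
            total += 1 if a < b else -1
    return total / n

if __name__ == "__main__":
    a, alpha = 0.3, 0.5
    print(loss_if_pass(a, alpha), simulate_loss(a, alpha, alice_bets=False))
    print(loss_if_bet(a), simulate_loss(a, alpha, alice_bets=True))
    print(alice_should_bet(a, alpha))
```

With a=0.3 and α=0.5, betting is preferable since a exceeds α/2=0.25, while a=0.2 falls below that bound and passing wins out.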

However, there is no reason Alice should know the value of α when playing the (single) game and so she may think that Bob will follow the same reasoning, leading him to choose a new bound of α/4, and, by iterating the thought process, all the way down to α=0! So this modelling leads to always playing the game, with each player having a ½ probability of winning… Alternatively, Alice may set a prior on α, which leads to another bound on a for playing the game or not. Which in itself is not satisfactory either. (The published solution follows the above argument. Except for posting the maths expressions.)
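The iteration of the thought process can be sketched in a few lines of Python, assuming each round of reasoning simply replaces the current bound with half of it (the a>α/2 rule derived above). The function name and the starting bound are hypothetical choices for illustration.

```python
def iterate_thresholds(alpha0, steps):
    """Successive betting bounds when each player best-responds to the other.

    Assumes the best response to a bound alpha is to bet above alpha / 2.
    """
    thresholds = [alpha0]
    for _ in range(steps):
        thresholds.append(thresholds[-1] / 2)
    return thresholds

if __name__ == "__main__":
    # Starting from an arbitrary bound, the sequence shrinks geometrically to 0.
    for k, alpha in enumerate(iterate_thresholds(0.8, 10)):
        print(k, alpha)
```

Whatever the starting bound, the sequence decreases geometrically towards zero, which is the "always play" conclusion.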

## Britain re-labeled

Posted in Books, Kids, pictures on November 20, 2016 by xi'an

## Nature snapshot [Volume 539 Number 7627]

Posted in Books, Statistics, University life on November 15, 2016 by xi'an

A number of entries of interest [to me] in that Nature issue: from the Capuchin monkeys that break stones in a way that resembles early hominin biface tools, to the persistent association between some sounds and some meanings across numerous languages, to the use of infected mosquitoes in South America to fight Zika, to the call for more maths in psychiatry by the NIMH director, where, since prediction is mentioned, I presume stats is included, to the potentially earthshaking green power revolution in Africa, to the reconstruction of the first HIV strains in North America, along with the deconstruction of the “Patient 0” myth, helped by Bayesian phylogenetic analyses, to a coverage of the Open Syllabus Project, with Monte Carlo Statistical Methods arriving first [in the Monte Carlo list]…

“Observations should not converge on one model but aim to find anomalies that carry clues about the nature of dark matter, dark energy or initial conditions of the Universe. Further observations should be motivated by testing unconventional interpretations of those anomalies (such as exotic forms of dark matter or modified theories of gravity). Vast data sets may contain evidence for unusual behaviour that was unanticipated when the projects were conceived.” Avi Loeb

One editorial particularly drew my attention, Good data are not enough, by the astronomer Avi Loeb. As illustrated by the quote above, Loeb objects to data being interpreted and even to data being collected towards the assessment of the standard model. While I agree that this model contains a lot of fudge factors like dark matter and dark energy, which apparently constitute most of the available matter, the discussion is quite curious, in that interpreting data according to alternative theories sounds impossible and certainly beyond the reach of most PhD students [as Loeb criticises the analysis of some data in a recent thesis he evaluated].

“modern cosmology is augmented by unsubstantiated, mathematically sophisticated ideas — of the multiverse, anthropic reasoning and string theory.”

The author argues for always allowing alternative interpretations of the data, which sounds fine at a primary level but again calls for the conception of such alternative models. When discrepancies are found between the standard model and the data, they can be due to errors in the measurement itself, in the measurement model, or in the theoretical model. However, they may be impossible to analyse outside the model, in the neutral way called for and wished for by Loeb. Designing neutral experiments sounds even less meaningful. Which is why I am fairly taken aback by the call to “a research frontier [that] should maintain at least two ways of interpreting data so that new experiments will aim to select the correct one”! Why two and not more?! And which ones?! I am not aware of fully developed alternative theories and cannot see how experiments designed under one model could produce indications about a new and incomplete model.

“Such simple, off-the-shelf remedies could help us to avoid the scientific fate of the otherwise admirable Mayan civilization.”

Hence I am bemused by the whole exercise, whose deepest arguments seem to be a paper written by the author last year and an interdisciplinary centre on black holes also launched recently by the same author.

## copy code at your own peril

Posted in Books, Kids, R, Statistics, University life on November 14, 2016 by xi'an

I have come several times upon cases of scientists [I mean, real, recognised, publishing, senior scientists!] from other fields blindly copying MCMC code from a paper or website, and expecting the program to operate on their own problem… One illustration is from last week, when I read an X Validated question [from 2013] about an attempt of that kind, on a rather standard Normal posterior, but using R code where the posterior function was not even defined. (I foolishly replied, despite having no expectation of a reply from the person asking the question.)