Archive for the Statistics Category

To predict and serve?

Posted in Books, pictures, Statistics with tags , , , , , , , , , , , on October 25, 2016 by xi'an

Kristian Lum and William Isaac published a paper in Significance last week [with the above title] about predictive policing systems used in the USA and presumably in other countries to predict future crimes [and therefore prevent them]. This sounds like a good idea for a science fiction plot, à la Philip K Dick [in his short story, The Minority Report], but that it is used in real life definitely sounds frightening, especially when the civil rights of the targeted individuals are impacted. (Although some politicians in different democratic countries increasingly show increasing contempt for keeping everyone’ rights equal…) I also feel terrified by the social determinism behind the very concept of predicting crime from socio-economic data (and possibly genetic characteristics in a near future, bringing us back to the dark days of physiognomy!)

“…crimes that occur in locations frequented by police are more likely to appear in the database simply because that is where the police are patrolling.”

Kristian and William examine in this paper one statistical aspect of the police forces relying on crime prediction software, namely the bias in the data exploited by the software and in the resulting policing. (While the accountability of the police actions when induced by such software is not explored, this is obviously related to the Nature editorial of last week, “Algorithm and blues“, which [in short] calls for watchdogs on AIs and decision algorithms.) When the data is gathered from police and justice records, any bias in checks, arrests, and condemnations will be reproduced in the data and hence will repeat the bias in targeting potential criminals. As aptly put by the authors, the resulting machine learning algorithm will be “predicting future policing, not future crime.” Worse, by having no reservation about over-fitting [the more predicted crimes the better], it will increase the bias in the same direction. In the Oakland drug-user example analysed in the article, the police concentrates almost uniquely on a few grid squares of the city, resulting into the above self-predicting fallacy. However, I do not see much hope in using other surveys and datasets towards eliminating this bias, as they also carry their own shortcomings. Even without biases, predicting crimes at the individual level just seems a bad idea, for statistical and ethical reasons.


Posted in Books, pictures, R, Statistics with tags , , , , on October 24, 2016 by xi'an

One approach to random number generation that had always intrigued me is Kinderman and Monahan’s (1977) ratio-of-uniform method. The method is based on the result that the uniform distribution on the set A of (u,v)’s in R⁺xX such that


induces the distribution with density proportional to ƒ on V/U. Hence the name. The proof is straightforward and the result can be seen as a consequence of the fundamental lemma of simulation, namely that simulating from the uniform distribution on the set B of (w,x)’s in R⁺xX such that


induces the marginal distribution with density proportional to ƒ on X. There is no mathematical issue with this result, but I have difficulties with picturing the construction of efficient random number generators based on this principle.

ratouniI thus took the opportunity of the second season of [the Warwick reading group on] Non-uniform random variate generation to look anew at this approach. (Note that the book is freely available on Luc Devroye’s website.) The first thing I considered is the shape of the set A. Which has nothing intuitive about it! Luc then mentions (p.195) that the boundary of A is given by


which then leads to bounding both ƒ and x→x²ƒ(x) to create a box around A and an accept-reject strategy, but I have trouble with this result without making further assumptions about ƒ… Using a two component normal mixture as a benchmark, I found bounds on u(.) and v(.) and simulated a large number of points within the box to end up with the above graph that indeed the accepted (u,v)’s were within this boundary. And the same holds with a more ambitious mixture:


positions in French universities [deadline]

Posted in Kids, Statistics, University life with tags , , , , , , , on October 21, 2016 by xi'an

Université Paris-DauphineFor ‘Og’s readers interested in lecturer or professor positions in French universities next academic year, including a lecturer position at Paris-Dauphine in applied and computational statistics!, you need to apply for a qualification label by a national committee which strict deadline is next Tuesday, October 25, at 4pm (Paris/CET time).  (The whole procedure is exclusively in French!)

Suffrage Science awards in maths and computing

Posted in pictures, Statistics, University life with tags , , , , , , , , , , , on October 21, 2016 by xi'an

On October 11, at Bletchley Park, the Suffrage Science awards in mathematics and computer sciences were awarded for the first time to 12 senior female researchers. Among whom three statisticians, Professor Christl Donnelly from Imperial College London, my colleague at Warwick, Jane Hutton, and my friend and co-author, Sylvia Richardson, from MRC, Cambridge University. This initiative was started by the Medical Research Council in 2011 by Suffrage Science awards for life sciences, followed in 2013 by one for engineering and physics, and this year for maths and computing. The name of the award aims to connect with the Suffragette movement of the late 19th and early 20th Centuries, which were particularly active in Britain. One peculiar aspect of this award is that the recipients are given pieces of jewellery, created for each field, pieces that they will themselves give two years later to a new recipient of their choice, and so on in an infinite regress! (Which suggests a related puzzle, namely to figure out how many years it should take until all female scientists have received the award. But since the number increases as the square of the number of years, this is not going to happen unless the field proves particularly hostile to women scientists!) This jewellery award also relates to the history of the Suffragette movement since the WPSU commissioned their own jewellery awards. A clever additional touch was that the awards were delivered on Ada Lovelace Day, October 11.

David Cox gets the first International Prize in Statistics

Posted in pictures, Statistics, University life with tags , , , , , , , , on October 20, 2016 by xi'an

Just received an email from the IMS that Sir David Cox (Nuffield College, Oxford) has been awarded the International Prize in Statistics. As discussed earlier on the ‘Og, this prize is intended to act as the equivalent of a Nobel prize for statistics. While I still have reservations about the concept. I have none whatsoever about the nomination as David would have been my suggestion from the start. Congratulations to him for the Prize and more significantly for his massive contributions to statistics, with foundational, methodological and societal impacts! [As Peter Diggle, President of the Royal Statistical Society just pointed out, it is quite fitting that it happens on European Statistics day!]

a grim knight [cont’d]

Posted in Books, Kids, pictures, R, Statistics with tags , , , , , , , on October 20, 2016 by xi'an

As discussed in the previous entry, there are two interpretations to this question from The Riddler:

“…how long is the longest path a knight can travel on a standard 8-by-8 board without letting the path intersect itself?”

riddlerechckas to what constitutes a path. As a (terrible) chess player, I would opt for the version on the previous post, the knight moving two steps in one direction and one in the other (or vice-versa), thus occupying three squares on the board. But one can consider instead the graph of the moves of that knight, as in the above picture and as in The Riddler. And since I could not let the problem go I also wrote an R code (as clumsy as the previous one!) to explore at random (with some minimal degree of annealing) the maximum length of a self-avoiding knight canter. riddlerechkThe first maximal length I found this way is 32, although I also came by hand to a spiralling solution with a length of 33.

riddlerechckRunning the R code longer over the weekend however led to a path of length 34, while the exact solution to the riddle is 35, as provided by the Riddler (and earlier in many forms, including Martin Gardner’s and Donald Knuth’s).

[An unexpected side-effect of this riddle was ending up watching half of Bergman’s Seventh Seal in Swedish…]

an attempt at EP-ABC from scratch, nothing more… [except for a few bugs]

Posted in R, Statistics, University life with tags , , , , , , on October 19, 2016 by xi'an

Following a request from one of the reviewers of our chapter Likelihood-free model choice, I tried to run EP-ABC on a toy problem and to compare it with the outcome of a random forest ABC. Literally starting from scratch, namely from the description found in Simon and Nicolas’ JASA paper.  To run my test, I chose as my modelled data an exponential Exp(λ) sample of size 20, with a Gaussian N(0,1) prior on the log parameter (here replicated 100 times):

#prior values

Then I applied the formulas found in the paper for approximating the evidence, first defining the log normalising constant for an unnormalised Gaussian density as on the top of p.318


[which exhibited a typo in the paper, as Simon Barthelmé figured out after emails exchanges that the right formulation was the inverse]


and iterating the site updates as described in Sections 2.2, 3.1 and Algorithms 1 and 3:

 #global and ith natural parameters for Gaussian approximations
 #constants Ci in (2.6)
 for (t in 1:Nep){
  for (i in sample(1:n)){
  #site i update
   Qm=Q-Qi[i] #m for minus i
   #ABC sample
#update by formula (3.3)
  if (Mh>1e3){
#enough proposals accepted
#update of Ci as in formula (2.7)
#with correction bottom of p.319
  normz[i]=log(Mh/M/2/eps)- Psi(er,Q)+Psi(erm,Qm)
#approximation of the log evidence as on p.318
return(sum(normz)+Psi(er,Q)-Psi(er0,Q0)) }

except that I made an R beginner’s mistake (!) when calling the normal simulation to be


to be compared with the genuine evidence under a conjugate Exp(1) prior [values of the evidence should differ but should not differ that much]


After correcting for those bugs and too small sample sizes, thanks to Simon kind-hearted help!, running this code results in minor discrepancies between both quantities:

M=1e4 #number of ABC samples
propep=.1 #tolerance factor
for (t in 1:100){
 #tolerance scaled by data

as shown by the comparison below between the evidence and the EP-ABC approximations to the evidence, called epabence (the dashed line is the diagonal):

epabcObviously, this short (if lengthy at my scale) experiment is not intended to conclude that EP-ABC work or does not work. (Simon also provided additional comparison with the genuine evidence, that is under the right prior, and with the zero Monte-Carlo version of the EP-ABC evidence, achieving a high concentration over the diagonal.) I just conclude that the method does require a certain amount of calibration to become stable. Calibrations steps that are clearly spelled out in the paper (adaptive choice of M, stabilisation by quasi-random samples, stabilisation by stochastic optimisation steps and importance sampling), but also imply that EP-ABC is also “far from routine use because it make take days to tune on real problems” (pp.315-316). Thinking about the reasons for such discrepancy over the past days, one possible explanation I see for the impact of the calibration parameters is that the validation of EP if any occurs at a numerical rather than Monte Carlo level, hence that the Monte Carlo error must be really small to avoid interfering with the numerical aspects.