Kristian Lum and William Isaac published a paper in Significance last week [with the above title] about predictive policing systems used in the USA, and presumably in other countries, to predict future crimes [and therefore prevent them]. This sounds like a good idea for a science fiction plot, à la Philip K Dick [in his short story, The Minority Report], but the fact that it is used in real life definitely sounds frightening, especially when the civil rights of the targeted individuals are impacted. (Although some politicians in different democratic countries show increasing contempt for keeping everyone’s rights equal…) I also feel terrified by the social determinism behind the very concept of predicting crime from socio-economic data (and possibly genetic characteristics in the near future, bringing us back to the dark days of physiognomy!).
“…crimes that occur in locations frequented by police are more likely to appear in the database simply because that is where the police are patrolling.”
In this paper, Kristian and William examine one statistical aspect of police forces relying on crime prediction software, namely the bias in the data exploited by the software and in the resulting policing. (While the accountability for police actions induced by such software is not explored, this is obviously related to the Nature editorial of last week, “Algorithm and blues”, which [in short] calls for watchdogs on AIs and decision algorithms.) When the data is gathered from police and justice records, any bias in checks, arrests, and convictions will be reproduced in the data and hence will repeat the bias in targeting potential criminals. As aptly put by the authors, the resulting machine learning algorithm will be “predicting future policing, not future crime.” Worse, by having no reservation about over-fitting [the more predicted crimes the better], it will increase the bias in the same direction. In the Oakland drug-user example analysed in the article, the police concentrate almost exclusively on a few grid squares of the city, resulting in the above self-predicting fallacy. However, I do not see much hope in using other surveys and datasets towards eliminating this bias, as they also carry their own shortcomings. Even without biases, predicting crimes at the individual level just seems a bad idea, for statistical and ethical reasons.
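To make the self-predicting fallacy concrete, here is a toy simulation of my own (not the authors' analysis, and not how any actual predictive policing software works). It assumes a hypothetical city of four grid cells with identical true crime rates, a single daily patrol sent to the cell with the most recorded crimes, and crimes being recorded only where the patrol happens to be. The function name, rates, and initial counts are all made up for illustration.

```python
import random


def simulate_feedback(true_rates, initial_records, days=200, seed=0):
    """Toy feedback loop: patrol where the records point, record only there.

    All quantities are hypothetical; this is an illustration, not a model
    of any real policing system.
    """
    rng = random.Random(seed)
    records = list(initial_records)
    for _ in range(days):
        # "Prediction": send the patrol to the cell with the most records.
        target = max(range(len(records)), key=lambda i: records[i])
        # A crime enters the database only if police are there to see it.
        if rng.random() < true_rates[target]:
            records[target] += 1
    return records


# Four cells with the same underlying crime rate (an assumption).
true_rates = [0.3, 0.3, 0.3, 0.3]
# Cell 0 starts with one extra record, e.g. from past patrolling habits.
initial_records = [2, 1, 1, 1]

records = simulate_feedback(true_rates, initial_records)
print(records)
```

Since cell 0 starts ahead and is the only one ever patrolled, its count is the only one that can grow, while the other three cells stay at their initial value no matter what their true rate is. The database ends up describing where the police went rather than where crime occurred.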