Archive for data challenge

no dichotomy between efficiency and interpretability

Posted in Books, Statistics, Travel, University life on December 18, 2019 by xi'an

“…there are actually a lot of applications where people do not try to construct an interpretable model, because they might believe that for a complex data set, an interpretable model could not possibly be as accurate as a black box. Or perhaps they want to preserve the model as proprietary.”

One article I found quite interesting in the second issue of HDSR is “Why are we using black box models in AI when we don’t need to? A lesson from an explainable AI competition” by Cynthia Rudin and Joanna Radin, which describes the setting of a NeurIPS competition last year, the Explainable Machine Learning Challenge, of which I was blissfully unaware. The goal was to construct an operational black box predictor for credit scoring and turn it into something interpretable. The authors explain how they instead built a white box predictor (my terms!), namely a linear model, which could not be improved more than marginally by a black box algorithm. (It appears from the references that these authors have a record of analysing black-box models in various settings and demonstrating that they do not always bring more efficiency than interpretable versions.) Admittedly, this is but one example, and the authors did not win the challenge (I am unclear why, as I did not check the background story, writing on the plane to pre-NeurIPS 2019).

I find this column quite refreshing and worth disseminating, as it challenges the current creed that intractable functions with hundreds of parameters will always do better, if only because they are calibrated within the box and eventually have difficulty fighting over-fitting within (and hence under-fitting outside). This is also a difficulty with common statistical models, but the ability to construct error evaluations that show how quickly the prediction efficiency deteriorates may prove the more structured and more sparsely parameterised models the winners (of real world competitions).

data challenge in Sardinia

Posted in Books, Kids, R, Statistics, Travel, University life on June 9, 2016 by xi'an

In what I hope is the first occurrence of a new part of ISBA conferences, a data challenge is being launched at ISBA 2016 next week, the prize being a trip to take part in the organisers' monthly hackathon. In Amsterdam. It would be terrific if our Bayesian conferences, including BayesComp, could gather enough data and sponsors to host a hackathon on site! (I was tempted to hold such a challenge for our estimating constants workshop last month, but Iain Murray pointed out to me the obvious difficulties of organising it from scratch…) Details will be available during the conference.

RSS statistical analytics challenge 2014

Posted in Kids, R, Statistics, University life, Wines on May 2, 2014 by xi'an

Great news! The RSS is setting a data analysis challenge this year, sponsored by the Young Statisticians Section and Research Section of the Royal Statistical Society. Details are available on the WordPress website of the Challenge. Registration is open and the Challenge goes live on Tuesday 6 May 2014 for an exciting six-week competition. (A wee bit of unfortunate timing for those of us considering submitting a paper to NIPS!) Truly terrific: I have been looking for this kind of event to happen for many years (without finding the momentum to set it rolling…) and hope it will generate a lot of exciting activity and replicas in other societies.