machine learning and the future of realism

Giles and Cliff Hooker arXived a paper last week with this intriguing title. (Giles Hooker is an associate professor of statistics and biology at Cornell U, with an interesting blog on the notion of models, while Cliff Hooker is a professor of philosophy at Newcastle U, Australia.)

“Our conclusion is that simplicity is too complex”

The debate in this short paper is whether or not machine learning relates to a model, or whether it is concerned with sheer (“naked”) prediction — and, if the latter, does it still pertain to science?! While it sounds obvious at first, defining why science is more than the prediction of effects given causes is much less so, although prediction sounds more pragmatic and engineer-like than scientific. (Furthermore, prediction has a somewhat negative flavour in French, being used as a synonym for divination and opposed to prévision.) In more philosophical terms, prediction offers no ontological feature. As to whether a machine learning structure like a neural network is scientific or a-scientific, its black-box nature makes it much more the latter than the former, in that it brings no explanation for the connection between input and output, between regressand and regressors. It further lacks the potential for universality of scientific models. For instance, as mentioned in the paper, Newton’s law of gravitation applies to any pair of weighted bodies, while a neural network built on a series of observations could not be assessed or guaranteed outside the domain where those observations were taken. Plus, it would miss the simple inverse-square law established by Newton. Most fascinating questions, undoubtedly, putting the stress on models from a totally different perspective than last week’s at the RSS!
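As a toy illustration of that extrapolation point (my own hypothetical sketch, not taken from the paper), one can let any flexible black-box fit stand in for the neural network — here a degree-6 polynomial — train it on Newton’s inverse-square force over a narrow range of distances, and watch it collapse outside that range:

```python
import numpy as np

# Hypothetical sketch: a flexible "black box" (a degree-6 polynomial
# standing in for a neural network) fitted to Newton's inverse-square
# force F = G m1 m2 / r^2 on distances r in [1, 2], then extrapolated.
G_m1_m2 = 1.0  # constants folded together for simplicity

r_train = np.linspace(1.0, 2.0, 50)
f_train = G_m1_m2 / r_train**2

coefs = np.polyfit(r_train, f_train, deg=6)

# Inside the training domain the fit is essentially exact...
in_domain_error = abs(np.polyval(coefs, 1.5) - G_m1_m2 / 1.5**2)

# ...but far outside it, the fitted curve bears no relation to the law.
out_domain_error = abs(np.polyval(coefs, 10.0) - G_m1_m2 / 10.0**2)

print(in_domain_error, out_domain_error)
```

Within [1, 2] the approximation is excellent, yet at r = 10 the fitted polynomial has nothing to do with F ∝ 1/r² — exactly the contrast with the universality of Newton’s law drawn above.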

As for machine learning being a challenge to realism, I am none the wiser after reading the paper. Utilising machine learning tools to produce predictions of causes given effects does not seem to modify the structure of the World, and modifies our understanding of it very little, since these tools bring no explanation per se. What would lead to anti-realism is the adoption of such tools as substitutes for scientific theories and models.

3 Responses to “machine learning and the future of realism”

  1. In quantitative genetics, parametric models help understanding, and understanding helps prediction models. Crude machine learning applied to genetic data rarely performs better than parametric models built on accumulated theory.

    The other point is that if only prediction matters, then there is “no room for discussion” in a literal sense: we cannot talk to each other, because there are no concepts to discuss.

    For instance, divergence models (out-of-Africa, etc.) only make sense if you can interpret them and tell a tale.

  2. Rather than prediction as such, I’d prefer to say that machine learning places more emphasis on observables and input-output relationships than on inference for parameters and internal structure. Not a big terminology gap, I suppose.

    • The authors themselves raise what seems to me the biggest challenge to the ‘naked prediction’ view: qualitative as opposed to quantitative behaviour. E.g., they mention bifurcations (as in dynamical systems).

      But this is arguably captured in machine learning under the umbrella of stability under out-of-sample prediction. Roughly, overfitting -> unstable model -> a sort of bifurcation is observed: what should be a small change (switching from the training to the test dataset) produces a large change in behaviour (predictive error).
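The instability reading in the comment above can be made concrete with a hypothetical sketch (an overfitted polynomial stands in for the unstable model; the function, degree, and sample sizes are my own choices, not the commenter’s):

```python
import numpy as np

# Hypothetical sketch of "overfitting as instability": a wildly flexible
# model fitted to noisy data changes behaviour sharply when moved from
# the training sample to a fresh test sample.
rng = np.random.default_rng(0)

def sample(n):
    x = np.sort(rng.uniform(-1, 1, n))
    return x, np.sin(3 * x) + rng.normal(0, 0.1, n)

x_tr, y_tr = sample(20)
x_te, y_te = sample(20)

# Degree-13 polynomial: enough freedom to chase the noise in 20 points.
coefs = np.polyfit(x_tr, y_tr, deg=13)

train_err = np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2)
test_err = np.mean((np.polyval(coefs, x_te) - y_te) ** 2)

print(train_err, test_err)
```

The “small change” of swapping in a test sample drawn from the same distribution yields a much larger predictive error — the bifurcation-like jump the comment describes.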
