go, go, go…deeper!
While visiting Warwick last week, I came across the very issue of Nature with the highly advertised paper of David Silver and co-authors from DeepMind detailing how they designed their Go player algorithm that bested a European Go master five games in a row last September. Which is a rather unexpected and definitely brilliant feat given the state of the art! And compares (in terms of importance, if not of approach) with the victory of IBM Deep Blue over Garry Kasparov 20 years ago… (Another deep algorithm, showing that the attraction of programmers for this label has not died off over the years!)
This paper is not the easiest to read (especially over breakfast), with (obviously) missing details, but I gathered interesting titbits from this cursory read. One is the reinforcement learning step, where the predictor is improved by playing against earlier versions of itself. While this can lead to overfitting, the authors used randomisation to reduce this risk. This made me wonder if a similar step could be applied to predictors like random forests. E.g., by weighting the trees or the probability of including one predictor or another.
Another feature of major interest is their parallel use of two neural networks in the decision-making, a first one estimating a probability distribution over moves learned from millions of human Go games and a second one returning a utility or value for each possible move. The first network is used for tree exploration with Monte Carlo steps, while the second leads to the final decision.
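To fix ideas on how a move distribution and a value estimate can cooperate within a tree search, here is a toy sketch of my own, emphatically not DeepMind's code. Everything in it is an assumption made for illustration: the game (a single pile of stones, each player removes one to three, whoever takes the last stone wins), the uniform `policy_prior` standing in for the first network, the single random playout in `value_estimate` standing in for the second, and the PUCT-style selection rule with its constant `c_puct`.

```python
import math
import random

MOVES = (1, 2, 3)  # toy game: remove 1-3 stones, taking the last stone wins


def legal_moves(stones):
    return [m for m in MOVES if m <= stones]


def policy_prior(stones):
    """Stand-in for the first network: a distribution over legal moves (uniform here)."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}


def value_estimate(stones, rng):
    """Stand-in for the second network: +1/-1 outcome of one random playout,
    seen from the player about to move."""
    sign = 1
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return sign  # whoever just moved took the last stone
        sign = -sign
    return -1  # no stones left at the start: the player to move has already lost


class Node:
    def __init__(self, stones, prior):
        self.stones = stones
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0  # from the viewpoint of the player who moved into this node
        self.children = {}

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def search(root_stones, n_sims=2000, c_puct=1.5, seed=0):
    """Return the most visited root move after n_sims simulations."""
    rng = random.Random(seed)
    root = Node(root_stones, 1.0)
    for _ in range(n_sims):
        node, path = root, [root]
        # Selection: the prior steers exploration, the running mean value exploits.
        while node.children:
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.q()
                + c_puct * ch.prior * math.sqrt(parent.visits) / (1 + ch.visits),
            )
            path.append(node)
        # Expansion: attach children with their prior probabilities.
        if node.stones > 0:
            for move, p in policy_prior(node.stones).items():
                node.children[move] = Node(node.stones - move, p)
        # Evaluation, from the viewpoint of the player to move at the leaf.
        v = value_estimate(node.stones, rng)
        # Backup: flip the sign at each level, since the players alternate.
        for visited in reversed(path):
            v = -v
            visited.visits += 1
            visited.value_sum += v
    return max(root.children, key=lambda m: root.children[m].visits)


if __name__ == "__main__":
    for stones in (5, 6, 7):
        print(stones, "stones -> remove", search(stones))
```

In this toy game the winning strategy is to leave the opponent a multiple of four stones, so with enough simulations the search should settle on that move. The only point of the sketch is the division of labour: the prior decides where the tree gets explored, the value estimates get averaged along the visited paths, and the visit counts pick the move.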
This is a fairly good commercial argument for machine learning techniques (and for DeepMind as well), but I do not agree with the doom-sayers predicting the rise of the machines and our soon-to-come annihilation! (Which is the major theme of Superintelligence.) This result shows that, with enough learning data and sufficiently high optimising power and skills, it is possible to produce an excellent predictor of the set of Go moves leading to a victory. And to do so without the brute-force strategy of Deep Blue, which simply explored the tree of possible games to a much more remote future than a human player could (along with a perfect memory of a lot of games). I actually wonder if DeepMind has also designed a chess algorithm on the same principles: there is no reason why it should not work. However, this success does not predict the imminent emergence of AIs able to deal with vaguer and broader scopes: in that sense, automated drivers are much more of an advance (unless they start bumping into other cars and pedestrians on a regular basis!).