Archive for Go

AlphaGo [100 to] zero

Posted in Books, pictures, Statistics, Travel with tags , , , on December 12, 2017 by xi'an

While in Warwick last week, I read a few times through Nature article on AlphaGo Zero, the new DeepMind program that learned to play Go by itself, through self-learning, within a few clock days, and achieved massive superiority (100 to 0) over the earlier version of the program, which (who?!) was based on a massive data-base of human games. (A Nature paper I also read while in Warwick!) From my remote perspective, the neural network associated with AlphaGo Zero seems more straightforward that the double network of the earlier version. It is solely based on the board state and returns a probability vector p for all possible moves, as well as the probability of winning from the current position. There are still intermediary probabilities π produced by a Monte Carlo tree search, which drive the computation of a final board, the (reinforced) learning aiming at bringing p and π as close as possible, via a loss function like

(z-v)²-<π, log p>+c|θ

where z is the game winner and θ is the vector of parameters of the neural network. (Details obviously missing above!) The achievements of this new version are even more impressive than those of the earlier one (which managed to systematically beat top Go players) in that blind exploration of game moves repeated over some five million games produced a much better AI player. With a strategy at times remaining a mystery to Go players.

Incidentally a two-page paper appeared on arXiv today with the title Demystifying AlphaGo Zero, by Don, Wu, and Zhou. Which sets AlphaGo Zero as a special generative adversarial network. And invoking Wasserstein distance as solving the convergence of the network. To conclude that “it’s not [sic] surprising that AlphaGo Zero show [sic] a good convergence property”… A most perplexing inclusion in arXiv, I would say.

go, go, go…deeper!

Posted in pictures, Statistics with tags , , , , , , , , , , on February 19, 2016 by xi'an

While visiting Warwick, last week, I came across the very issue of Nature with the highly advertised paper of David Silver and co-authors from DeepMind detailing how they designed their Go player algorithm that bested a European Go master five games in a row last September. Which is a rather unexpected and definitely brilliant feat given the state of the art! And compares (in terms of importance, if not of approach) with the victory of IBM Deep Blue over Gary Kasparov 20 years ago… (Another deep algorithm, showing that the attraction of programmers for this label has not died off over the years!)This paper is not the easiest to read (especially over breakfast), with (obviously) missing details, but I gathered interesting titbits from this cursory read. One being the reinforced learning step where the predictor is improved by being applied against earlier versions. While this can lead to overfitting, the authors used randomisation to reduce this feature. This made me wonder if a similar step could be on predictors like random forests. E.g., by weighting the trees or the probability of including a predictor or another.Another feature of major interest is their parallel use of two neural networks in the decision-making, a first one estimating a probability distribution over moves learned from millions of human Go games and a second one returning a utility or value for each possible move. The first network is used for tree exploration with Monte Carlo steps, while the second leads to the final decision.

This is a fairly good commercial argument for machine learning techniques (and for DeepMind as well), but I do not agree with the doom-sayers predicting the rise of the machines and our soon to be annihilation! (Which is the major theme of Superintelligence.) This result shows that, with enough learning data and sufficiently high optimising power and skills, it is possible to produce an excellent predictor of the set of Go moves leading to a victory. Without the brute force strategy of Deep Blue that simply explored the tree of possible games to a much more remote future than a human player could do (along with the  perfect memory of a lot of games). I actually wonder if DeepMind has also designed a chess algorithm on the same principles: there is no reason why it should no work. However, this success does not predict the soon to come emergence of AI’s able to deal with vaguer and broader scopes: in that sense, automated drivers are much more of an advance (unless they start bumping into other cars and pedestrians on a regular basis!).

the most human human

Posted in Books, University life with tags , , , , , , , , , , , , , , on May 24, 2013 by xi'an

“…the story of Homo sapiens trying to stake a claim on shifting ground, flanked on both sides by beast and machine, pinned between meat and math.” (p.13)

No typo in the title, this is truly how this book by Brian Christian is called. It was kindly sent to me by my friends from BUY and I realised I could still write with my right hand when commenting on the margin. (I also found the most marvellous proof to a major theorem but the margin was just too small…)  “The most human human: What artificial intelligence teaches us about being alive” is about the Turing test, designed to test whether an unknown interlocutor is a human or a machine. And eventually doomed to fail.

“The final test, for me, was to give the most uniquely human performance I could in Brighton, to attempt a successful defense against the machines.” (p.15)

What I had not realised earlier is that there is a competition every year running this test against a few AIs and a small group of humans, the judges (blindly) giving votes for each entity and selecting as a result the most human computer. And also the most human … human! This competition is called the Loebner Prize and it was taking place in Brighton, this most English of English seaside towns, in 2008 when Brian Christian took part in it (as a human, obviously!).

“Though both [sides] have made progress, the `algorithmic’ side of the field [of computer science] has, from Turing on, completely dominated the more `statistical’ side. That is, until recently.” (p.65)

I enjoyed the book, much more for the questions it brought out than for the answers it proposed, as the latter sounded unnecessarily conflictual to me, i.e. adopting a “us vs.’em” posture and whining about humanity not fighting hard enough to keep ahead of AIs… I dislike this idea of the AIs being the ennemy and of “humanity lost” the year AIs would fool the judges. While I enjoy the sci’ fi’ literature where this antagonism is exacerbated, from Blade Runner to Hyperion, to Neuromancer, I do not extrapolate those fantasised settings to the real world. For one thing, AIs are designed by humans, so having them winning this test (or winning against chess grand-masters) is a celebration of the human spirit, not a defeat! For another thing, we are talking about a fairly limited aspect of “humanity”, namely the ability to sustain a limited discussion with a set of judges on a restricted number of topics. I would be more worried if a humanoid robot managed to fool me by chatting with me for a whole transatlantic flight. For yet another thing, I do not see how this could reflect on the human race as a whole and indicate that it is regressing in any way. At most, it shows the judges were not trying hard enough (the questions reported in The most human human were not that exciting!) and maybe the human competitors had not intended to be perceived as humans.

“Does this suggest, I wonder, that entropy may be fractal?” (p.239)

Another issue that irked me in the author’s perspective is that he trained and elaborated a complex strategy to win the prize (sorry for the mini-spoiler: in case you did  not know, Brian did finish as the most human human). I do not know if this worry fear to appear less human than an AI was genuine or if it provided a convenient canvas for writing the book around the philosophical question of what makes us human(s). But it mostly highlights the artificial nature of the test, namely that one has to think in advance on the way conversations will be conducted, rather than engage into a genuine conversation with a stranger. This deserves the least human human label, in retrospect!

“So even if you’ve never heard of [Shanon entropy] beofre, something in your head intuits [it] every time you open your mouth.” (p.232)

The book spend a large amount of text/time on the victory of Deep Blue over Gary Kasparov (or, rather, on the defeat of Kasparov against Deep Blue), bemoaning the fact as the end of a golden age. I do not see the problem (and preferred the approach of Nate Silver‘s). The design of the Deep Blue software was a monument to the human mind, the victory did not diminish Kasparov who remains one of the greatest chess players ever, and I am not aware it changed chess playing (except when some players started cheating with the help of hidden computers!). The fact that players started learning more and more chess openings was a trend much before this competition. As noted in The most human human,  checkers had to change its rules once a complete analysis of the game had led to  a status-quo in the games. And this was before the computer era. In Glasgow, Scotland, in 1863. Just to draw another comparison: I like playing Sudoku and the fact that I designed a poor R code to solve Sudokus does not prevent me from playing, while my playing sometimes leads to improving the R code. The game of go could have been mentioned as well, since it proves harder to solve by AIs. But there is no reason this should not happen in a more or less near future…

“…we are ordering appetizers and saying something about Wikipedia, something about Thomas  Bayes, something about vegetarian dining…” (p.266)

While the author produces an interesting range of arguments about language, intelligence, humanity, he missed a part about the statistical modelling of languages, apart from a very brief mention of a Markov dependence. Which would have related to the AIs perspective. The overall flow is nice but somehow meandering and lacking in substance. Esp. in the last chapters. On a minor level, I also find that there are too many quotes from Hofstadter’ Gödel, Escher and Bach, as well as references to pop culture. I was surprised to find Thomas Bayes mentioned in the above quote, as it did not appear earlier, except in a back-note.

“A girl on the stairs listen to her father / Beat up her mother” C.D. Wright,  Tours

As a side note to Andrew, there was no mention made of Alan Turing’s chess rules in the book, even though both Turing and chess were central themes. I actually wondered if a Turing test could apply to AIs playing Turing’s chess: they would have to be carried by a small enough computer so that the robot could run around the house in a reasonable time. (I do not think chess-boxing should be considered in this case!)