Playing Monte Carlo Programs

    Keywords: Software

This page is intended for people to describe their experiences in playing the latest strong Go programs (currently all using Monte Carlo methods)

As of mid 2008 these are Crazystone (KGS 1k), Leelabot (KGS 1k) and Mogo (KGS 2k)

Opening

They play weird opening moves, but ones not easy to refute.

Harleqin: Unexpected moves are not necessarily bad moves. I think that once this kind of programs, which is not influenced by any predetermined "knowledge", becomes really strong, they will have quite an impact on opening theory, comparable to shinfuseki.

fractic: I disagree. A Monte Carlo algorithm produces less reliable moves the longer the game can go on. Combine this with the fact that a win is a win for a MC program and I think you will see the algorithm play mostly slack moves when ahead in the fuseki.

Harleqin: You seem not to have understood what I'm saying. It is quite clear that the opening must be their weakest spot, but I was talking about the time when they do get strong enough at it.

fractic: For the reasons I posted above, I don't believe a MC algorithm will ever be strong at the opening. At least not until the computers we have are a lot more powerful then those we have now.

Harleqin: I will not assume that you believe in a hard maximum of ever in the future available computing power, so it seems that you are not in disagreement but instead refuse to follow my thought. Given infinite computing power (or infinite time), a Monte Carlo algorithm's play converges to perfect play. Lack of infinite computing power affects the part of the game furthest from the end most, of course, but the question is not if but when enough computing power will be available so that even their opening skill gets in the range of professional play. And when that time comes, it will be very interesting how they play, because they do not use heuristics like we humans do.

fractic: I do believe that thanks to simple material limits it won't ever be possible to completely solve go. I don't think the amount of memory needed can be stored in any medium using bits.

Tapir: Given infinite computing power the game of Go will be solved. The use of Monte Carlo methods imply that this is not the case. So, it is not clear at all, that Monte Carlo program's play will converge to perfect play and not e.g. 3d, 4d or even 5d play.

Harleqin: Given infinite computing power, a Monte Carlo algorithm converges to a solution of Go. Explanation: A Monte Carlo algorithm works by taking a random (more or less, see UCT) subset of all branches of the game tree and calculating a winning chance from it. When computing power approaches infinity, the size of this subset approaches the size of the whole tree. So, the infinity case consists of calculating the winning "chances" from the whole game tree, and this is exactly a complete solution of Go.

symplicity: We need to make a distinction here. A vanilla Monte-Carlo algorithm will NOT converge to optimal play in general (such as an algorithm that plays simulations only from the root and takes the move with the highest win proportion). However, a Monte-Carlo tree search WILL converge to optimal play, such as the UCT algorithm and most of its variants. Since all of the current strong programs are Monte-Carlo tree searchers, with enough computing power, we will get perfect play. (Small caveat - some of the current programs might using certain techniques such as heuristic hard pruning, that might destroy the theoretical guarantee of convergence, despite empirically improving play at current levels. But minus these issues, MCTS is certainly capable of optimal play given enough time/space.) In any case, I too would be interested to see what kinds of opening moves and "joseki" would be discovered by such programs if given the processing power to perform at pro-level.

fractic: A UCT algorithm with infinite computing power is effectively the hand of god. But this wouldn't help human fuseki at all because humans aren't able to remember all of that.

Harleqin: We don't need to remember everything in order to profit from learning from a stronger player.

If UCT algorithms reach a level comparable to the best human beings that would be the time where the fuseki as played by the algorithm is the most interesting.

This is exactly what I'm talking about.

However at that level it would surely still be slack play when ahead.

Your definition of slack play seems to be "improving winning chances while neglecting high score", and I think that that is faulty with respect to good strategy.

I'm not saying that you should allways aim for the maximum possible score. But look at MoGo's endgame. It makes clearly bad moves because it doesn't care about the score. I think this is also true in the fuseki. The only difference is that in a fuseki it's a lot harder to tell if a move is slack or not. And while a computer can afford a few slack plays because when it gets close he won't make them anymore, this is not a quality humans posses. It shouldn't be in the interest of the player to let the game get close in the first place.

Bill: To interject into this discussion, I have been surfing computer chess pages recently, and it seems that, although the level of play in computer chess is very high, they still need opening books. Likewise, we cannot say that a strong computer go program would play good opening moves without incorporating opening knowledge.

Harleqin: Not at the moment, but that's not what this discussion was about. I tried MoGo in an even game today to have a look at its opening. I was a bit disappointed that its first moves where always 4-4 even with the opening database disabled; I think that's a bug. The rest of the opening moves were pretty random, it doesn't seem to have any grasp of the bigger picture (though my computer is admittedly not very powerful). I think that it profits very much from high handicap, much more than a human player whose weaknesses are more evenly distributed over the course of the game.

fractic: I agree with this. Since the longer the game goes on the stronger MoGo becomes. A handicap really helps MoGo to be ahead when its real strength shows.

Middle Game

They are strong at eye-stealing, squeezes, and disconnecting.

Weak at ko fights, life and death, and dealing the 'killer blow'

Jonas: Weak at semeai, too.

Endgame

I think they are good at the endgame, but I've never got into a close endgame that I won - the bot either plays a very conservative endgame which is enough to ensure a win, or resigns before a close endgame could begin.

Scoring

I have noticed that they never lose a scored game. This is probably because they don't differentiate between losses by half a point, and any other margin. So if all examined moves lead to a loss, they start playing really badly, and then resign. Wouldn't it improve their play if, when all moves lead to a loss, they started playing to try to minimize the amount by which they loose?

Harleqin: It has been found that the results are significantly better when they strive for the best winning chances, not for the highest score. This is also quite logical since winning is the goal of the game, not having a high score. That having been said, yes, it might be sensible to try and keep a low losing margin, waiting for a mistake to exploit. However, this means that the Monte Carlo tree has to calculate and keep additional information (score), and you would have to find a way for the UCT algorithm to work with a weighted combination of the two goals based on the current winning probability (which changes in each iteration, too...). So, the computing power needed would be significantly higher and it is not clear at all that it wouldn't be more worthwhile to use it for more playouts instead.

Pawel Koziol?: One possible solution to that problem would be to let the Monte Carlo player pretend to have a sizeable komi, allowing it to win, say, in 25% of the simulations. Of course, it would require some statistical tests, but the possible algorithm would be: if winning chances fall below 25% (or some empirical value that works best) add 5 point bonus (again, or some tested value) to make the program believe that it has some chances.

General impression (Leela)

Tapir: I don't know whether this is appropriate on this page. After looking at some of Leelabots game records I got the impression it is especially good in exposing some major weaknesses of the single digit kyus it wins against - lack of whole board thinking, passivity/following Leela around, not counting, weak at invading / breaking up frameworks...

So be bold: Put your observations into the "strengths" section. I've also observed that against unprepared kyu players, Leela performs astonishingly well. --RolandIllig

Tapir: That is, the lack of whole board thinking of its opponents give the impression that Leela shows whole board thinking in comparison...


Playing Monte Carlo Programs last edited by 217.65.192.68 on December 26, 2013 - 09:03
RecentChanges · StartingPoints · About
Edit page ·Search · Related · Page info · Latest diff
[Welcome to Sensei's Library!]
RecentChanges
StartingPoints
About
RandomPage
Search position
Page history
Latest page diff
Partner sites:
Go Teaching Ladder
Goproblems.com
Login / Prefs
Tools
Sensei's Library