GorobeisGoProgram/Introduction

Very random, unorganized thoughts

Feel free to add comments to these by Gorobei.

Well, let's take a quick look at the numbers (ignoring superko, snapbacks, etc)...

Number of points: 361
Number of possible board positions: approximately 3^361
Number of possible games: approximately 361! (361 factorial)

These are so big as to make chess look like tic-tac-toe. Not too promising for a brute force approach. Go is PSpace hard (Lichtenstein,) so naive simplifications don't seem promising. Analysing games at various skill levels seems to bear this out: expert games seem to have large undecided battles, novice games are a series of fights to the death. The "lightness" of pro games seems to automatically drive towards high complexity. In a handicap game, Black must try to contain the interactions of unsettled groups, and thus reduce the great complexity. In short, there is no way to reduce the big numbers into the brute force range (in 2001, maybe by 2050, though)

How about local patterns (e.g. a cache of 5x5 or 7x7 areas with pre-computed moves?) 5x5 has a space of 3^25 (assuming white, black, empty.) That's too big, but even it is wasn't, games often have key plays that wouldn't be seen even by a 9x9 pattern cache.

Smart patterns (ala GnuGo) look somewhat more promising, at least for local, tactical struggles.

Expert code to analyse life and death, capturing races, ko-fights, probable score, etc, seems useful.

Classic pruned search seems doomed to fail: the time between move and outcome is too high. This also seems to doom pure bucket-brigade ANN reward-assigners.

[edit]

Can we write a decent learning Go program?

We could try:

a pure self-learner - e.g. genetic algorithms.
a move predictor that we run on a test set of expert games, e.g. an autoassociator
a "society of mind" type system with competing agents
a hierarchical system with each level planning over decreasing size areas on the board

The checkers experience makes me think self-learning is a dead-end by itself: experts teach novices, novices teach bad habits.

Move prediction seems interesting if only to gather statistics about the problem space. Given an expert's move, how often does the local space (e.g. 5x5) predict the move based on other pro games?

Some approaches seem somewhat promising: build a bunch of experts, add some arbitrators, and hope it all turns out alright. Perhaps more flexible than a "big main loop" type approach. ''Q: how much GnuGo effort is spent tweaking the main loop versus tweaking the subroutines?''

dnerra: I could answer this question if I would understand it. What do you mean by the main loop? If you mean a global lookahead (global alpha-beta reader), the answer is -- none at all! GNU Go does not yet have a global reader, and introducing one is certainly an important, but long-term goal.

Gorobei: Please excuse my stream of consciousness writing. By "main loop," I mean the bit of program that decides where to spend computing resources and chooses the best move amongst the competing candidates based on the current board, who is ahead, the complexity of the position, and the time remaining. In GNU Go's case, I guess, this would be the code that would eventually be replaced by a strategic alpha-beta reader. My half-formed thought was that if most defects in GNU Go's play can be fixed by tweaking patterns, life and death analyzers, etc, then a bunch of local experts might play a decent game. If, on the other hand, defects often require tweaking or rewriting the code that uses the local experts, the flat model of local agent at the very least would need to be augmented with higher-order, more strategic agents.

dnerra: That is a very good question. I think more time is spent on what you call the "local experts" (tactical "Can I capture this string?"-reading, the pattern based life-and-death-reading, the influence and territory evaluation, a connection reader (not yet used much, but definitely important in the future)). But that does not mean that weaknesses are mostly due to them.

The "main loop" (make_dragons(), value_moves.c, move_reasons.c) has quite a lot of heuristics, which work good for a lot of cases, but often very bad for others. I think there are various reasons why tuning the main loop is not as attractive: If you do a 5% reading speedup, you know you have done s.th. useful. If you change some of the top level heuristics, it will improve play in some situations, but it will break other cases. So you never can be sure you have really improved the program. Also, if you assume that the rest of the engine is changing, than any change at the top level might become nonsense after a while, whereas an improvement of a "local expert" will not cease to be useful.

When I first got into GNU Go, I started doing some top level stuff; now I don't like it so much anymore. My personal impression is that 1. GNU Go could definitely still be improved by tweaks in the main loop, but 2. that you can spent as much time as you want on those heuristics, you will always get cases where they fail really badly. Others involved with GNU Go might disagree here (and I may well be wrong), but I think those heuristics will have to be replaced at some point by a global reader.

Hierarchical systems always seem seductive: my local experts fan out across the board, report to their local captains, captains report to majors, etc. Status reports flow up, and orders flow down. Each individual makes good decisions as to defense vs attack, etc, and produces nice plans based on his resources. It's a nice idea, and appealing to us that read war books. One problem is that war accounts are usually written by people who aren't dead.

[edit]

Would it help to have a SETI@home size pool of compute power?

In theory, a good screensaver client could be successful. Go boards make for nice looking screensavers (better than FFT trash or big molecules.) Go programs would have better competitive appeal ("my local program is playing at 74%" beats "I've spent 3000 hours of compute time and found 0 aliens.")

Jared Beck: Let us consider the feasability of creating a database of whole-board (global) moves. It is not feasible to consider every possible game. We must then arrive at a set of games which are worth considering. As stated at the top of this page, there are approximately 361! legal games. (Slightly less than 361! because you can't play suicide moves) 361! is apx. 1.438x10^768, apx. the total number of legal games. However, the number of "good games" is a much harder problem. Using a library of opening moves reduces the number of games, and brings us closer to the set of "good games". If, for the sake of argument, we use a ten move deep fuseki library, containing N fuseki patterns, the total number of games is now (351!)N or (4.3x10^742)N. This number is still far too big for even distributed computing. Other methods of reducing the total number of games considered would be necessary. If a library of good shape were used, the total number of games could be greatly reduced. However, in this approach, the flaws of the shape library are carried over into the resultant global move database. (Just like the flaws in the fuseki library would carry over) 7/24/2003

[edit]

Should a program care about how a position was reached?

At some level, a go player should be able to look at a board, and decide what the objectively best move is. Q: in pro games, how often is the move played close (e.g. with 3 points) to the previous move? How often would another expert pick the same move, or area, only seeing the board, not the previous move?

If the current power of computers is insufficient to find the best move, should we try to "use the opponent's computation" by also considering his previous move?

My gut feeling is No: it is better to build knowledge about the board, caching as much as possible from turn to turn, so that as areas become settled, we no longer devote compute power to them. Easy to say, but hard to do.

[edit]

Concepts that must be represented in a Go program

The physical stuff: board, point, stone, player
The rules: territory (at least for scoring,) ko, time
Stuff needed for analysis/play: this is, of course, the hard bit. Go has a large vocabulary of terms, and it might be instructive to try to categorize them into conceptual groups:
- Shape: BambooJoint, Geta, DoubleHane, Hane, IkkenTobi, Tsuke, Keima, Kosumi, Ogeima, Ponnuki, TripodGroup
- Tactical: Atari, Dame, Damezumari, DoubleKo, DoubleSente, Hamete, Kikashi, TenThousandYearKo, MeAriMeNashi, Motare, Nakade, Nobi, Nozoki, Descent, Semeai, Snapback, Suberi, Tesuji, ThreeStepKo
- Strategic: Aji, Thickness/Atsui, Dango, Furikawari, Hasami, Hiraki, Honte, Kakari, Keshi, Korigatachi, Miai, Oba, Sabaki, Shimari, Shinogi, SqueezeTesuji, Tenuki, Territory, Tobi,
- Rule: ByoYomi, Fujite, Jigo, Ko, Komi, MirrorGo, Nigiri, Prisoners, TripleKo
- Abstract: Chuban, Fuseki, Gote, Joseki, Katteyomi, Kiai, Moyo, Seki, Sente, Shicho, SmallMove, Tedomari, Tewari, ThankYouMove, Tsumego, Yose, probe
- Physical: Goban, Hoshi, Insei, Komoku, Sansan, Shodan, Takamoku, Tengen

These are somewhat arbitrary, especially because many concepts span multiple categories. E.g. I tried to keep concepts about local situations in tactical; more global ideas (that still suggest play) in strategic; and ideas that do not suggest a line of play in abstract. However, tactical is not purely local, nor is strategy always global. A cynic might say that tactical is what you do when you know what to do, strategic is what you do when you don't.

It is interesting that "eye", "snake", "dragon", "unsettledness" etc, do not appear in the word list. Is this because they are intuitive, common concepts (i.e. there is no single word to describe them,) or because they are really less important ideas?

[edit]

Positional Evaluation

Initially, it seems obvious that position evaluation (is board A or board B better) is a requirement for a game playing program. I would almost not consider a program good unless it could perform relative analysis. However...

Tic-tac-toe (even in high-dimensional spaces) does not need deep position evaluation to play well (because the game is simple.)
Chess uses positional evaluation because
- deep alpha-beta, etc, searching needs evaluation at the leaves
- positional evaluation is a good predictor of game outcome (e.g. up a bishop will decide the game.)

Even though it is not clear that a computer Go program needs an explicit evaluation function, it seems useful because:

we can ask if it thinks it's winning or losing in a game
we can ask it which of two moves it prefers

In Chess, the natural unit of goodness seems to be a pawn. Note that this is a bizarre measure of goodness: surely the value of a pawn changes during the course of a game. In Go, it seems the natural unit is one point of territory. This is a more natural measure because it relates directly to the final score. Both games have a trinary outcome (win, lose, draw,) yet players have a continuum on which they judge how well they are doing.

If positional evaluation was everything, we would expect an expert playing a handicapped novice to resign aftrer the first move (i.e. the novice has more territory.) Because this obviously does not happen, it must be that a second order statistic is involved. Call it unsettledness, opportunity, volatility, choice, etc. The game of backgammon makes this concept explicit: a player may "double" the stakes requiring the other player to resign or accept double odds. If he accepts, the right to double is passed to him. Wall Street geeks will recognise doubling to be an exotic option.

(Note: choice and sente need elucidation)

Positional evaluation seems to depend on at least the following:

probable territory
unsettledness
choice
sente

[edit]

Stuff not entered yet:

Board evaluation: territory and moves (as in amount of sente)
Influence maps and local propagation through patterns
Other map types (threat, area, life)
Continuous vs Discrete - even Death is continuous
Possible worlds, Territory, and Complexity

HolIgor: I wanted to create a go program too. I have written it. It makes moves and does not allow you to make an illegal move. I cannot count territory though, so the rules were quasi Chinese (it counts stones and one point eyes).

The next thing to do was to determine an algorithm. The random was the simplest. I used it while debugging the program.

Then I made several simple algorithms and matched them.

I had a champion very soon. I call it a crawler. The algorithm was simple:

make a move that produces a group with the biggest number of liberties.
don't play Damezumari moves.
don't punch your one point eyes.

Guess how it played. Slowly it made line accross the board. One large group. If there was an obstacle on the way it turned. Of course it was slow, but everything was connected and even for a human player it was not easy to kill this large but slow group.

For several months I could not write an algorithm that would beat the crawler. The problem was in connections. Algorithms left holes through which the crawler penetrated cutting everything into pieces and eventually strangling separate groups.

I have an algorithm that beats the crawler now. It has beaten IgoWin first time when I matched them, but generally IgoWin is much stronger. My program was lucky that time. It remains extremely weak. Perhaps one day I will have some new idea.

Gorobei: Crawler is a neat idea. Kinda like a young, smart kid playing Go for the first time. It's a great test opponent for Go programs: their play against it should reveal a lot about how well they understand making territory and eyes.

SifuEric: I have some ideas about how a program should play go. To me, I think it should try to mimic how a human plays. Chess programs do not play how humans play. They find the mathematically best move, either from a table or from calculation. Go is more than just winning; elegance is a major factor that can reveal the true talent of a good player. I think a Go program should have at least a model of human thought.

Here is what I have been thinking about: The computer decides a score for each player based on the current board. That score is a weighted sum of aspects of certain basic principles such as shape, influence, and territory. The weights for those things changed as the game progresses, because, for instance, shape is not as important in the endgame as it is in the rest of the game.

The computer would then pick several moves that it thinks could be its best move. It then does a search to some specified depth, going back to analyzing the board, then guessing moves.

If you have read Fluid Concepts and Creative Analogies by Douglas Hofstadter, you have an idea of how the computer analyzes features such as shape and influence and determines possible moves. Basically, good shape is stored in a network (a directed graph). Basic shapes (up to some small number, possibly 6) activate edges in the network probabilistically, and parallel agents follow the edges. Each edge leads to another shape. The missing stone or stones in the shape are guesses for the next move. Similar networks work for influence.

The real problem I had, and that I still have no solution for, is how to quantify influence. I don't even know if I really have a good grasp on exactly what it is. I think this will be hard (at first).

John F. I know all or almost all go programs measure influence, probably because (a) Zobrist did it and (b) it suits computers. But I suspect it's all a waste of time. And it's also computationally intensive. I suggest that if you measure thickness instead (atsusa, not atsumi) you'll find it's easier and a lot more useful. So you measure a group by (something like) the security of its eye shape, cutting/peeping/etc points, its size and open directions it faces in. You then have information that is of strategic as well as tactical value. And all rather easier to compute. Also, more human-like so easier to debug and develop.

The computer would (and should) be able to learn. It seems like it would be impossible to populate the network by hand. The computer would learn by reading pro games and adjusting the weights of the edges and adding new nodes when necessary. Learning from pro games works because pros make more good moves than bad ones, so it could be unsupervised. It could also examine its own games by doing a deeper search of its moves.

Just my ideas.

DJQuackQ: The problem with analyzing pro games is that the smallest difference in board configuration (and I'm not talking ladder breakers, etc here) often significantly change where the biggest play is, or even whether a specific move is really good or really bad.

Roving Eye Neural Networks

Hey, what do you think of this paper, [ext] http://nn.cs.utexas.edu/pub-view.php?PubID=145
The paper suggests the use of evolving a Roving Eye Neural Network. It can only see a part of the board at a time, it can choose to move about, explore the board more, or make a choice for the move. The results are >>
(1) The same roving eye architecture can play on different board sizes.
(2) experience gained by playing on a small board provides an advantage for further learning on a larger board. The experiments seem to give good results for 5x5 and 7x7 boards. There were no tests on board sizes above 7x7.
I am a newbie myself, and wanted your take on how well, you think, it might scale.
Because, another neuro-evolution method, SANE as mentioned here > [ext] http://nn.cs.utexas.edu/pub-view.php?PubID=93 is reported to have failed for board sizes above 7x7. I am not acquainted with SANE. ---SanChan?

GorobeisGoProgram/Introduction last edited by 117.196.132.30 on October 2, 2008 - 14:54

RecentChanges · StartingPoints · About

Edit page ·Search · Related · Page info · Latest diff