A Novice Tries To Write A Go Program
   

Very random, unorganized thoughts

Feel free to add comments.

Well, let's take a quick look at the numbers (ignoring superko, snapbacks, etc)...

  • Number of points: 361
  • Number of possible board layouts: approx 3^361
  • Number of possible games: approx 361!

These numbers are so big as to make chess look like tic-tac-toe, so a brute-force approach is not promising. Go is PSPACE-hard (Lichtenstein and Sipser), so naive simplifications don't seem promising either. Analysing games at various skill levels seems to bear this out: expert games seem to have large undecided battles, while novice games are a series of fights to the death. The "lightness" of pro games seems to drive automatically towards high complexity. In a handicap game, black must try to contain the interactions of unsettled groups, and thus reduce that complexity. In short, there is no way to reduce the big numbers into brute-force range (in 2001, anyway; maybe by 2050, though).
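For a sense of scale, a quick back-of-the-envelope computation of those numbers (a minimal Python sketch):

    import math

    # Rough sizes of the 19x19 search space, ignoring legality details.
    points = 361
    layouts = points * math.log10(3)                 # log10 of 3^361
    games = math.lgamma(points + 1) / math.log(10)   # log10 of 361!

    print(f"3^361 is about 10^{layouts:.0f}")        # ~10^172 board layouts
    print(f"361!  is about 10^{games:.0f}")          # ~10^768 move sequences

For comparison, the Shannon number puts the game tree of chess at a mere 10^120 or so.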

How about local patterns (e.g. a cache of 5x5 or 7x7 areas with pre-computed moves)? 5x5 has a space of 3^25 (assuming white, black, empty). That's too big, but even if it wasn't, games often have key plays that wouldn't be seen even by a 9x9 pattern cache.
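To make the size concrete, here is a minimal sketch of how such a cache might be keyed, encoding the 5x5 area around a point as a base-3 integer (the board representation and names are my own, not from any particular program):

    EMPTY, BLACK, WHITE = 0, 1, 2

    def pattern_key(board, cx, cy, radius=2):
        """Encode the (2*radius+1)^2 area around (cx, cy) as a base-3 integer.

        `board` maps (x, y) -> EMPTY/BLACK/WHITE; points off the edge are
        treated as EMPTY here, though a real cache would want a fourth
        "off-board" colour (raising the space to 4^25).
        """
        key = 0
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                key = key * 3 + board.get((cx + dx, cy + dy), EMPTY)
        return key  # one of 3^25, i.e. about 8.5 * 10^11, possible keys

A cache of pre-computed moves would then just be a table from pattern_key to a suggested local move.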

Smart patterns (ala GnuGo) look somewhat more promising, at least for local, tactical struggles.

Expert code to analyse life and death, capturing races, ko-fights, probable score, etc, seems useful.

Classic pruned search seems doomed to fail: the distance between a move and its eventual consequences is too great. This also seems to doom pure bucket-brigade ANN reward-assigners.

Can we write a decent learning Go program?

We could try:

  • a pure self-learner, e.g. genetic algorithms
  • a move predictor that we run on a test set of expert games, e.g. an autoassociator
  • a "society of mind" type system with competing agents
  • a hierarchical system with each level planning over areas of decreasing size on the board

The Checkers experience makes me think self-learning is a dead-end by itself: experts teach novices, novices teach bad habits.

Move prediction seems interesting if only to gather statistics about the problem space. Given an expert's move, how often does the local space (e.g. 5x5) predict the move based on other pro games?
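A minimal sketch of gathering that statistic (the game-record format and the predictor are assumed interfaces, not any standard library):

    def prediction_rate(games, predict):
        """Fraction of expert moves a local predictor gets right.

        `games` is a list of games, each a list of (board, expert_move)
        pairs; `predict(board)` returns a single guessed move, e.g. by
        matching the 5x5 pattern around each candidate point against
        other pro games.
        """
        hits = total = 0
        for game in games:
            for board, expert_move in game:
                if predict(board) == expert_move:
                    hits += 1
                total += 1
        return hits / total if total else 0.0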

SoM approaches seem somewhat promising: build a bunch of experts, add some arbitrators, and hope it all turns out alright. Perhaps more flexible than a "big main loop" type approach. Q: how much GnuGo effort is spent tweaking the main loop versus tweaking the subroutines?

Hierarchical systems always seem seductive: my local experts fan out across the board, report to their local captains, captains report to majors, etc. Status reports flow up, and orders flow down. Each individual makes good decisions about defense versus attack, and produces nice plans based on his resources. It's a nice idea, and appealing to those of us who read war books. One problem is that war accounts are usually written by people who aren't dead.

Would it help to have a SETI@home-sized pool of compute power?

In theory, a good screensaver client could be successful. Go boards make for nice looking screensavers (better than FFT trash or big molecules.) Go programs would have better competitive appeal ("my local program is playing at 74%" beats "I've spent 3000 hours of compute time and found 0 aliens.")

Should a program care about how a position was reached?

At some level, a go player should be able to look at a board and decide what the objectively best move is. Q: in pro games, how often is the move played close (e.g. within 3 points) to the previous move? How often would another expert pick the same move, or area, seeing only the board and not the previous move?
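The first question is easy to measure from game records; a minimal sketch, assuming moves are simple (x, y) coordinates:

    def near_previous_rate(moves, threshold=3):
        """Fraction of moves played within `threshold` points of the
        previous move, using the larger of the horizontal and vertical
        offsets (Chebyshev distance) as the measure."""
        pairs = list(zip(moves, moves[1:]))
        near = sum(1 for (px, py), (x, y) in pairs
                   if max(abs(x - px), abs(y - py)) <= threshold)
        return near / len(pairs) if pairs else 0.0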

If the current power of computers is insufficient to find the best move, should we try to "use the opponent's computation" by also considering his previous move?

My gut feeling is No: it is better to build knowledge about the board, caching as much as possible from turn to turn, so that as areas become settled, we no longer devote compute power to them. Easy to say, but hard to do.
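A minimal sketch of that caching idea: keep per-region analyses between turns and invalidate only the regions a move disturbs (the region bookkeeping and the analysis function are placeholders):

    class RegionCache:
        """Cache per-region analyses; re-analyse only disturbed regions."""

        def __init__(self, analyse, radius=4):
            self.analyse = analyse   # function: region -> analysis result
            self.radius = radius     # how far a move's influence reaches
            self.results = {}        # region id -> cached analysis

        def on_move(self, x, y, regions):
            """Invalidate cached results for regions near the move at (x, y)."""
            for rid, points in regions.items():
                if any(max(abs(x - px), abs(y - py)) <= self.radius
                       for px, py in points):
                    self.results.pop(rid, None)

        def get(self, rid, region):
            if rid not in self.results:   # settled, untouched regions cost nothing
                self.results[rid] = self.analyse(region)
            return self.results[rid]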

Concepts that must be represented in a Go program

These are somewhat arbitrary, especially because many concepts span multiple categories. E.g. I tried to keep concepts about local situations in tactical; more global ideas (that still suggest play) in strategic; and ideas that do not suggest a line of play in abstract. However, tactical is not purely local, nor is strategy always global. A cynic might say that tactical is what you do when you know what to do, strategic is what you do when you don't.

It is interesting that "eye", "snake", "dragon", "unsettledness", etc., do not appear in the word list. Is this because they are intuitive, common concepts (i.e. there is no single word to describe them), or because they are really less important ideas?

Positional Evaluation

Initially, it seems obvious that position evaluation (is board A or board B better) is a requirement for a game playing program. I would almost not consider a program good unless it could perform relative analysis. However...

  • Tic-tac-toe (even in high-dimensional spaces) does not need deep position evaluation to play well (because the game is simple).
  • Chess uses positional evaluation because
    • deep alpha-beta searching, etc., needs evaluation at the leaves
    • positional evaluation is a good predictor of game outcome (e.g. being up a bishop will usually decide the game)

Even though it is not clear that a computer Go program needs an explicit evaluation function, one seems useful because:

  • we can ask if it thinks it's winning or losing in a game
  • we can ask it which of two moves it prefers

In Chess, the natural unit of goodness seems to be a pawn. Note that this is a bizarre measure of goodness: surely the value of a pawn changes during the course of a game. In Go, it seems the natural unit is one point of territory. This is a more natural measure because it relates directly to the final score. Both games have a trinary outcome (win, lose, or draw), yet players have a continuum on which they judge how well they are doing.

If positional evaluation were everything, we would expect an expert playing a handicapped novice to resign after the first move (i.e. the novice has more territory). Because this obviously does not happen, it must be that a second-order statistic is involved. Call it unsettledness, opportunity, volatility, choice, etc. The game of backgammon makes this concept explicit: a player may "double" the stakes, requiring the other player to resign or play on for doubled stakes. If he accepts, the right to double next passes to him. Wall Street geeks will recognise doubling as an exotic option.

(Note: choice and sente need elucidation)

Positional evaluation seems to depend on at least the following:

  • probable territory
  • unsettledness
  • choice
  • sente
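A minimal sketch of how those terms might combine (the feature functions and weights are placeholders, not a worked-out evaluation):

    # Placeholder feature extractors; a real program would have to compute these.
    def probable_territory(board): return 0.0   # Black points minus White points
    def unsettledness(board):      return 0.0   # points still genuinely in dispute
    def choice(board):             return 0.0   # value of having several big options
    def sente_value(board):        return 0.0   # worth of holding the initiative

    def evaluate(board, weights=(1.0, -0.5, 0.3, 0.5)):
        """Score the position for Black as a weighted sum of the terms above.

        Unsettledness gets a negative weight here on the theory that the
        side ahead on territory prefers a quiet board; the numbers are
        purely illustrative.
        """
        w_terr, w_unsettled, w_choice, w_sente = weights
        return (w_terr      * probable_territory(board)
              + w_unsettled * unsettledness(board)
              + w_choice    * choice(board)
              + w_sente     * sente_value(board))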

Stuff not entered yet:

  • Board evaluation: territory and moves (as in amount of sente)
  • Influence maps and local propagation through patterns
  • Other map types (threat, area, life)
  • Continuous vs Discrete - even Death is continuous
  • Possible worlds, Territory, and Complexity


HolIgor: I wanted to create a go program too. I have written it. It makes moves and does not allow you to make an illegal move. It cannot count territory, though, so the rules were quasi-Chinese (it counts stones and one-point eyes).

The next thing to do was to determine an algorithm. A random mover was the simplest; I used it while debugging the program.

Then I made several simple algorithms and matched them.

I had a champion very soon. I called it the crawler. The algorithm was simple:

  • make the move that produces a group with the biggest number of liberties
  • don't play Damezumari moves
  • don't fill your own one-point eyes

Guess how it played. Slowly it made a line across the board: one large group. If there was an obstacle in the way, it turned. Of course it was slow, but everything was connected, and even for a human player it was not easy to kill this large but slow group.
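A minimal sketch of the crawler's move selection, assuming helper functions for liberty counting and eye detection (the interfaces are my guesses, not HolIgor's code):

    def crawler_move(board, colour, legal_moves, liberties_after, is_own_eye):
        """Pick the legal move that leaves our group with the most liberties.

        `liberties_after(board, move, colour)` returns the liberty count of
        the group the move creates; `is_own_eye(board, move, colour)` detects
        one-point eyes.  Moves into damezumari (one liberty or fewer left)
        and own-eye fills are skipped.
        """
        best, best_libs = None, -1
        for move in legal_moves:
            if is_own_eye(board, move, colour):
                continue                 # don't fill your own one-point eyes
            libs = liberties_after(board, move, colour)
            if libs <= 1:
                continue                 # damezumari: self-atari or worse
            if libs > best_libs:
                best, best_libs = move, libs
        return best                      # None means pass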

For several months I could not write an algorithm that would beat the crawler. The problem was in connections: algorithms left holes through which the crawler penetrated, cutting everything into pieces and eventually strangling the separate groups.

I have an algorithm that beats the crawler now. It beat IgoWin the first time I matched them, but generally IgoWin is much stronger; my program was lucky that time. It remains extremely weak. Perhaps one day I will have some new idea.

Gorobei: Crawler is a neat idea. Kinda like a young, smart kid playing Go for the first time. It's a great test opponent for Go programs: their play against it should reveal a lot about how well they understand making territory and eyes.

SifuEric: I have some ideas about how a program should play go. I think it should try to mimic how a human plays. Chess programs do not play how humans play: they find the mathematically best move, either from a table or from calculation. Go is more than just winning; elegance is a major factor that can reveal the true talent of a good player. I think a Go program should have at least a model of human thought.

Here is what I have been thinking about: the computer decides a score for each player based on the current board. That score is a weighted sum of aspects of certain basic principles such as shape, influence, and territory. The weights change as the game progresses because, for instance, shape is not as important in the endgame as it is in the rest of the game.

The computer would then pick several moves that it thinks could be its best move. It then searches to some specified depth, alternating between analysing the board and guessing candidate moves.
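A minimal sketch of that candidate-move search, as a narrow negamax (the move proposer, position update, and evaluation are assumed interfaces):

    def search(board, depth, candidates, play, evaluate):
        """Pick the best of a few candidate moves by shallow lookahead.

        `candidates(board)` proposes a handful of plausible moves,
        `play(board, move)` returns the resulting position, and
        `evaluate(board)` scores a position for the side to move.
        """
        if depth == 0:
            return evaluate(board), None
        best_score, best_move = float("-inf"), None
        for move in candidates(board):
            score, _ = search(play(board, move), depth - 1,
                              candidates, play, evaluate)
            if -score > best_score:      # good for the opponent is bad for us
                best_score, best_move = -score, move
        return best_score, best_move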

If you have read Fluid Concepts and Creative Analogies by Douglas Hofstadter, you have an idea of how the computer analyses features such as shape and influence and determines possible moves. Basically, good shape is stored in a network (a directed graph). Basic shapes (up to some small number, possibly 6) activate edges in the network probabilistically, and parallel agents follow the edges. Each edge leads to another shape; the missing stone or stones in that shape are guesses for the next move. Similar networks would work for influence.
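A minimal sketch of such a network, loosely in the style of Hofstadter's slipnets (the node and edge structure here are my own guesses):

    import random

    class ShapeNetwork:
        """Shapes as nodes; weighted edges suggest follow-up shapes.

        An edge fires with probability equal to its weight; the stones
        missing from the suggested shape become candidate moves.
        """

        def __init__(self):
            self.edges = {}   # shape -> list of [next_shape, weight, missing_stones]

        def add_edge(self, shape, next_shape, weight, missing_stones):
            self.edges.setdefault(shape, []).append(
                [next_shape, weight, missing_stones])

        def suggest(self, shapes_on_board):
            """Collect candidate moves from probabilistically fired edges."""
            suggestions = []
            for shape in shapes_on_board:
                for next_shape, weight, missing in self.edges.get(shape, []):
                    if random.random() < weight:
                        suggestions.extend(missing)
            return suggestions

        def reinforce(self, shape, next_shape, delta=0.05):
            """Learning: nudge up the weight of an edge seen in a pro game."""
            for edge in self.edges.get(shape, []):
                if edge[0] == next_shape:
                    edge[1] = min(1.0, edge[1] + delta)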

The real problem I had, and that I still have no solution for, is how to quantify influence. I don't even know if I really have a good grasp on exactly what it is. I think this will be hard (at first).

The computer would (and should) be able to learn. It seems impossible to populate the network by hand, so the computer would learn by reading pro games, adjusting the weights of the edges and adding new nodes when necessary. Learning from pro games can be unsupervised because pros make more good moves than bad ones. The program could also examine its own games by doing a deeper search of its moves.

Just my ideas.

See also:

SomePhilosophicalQuestionsAboutComputersAndGo
GoPrograms
GoPlayingPrograms



This is a copy of the living page "A Novice Tries To Write A Go Program" at Sensei's Library.
(C) the Authors, published under the OpenContent License V1.0.