Archive of Doug's go blog for April 2004.
Table of contents | Table of diagrams Black to play Ko no ko, and sente Does this work? No. |
Went to a talk on monte carlo go, but still don't understand it. Finished KPA level 3, and reread Davies' Tesuji. Haven't quite given up on Segoe-Seigen level C, but having to lay out stones to understand the given answers seems like a bad sign. Discovered during a game (to my surprise and dismay) that bent four in the corner is not necessarily dead. And found a nice piece in Salon on the trials and travails of Wikipedia, which faces similar issues as SL, but more intensely.
For every proverb of the form, "X is dead", there is a corollary, "X has a whole bunch of liberties".
Not always true, perhaps, but still.
For every proverb of the form, "X is alive", there is a corollary, "X has a whole bunch of ko threats".
To distract myself from a disastrous series of games on DGS (motto: play slow, lose slow), I've been thinking about ratings. A rating, as distinct from a rank, is supposed to say how strong you are, i.e. to predict the outcomes of new games. There has been a lot of work on rating systems over the years. The most famous system is undoubtedly ELO, developed by Arpad Elo in the 1950's and 1960's. Originally applied to chess, it has been used all over the place. In the go world, the EGF uses an ELO system. Other systems include Glicko, (developed by Mark Glickman, stats prof and chair of USCF ratings committee, also see the technical paper), which seems to be an ELO variant, ChessMetrics, another ELO variant, HoligorSRatingOfGoPlayers, which I don't understand at all, maximum likelihood methods such as mlrate (used by NNGS and DGS) and AccelRat (used by the AGA and maybe IGS), and I'm sure others as well.
All of these systems are in some sense statistical, and rely on an assumed probability for a win given the strength difference between the two players. You can think of this as the definition of a certain amount of strength difference. Several formulas are in use. KGS uses p = 1/(1+exp(k x)), where k is an arbitrary constant and x is the strength difference, which is called the Bradley-Terry model. MLrate uses p = 0.5 s^x for x positive, where s is a constant. This has the same large x behavior (single exponential), and is continuous but not smooth at x=0. AGA uses a normal distribution, just like Arpad started with, p = 0.5 exp(-x^2/2 sigma^2) for x positive, sigma a constant, which decays faster at large x, but is again not smooth.
Any reasonable formula is a monotonic mapping of a probability (range 0 to 1) onto some other number. Since any such mapping is (just about) invertible, one might think that the actual formula chosen is entirely irrelevant. It's not, however: these formulas implicitly encode some kind of transitivity relation, because the formulas talk about ratings differences: if A beats B x% of the time, and B beats C y% of the time, corresponding to ratings differences delta AB and delta BC, the chosen function predicts how often A will beat C from delta AB + delta BC. Different functional forms will predict different values from pAC. There remains an arbitrary scale factor. In games without
handicap, such as chess, a scale is picked purely esthetically: 200 points means 75% winning. For go, we can use the strength difference from handicap stones to set the scale.
To see what these functions look like, I plotted the KGS, MLRate, and AGA functions as solid, dashed, and dotted lines, using their default parameters (k = 0.8, s = 4/19, px_sigma = 1.04).
One question is whether these probabilities are a function of absolute strength. One dataset is the European tournament results. This plot shows those results, with 95% CI error bars, plotted vs. the average strength of the two players, with 1, 2, 3, and 4 stone strength difference games shown in blue, green, red, and teal. The predictions of the rating functions with default parameters are shown in the same
colors, with KGS solid, MLRate dashed, and AGA dotted. All do okay for dan players, except maybe AGA is a little too pessimistic about the chances of a player weaker by three or more stones. None are correct for kyu players, where the weaker player wins much more often than predicted. This could be due to more variable play, more misrated players due to improvement or fewer games, or even weaker use of handicap stones making ranks closer together. Whatever the reason, the effect seems to be substantial.
Once you have a probability function you are happy with, you face the problem of fitting rankings to the results. This is a nontrivial problem, with issues like dealing with rating improvement. The ML methods typically try to fit a single current rating from the entire history of results available, with improvement dealt with by deweighting old results. This will track improvement, but slowly, and thus the usual advice to start a new account in these systems if you want a quick look at how much you've improved. It's also possible to include variations in time in the rating, but (according to Glickman) this becomes computationally expensive. I'd also worry about having sufficient data to model the time dependent function. ELO methods take a step towards the minimum based on each new result. This may not take full advantage of the data available (in particular, my opponent's future results can't change my rating), but it does track rating changes in a natural manner, and it is highly predictable, which players seem to like.
harry wang - I like the graph, I wonder what tools you used to plot it. I want to get some go game result histograms into this format.
I used matlab "plot" and "errorbar", with CI's computed with binofit from the statistics package. You might also try octave (which uses gnuplot for figures) or scilab, or even a spreadsheet. Email me directly if you want more specifics.
From the "almost errata" file. The given answer is a. I think b works too, but is a point or two worse.
ethanb: b makes a ko for life - definitely not what black wants... Actually, I take it back. at would live, but now black can never push and cut white at regardless of the board situation.
rubilia: ( was played where is now.) Maybe I am wrong, but I don't see why b should have to result in a ko. After white can't cut at 5 because of shortage of liberties. The main difference between a and b is gote vs. sente, so b seems to be the better move. Do I miss anything?
You're right that there's no ko, but White need not answer (at a cost of a couple of points maybe), so Black still doesn't get sente, and can't push and cut without making an extra move.
rubilia: If there's no additional hook, that still means b is the preferable move here.
It's not clear to me which is better: remember that it costs black a move to pick up those two stones, plus she loses the push in sente, so I think b gives a three point gote followup, or 1.5 pts better. a however leaves a defect in White's wall, which could be worth a lot more than a couple of points.
So, I'm working on tesuji. Here's a tesuji from a game. White could capture two stones, but wants more. is a double hane, and after , has a beautiful squeeze...
...and can capture all the white stones, and save his two threatened stones, ten moves deep from the cut in the original diagram. A very nice tesuji to find during a game. Too bad I was White, and missed it. Black isn't reading any tesuji books, but knows how to fill liberties and found it.
The Segoe Tesuji Dictionary arrived a couple days ago, three volumes, ordered from Amazon.jp in a fit of insanity when owning more books seemed like a good idea. And a new copy of Tesuji came too. Too many years since I've seen my old copy. The Segoe dictionary is beautiful, lovely Japanese printing with not one but two ribbons in the binding, one to mark your place in the problems, the other in the answers. I find the level C problems hard, and rare: most of the problems are A or B, and way beyond me.
They say learning tesuji is the first step to getting better. I still feel like I don't know any tesuji, but that can't be blamed on my library. And if I have to know everything in these books, I'll never get to step two (whatever that is), but maybe I won't need to.
TesujiFreak: Congratulations on getting the Segoe Tesuji-Jiten. I got it some time ago, and I think it is the best go book purchase I have ever made. Some of these problems are simply beautiful with so many subtleties it blows my mind. They also have a "natural" feel to them - as if they could actually appears in real games. Great books.