Range Of Skill

   

evpsych: What is the range of skill in Go?

Here is a new method of determining rank -- not for use in determining handicaps.

Suppose you take the best players of our time as one end of the scale, and, say, a Go-naive group of engineering students from the top 10 American universities as the other end. The point is to have known endpoints.

Now, teach each student the rules in the same way, and have them play 10 games against each other, with an expert giving only rules advice and without letting the others watch. How strong does a player need to be to beat 90% of them 90% of the time in an even game? That's new-rank 1. It might be, say, 25 kyu AGA -- just guessing. (Adjust as appropriate.)

Note: no handicaps here. Even games.

Then: how strong does a player need to be to beat new-rank 1 players with the same parameters? That's new-rank 2.

Finally: how many new-ranks are there between the engineering students and the best players?

You might wonder why do such a thing. Answer: now we can compare to other games. How many new-ranks are there in Go? What about other games?

--evpsych


exswoo: I'll go out on a limb here and estimate about 10-12 ranks in Go if we take a 90% win ratio instead of a 50% one. I'm basing this estimate on the games I've seen and played, where an even game between players 4 stones apart was almost a guaranteed win for the stronger player. Since we get about 40 ranks in Go if we take all the amateur ranks and convert the pro ranks to amateur levels, I'd say somewhere in the range of 10-12 is a good bet.

I guess the mathematical formula (if you can call it that) that I'm using is: 0.50 + (rank difference)(0.1) = probability of winning. Of course, I have absolutely no stats to back me up ;)
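The back-of-the-envelope formula above can be sketched in a few lines of Python. This is only an illustration of exswoo's linear guess, not a fitted model; the function names and the `slope` parameter are made up for this sketch, and the figure of 40 traditional ranks is the one quoted in the comment.

```python
def win_probability(rank_difference, slope=0.1):
    """Linear guess from the comment: 0.50 + 0.1 per rank of difference.

    Capped at 1.0, since the raw formula exceeds 100% beyond 5 ranks.
    """
    return min(1.0, 0.5 + slope * rank_difference)


def ranks_per_level(target, slope=0.1):
    """Traditional rank difference needed to reach the target win rate."""
    return (target - 0.5) / slope


def level_count(total_ranks, target, slope=0.1):
    """Number of levels that fit into total_ranks traditional ranks."""
    return total_ranks / ranks_per_level(target, slope)


if __name__ == "__main__":
    # 4 ranks apart, as in the comment above.
    print(win_probability(4))
    # Levels at a 90% threshold, assuming about 40 traditional ranks.
    print(level_count(40, 0.9))
    # The same guess at a 2/3 threshold.
    print(level_count(40, 2 / 3))
```

Under this guess a 90% threshold corresponds to a 4-rank gap, which gives about 10 levels across 40 ranks; a 2/3 threshold gives about 24.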


Confused: Judging by the games on KGS, I'd say that in the double-digit kyu range a difference of 5 kyu makes for very one-sided games. This assumes that the ranks correspond to actual skill, which increases very fast at this stage. At this level, big insights can still be gained quickly and produce noticeable effects.


WilliamNewman: Note that you would tend to find finer gradations of skill if you played best-of-N matches instead of single games. I thought about the levels of skill in different games a fair amount last year, since I was teaching a strong Chess player Go while he was teaching me to play better Chess. I ended up saying a single game of Go is roughly comparable to a three-game match of Chess in several ways, including this one. You might have more than a 10% chance of taking a single game of Chess from someone who's significantly stronger than you, so I've seen people say Chess has fewer levels of skill than Go. But trying to win best-out-of-three against someone who's significantly stronger in Chess is hard, and if you measure skill levels this way it seems to me that the number of skill levels is fairly comparable between the games. (Other ways that Chess feels about three times smaller than Go are the time required to play a game, and the way that Chess just feels tactically cramped -- a knife fight in a phone booth -- while a Go board usually has several different battlefields.)


I saw this analysis done elsewhere, and a figure of 66% or 75% was used. 90% is too large, as it does not allow for enough difference. Here are precise definitions:

Two players have a rank difference in skill, if the better player wins at least 2/3 of the time.

The complexity of the game is the total number of ranks, starting from an average person who has just learned the game.

Tic-tac-toe, for example, has exactly 2 ranks. You must be careful how to define "better player wins 2/3 of the time". For example, in Poker, you may require "better player is ahead 2/3 of the time after 100 hands", because no player will win 2/3 of the time on a single hand.

However, I have never seen a careful rank analysis done for a large number of games. Go would be at or near the top of the list.

zinger: using the "win 2/3 of the games" rule, I would guess around 20 ranks, along the lines of: beginner-30k-25k-20k-16k-12k-9k-6k-3k-1k-2d-4d-6d-7d-1p-5p-9p, demigod, god. As you can see, I expect larger traditional rank gaps with weaker players, and smaller ones with stronger players, due to differences in consistency. Also I think there are at least two ranks nobody has achieved yet.

Tas: The top level should be the best humans, not god. I think there are more likely ten or twenty ranks between professionals and god than two, even though it may be true that the gap is only 3 stones. Actually, since the standard deviation of playing strength approaches zero as strength rises, the number of ranks might be nearly infinite if you include "god".

OneWeirdDude: Zinger, I'm not sure you completely know what you're talking about. God, being perfect, beats everyone else without a handicap. The rest might be accurate, though.

Matt Noonan: There was a conversation about this on rec.games.go about a year back. I liked this idea of rank a lot until somebody posted this example; now I am not so sure:

Define a new game n-chess to be a series of n chess games, with the winner being the person who wins the majority of these games. By making n large we can seemingly give n-chess as many ranks as we want since the n parameter magnifies differences in skill.

So do we really want to say that 3-chess is roughly equivalent to Go in complexity? Intuitively, I'd want to say that 3-chess has the same complexity as chess since the games don't interact. Help us, O gods of CGT!
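To see how the n parameter magnifies a skill difference, here is a minimal sketch. It assumes the games are independent, that the stronger player has the same win probability p in every game, and that there are no draws (which is not true of real chess); the function name is made up for this illustration.

```python
from math import comb


def match_win_probability(p, n):
    """Probability of winning a majority of n independent games.

    p is the fixed single-game win probability; n must be odd so that
    a majority always exists. Draws are ignored (an assumption).
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that a majority always exists")
    needed = n // 2 + 1
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )


if __name__ == "__main__":
    # A 2/3 single-game favourite in matches of increasing length.
    for n in (1, 3, 5, 11, 21):
        print(n, match_win_probability(2 / 3, n))
```

For any p above 1/2 the match probability grows with n, so a fixed win threshold is reached by a smaller single-game edge, and more ranks fit between the same two endpoints.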

Alex Weldon: The reason for this seeming paradox is the notion that the number of ranks is proportional (only) to complexity. Obviously, it's related, but complexity is not the only factor. Things like that n factor, randomness (like in poker or backgammon), etc. are also variables in the function that determines number of ranks.


The problem is that there's no time restriction in your example. "Play 1 hand of poker" is bad for determining rank, because the time interval is too small. Play best-of-n is bad for determining rank if n is too large, because small differences are exaggerated. So, here's the improved definition:

Two players have a rank difference in skill, if the better player wins at least 2/3 of the time in a 4-hour contest. If the game naturally lasts less than 4 hours, play "best-of-n" so that the expected length of the contest is as close as possible to 4 hours.
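As a rough sketch of how n might be chosen under this definition: the snippet below picks the odd n whose total playing time is closest to the time budget. It assumes every game takes the same time and that all n games are played, whereas a real best-of-n match can end early, so the true expected length is somewhat shorter. The function and parameter names are hypothetical.

```python
def best_of_n_for_budget(game_minutes, budget_minutes=240.0):
    """Pick the odd n for which n * game_minutes is closest to the budget.

    Ties are resolved in favour of the smaller n. Games that already
    take at least the whole budget are played as a single game.
    """
    if game_minutes <= 0:
        raise ValueError("game_minutes must be positive")
    if game_minutes >= budget_minutes:
        return 1
    # The best odd n is never more than one above the exact ratio.
    upper = int(budget_minutes // game_minutes) + 2
    candidates = range(1, upper + 1, 2)
    return min(candidates, key=lambda n: abs(n * game_minutes - budget_minutes))


if __name__ == "__main__":
    # Hypothetical game lengths in minutes, against a 4-hour budget.
    for minutes in (20, 48, 80, 300):
        print(minutes, best_of_n_for_budget(minutes))
```

Restricting n to odd values keeps "best-of-n" well defined; the 240-minute default is just the "4 hours" parameter from the definition.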

Notice that "4 hours" is a parameter in this definition. As long as the time interval is the same for all games being compared, it is a fair comparison.

So, it's somewhat unfair to directly compare a single game of chess to a single game of go, because the game of go tends to last longer. If you set the time control so that each game had the same length, then a direct comparison would be possible. Also, chess has a more significant advantage to the first move.

(Also, this discussion is completely unrelated to combinatorial game theory.)

mgoetze: This would seem to suggest that lightning go is a much more complex game than regular go. Let's say you have a 1d EGF player and a 3d EGF player. If you gave them two hours each to play a regular game of go, the 1d would have about a 25% chance of winning (based on these statistics). But if you had them play best-of-11 lightning matches (10 minutes thinking time each), I am sure the 1d's chance of success would be significantly less than 1%, unless the 1d happened to be a lot more experienced at lightning go than the 3d (and even then I would only give him a 5% chance at most). Since I consider this result absurd, I'm afraid I must reject your method. :)

kochi: Are you so sure about this? I think restricted time limits magnify differences. When playing games with a 3-month time limit on Dragon Go Server, it's really hard for W to give a proper handicap for ranks derived from faster game settings.


This leads to another interesting point in the discussion of rank: "How sensitive is the game to time adjustments?" A game that is very sensitive to timing changes probably has a higher complexity than a game that isn't sensitive to timing changes.


Another potential flaw is that you can have loops. Alice beats Bob, Bob beats Carl, and Carl beats Alice. The solution is that when you define "rank of a game", you ignore loops. You define the "rank of a game" to be the largest chain where the nth player beats all lower-ranked players. Or, divide the pool of players into groups, where each player in a group can beat all players in the lower groups. Within a group, any outcome is possible.
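The "largest chain" idea can be sketched as a small brute-force search. Here "beats" is assumed to mean "wins at least the agreed fraction of the time" and is supplied as a set of (winner, loser) pairs; the function name and the sample players are illustrative only. The search is exponential in the worst case, so it is only suitable for small pools.

```python
def longest_chain(players, beats):
    """Return the longest list in which each player beats all earlier ones.

    beats is a set of (winner, loser) pairs. Loops such as
    Alice > Bob > Carl > Alice cannot all appear in one chain.
    """
    best = []

    def extend(chain, candidates):
        nonlocal best
        if len(chain) > len(best):
            best = list(chain)
        for p in candidates:
            # Keep only candidates who also beat p; they already beat
            # everyone in the chain, so the chain property is preserved.
            stronger = [q for q in candidates if (q, p) in beats]
            chain.append(p)
            extend(chain, stronger)
            chain.pop()

    extend([], list(players))
    return best


if __name__ == "__main__":
    players = ["Alice", "Bob", "Carl", "Dana"]
    # The loop from the text, plus a hypothetical Dana who beats everyone.
    beats = {
        ("Alice", "Bob"), ("Bob", "Carl"), ("Carl", "Alice"),
        ("Dana", "Alice"), ("Dana", "Bob"), ("Dana", "Carl"),
    }
    print(longest_chain(players, beats))
```

With only the three-player loop, no chain is longer than two players; adding a player who beats all three extends the longest chain to three.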

Another flaw again is that you use engineering students as some sort of benchmark.


This is a copy of the living page "Range Of Skill" at Sensei's Library.
(OC) 2012 the Authors, published under the OpenContent License V1.0.