KGS Rating Math

Path: Rank   · Prev: RankAndRating   · Next: RationalesBehindRatingSystems
    Keywords: Theory, Online Go

Changes for KGS 3.0

For KGS 3.0, the KGS rating system reverted to the old formula (meaning the minimum win probability introduced in 2.6.8 is back to 0), while keeping the 2.6.8 constants.

  • k varies depending on your rank:
    • k=0.85 for 30k-5k
    • k=???? for 4k-1d
    • k=1.30 for 2d+
  • The half life of games varies depending on your rank:
    • 15 days for 30k-15k
    • ?? days for 14k-1k
    • 45 days for 1d+

Introduction

There is some documentation about KGS ranks on these pages:

Basically KGS assumes that the expected win rate of two players is:

P(A wins) = 1 / ( 1 + exp(k*(RankB-RankA)) )

where k varies from 0.85 to 1.3, depending on the ratings of the players, and RankB-RankA is adjusted by 1 for every handicap stone and by a small amount for each point of komi.
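As a minimal sketch, the formula above can be computed directly. This is my own illustration, not KGS code; the handicap/komi adjustment is left out, and the k values are the published constants:

```python
import math

def p_win(rank_a, rank_b, k=0.85):
    """Expected probability that A beats B in an even game.

    Ranks are on the KGS scale (e.g. 2.5 = an average 2d); k is the
    steepness constant, which KGS varies with the players' ranks.
    """
    return 1.0 / (1.0 + math.exp(k * (rank_b - rank_a)))

# One rank of difference:
p_win(3.5, 2.5, k=0.85)   # ~0.70 for kyu-level k
p_win(3.5, 2.5, k=1.30)   # ~0.79 for 2d+
```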

Using this formula, KGS constantly recalculates the most likely rating for every player based on the games they played. Old games are decreased in weight exponentially with a half life ranging from 45 days to 15 days (weak players have short halflifes so their old results don't affect them as much). Games older than 180 days are not considered.

Example expected win rates

          expected win rate
  rank
  diff    k=0.85 (30k-5k)    k=1.30 (2d+)
   0.0         50%               50%
   0.5         60%               66%
   1.0         70%               79%
   1.5         78%               88%
   2.0         85%               93%
   2.5         89%               96%

Handicapping ensures that you win around 50% of your games if your rank is accurate. However, there is considerable room within each rank: some players can be quite a bit stronger than others of the same rank. Also, the handicap system actually favors white by 0.5 stones. At the extreme end, suppose you are a 2.99 dan (on the verge of 3d) and play a 1.00 dan (a very weak 1d). KGS will suggest an H1 game (komi=0.5), but this still leaves an effective rank difference of 2.99-1.00-0.5=1.49, so white is expected to win 88% of the time (78% for 5k and below). This is an extreme example, but even if you play all your games against players of your own rank, you will typically need to prove that you are 0.5 stones stronger than them to promote. So for 2d+ you need to win 66% of the time, and 60% for 5k and below.

Tables on rank response

Here is some math showing how your rank on KGS would react to you being a 2.5 (average 2d) and going from a steady 50% win rate to some other win rate.

Some assumptions / simplifications implied:

  • This is based on the information at [ext] http://www.gokgs.com/en_US/help/math.html, and a few things wms told me directly:
    • The weight of a game on KGS decreases exponentially over time, with a half life of 45 days. For these calculations I dropped all games older than 180 days (is this right?)
    • KGS uses k=0.8, giving a 69% win rate in an even game between players 1 rank apart.
  • These are all assumed to be even games against a 2.5 (an average 2d). It makes a big difference if you play a 2.0 (weak 2d) or a 2.9 (strong 2d).
    • A game between an average 2d and average 3d at 0.5 komi does not yield a 50% win rate. It's actually 60% for the 3d since 0.5 komi is only a half stone handicap. Generally in any handicap game, white should win 60% of the time. Again this example just assumes all even games for easier math.
  • Does not take into account the movements of other players, which have an impact on you (i.e. "rank drift").
  • Does not take into account the "confidence" factor, which can cause games with players of uncertain ranks to have less weight.
  • Assumes you play rated games at a constant rate. Note that with this assumption, it does not matter what that rate actually is, except that it is great enough to make KGS' "confidence" factor not have any impact.

Table for suddenly becoming 1 stone stronger:

Suppose you played for 6 months as a 2.5 (an average 2d), and then suddenly became a 3.5 (an average 3d) in strength and therefore started winning 69% of your games. In 45 days KGS will rate you at 2.5+0.51=3.01 (a weak 3d).

    --------- Number of days played at new strength
   |       -- Increase in your rating
   |      |
   0    0.00
  15    0.21
  30    0.38
  45    0.51
  60    0.62
  75    0.71
  90    0.78
 105    0.84
 120    0.89
 135    0.93
 150    0.96
 165    0.98
 180    1.00
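Assuming games accrue continuously at a constant rate, with weights decaying as 2^(-age/45 days) and games older than 180 days dropped, the table above can be reproduced in closed form: the estimated win probability r is the weight-averaged win ratio, and the rating increase is ln(r/(1-r))/k. A sketch under those assumptions (the exact cutoff handling is my guess):

```python
import math

def rating_increase(t, k=0.8, half_life=45.0, horizon=180.0,
                    q_old=0.5, q_new=0.69):
    """Rating gain t days after a sudden strength leap.

    Assumes a continuous stream of even games whose weight decays as
    2^(-age/half_life), with games older than `horizon` days dropped.
    """
    def weight_mass(a0, a1):
        # integral of 2^(-a/half_life) over ages a0..a1, in closed form
        c = half_life / math.log(2)
        return c * (2 ** (-a0 / half_life) - 2 ** (-a1 / half_life))

    t = min(t, horizon)
    w_new = weight_mass(0, t)          # games played since the leap
    w_old = weight_mass(t, horizon)    # games from before the leap
    r = (q_new * w_new + q_old * w_old) / (w_new + w_old)
    return math.log(r / (1 - r)) / k
```

For t = 45 this gives about 0.51, and for t = 180 about 1.00, matching the table.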

How many wins in a row to promote?

Assume you played 1 game a day as a 2.5 (average 2d) for 180 days. Then you get inspired and play a whole bunch of even games in one day and win them all. How will those games affect your rating?

   ---------- Number of games won in a row
  |        -- Increase in your rating
  |       |
   0    0.00
   1    0.04
   2    0.08
   3    0.12
   4    0.15
   5    0.19
   6    0.22
   7    0.26
   8    0.29
   9    0.32
  10    0.36
  11    0.39
  12    0.42
  13    0.44
  14    0.47
  15    0.50

So it will take 15 wins in a row to promote.
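The streak table can be reproduced with the same weighting idea in discrete form: 180 daily games at a 50% win ratio, then n wins all at age 0. This is my own sketch; whether the daily games are aged 1..180 or 0..179 days is an assumption:

```python
import math

def streak_increase(n, k=0.8, half_life=45.0, days=180):
    """Rating gain from n same-day wins after 180 days of 50% results."""
    # total weight of the 180 daily games, assumed aged 1..180 days
    s = sum(2 ** (-d / half_life) for d in range(1, days + 1))
    r = (n + 0.5 * s) / (n + s)   # weighted win ratio including the streak
    return math.log(r / (1 - r)) / k
```

With these assumptions, streak_increase(15) comes out at about 0.50, matching the table.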


Effective Ratings

Where this page speaks about "ratings", it doesn't refer to standard ratings of players, but to effective ratings. (This also applies to strengths, which are defined in the next section.) Now, what does that "effective" stand for?

Let's say, your opponent is a strong 3k (-3.1 AGA or 1840 EGF). If you play him with an advantage of 0.5 stones (that is, you take black) and 5.5 komi (this default komi is debatable), his effective rating is also -3.1 AGA or 1840 EGF.

But if you play with a different handicap and/or komi, your opponent's standard rating must be adjusted in order to represent the changed playing power s/he has in that game. For each additional handicap stone that you give to the opponent, his/her effective rating gets one rank stronger. If it's you who gets the handicap, his/her effective rating decreases by one rank per stone. The same applies to komi, insofar as it differs from 5.5 to white (again, this number is debatable): your opponent's effective rating increases/decreases by 1 rank per 11 komi points you give/take.

So, if you play white, giving 4 stones (which gives black an advantage of 3.5 stones over white) while still taking 5.5 komi, his/her effective rating in that game is (5.5 - 5.5)/11 + (3.5 - 0.5) = 3 grades better than his/her standard rating. The mentioned "strong 3k" player will effectively be a strong 1d (1.9 AGA or 2140 EGF) then to you.

If you play white and give your opponent 4 stones, but without komi this time, Black's effective rating is (5.5 - 0.0)/11 + (3.5 - 0.5) = 3.5 grades stronger than his/her standard rating. S/he will effectively be about an average 2d then (2.4 AGA or 2190 EGF).
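The adjustment rule can be sketched as a small function. This is my own formalization of the text above; treating handicap_stones = 1 as "black without extra stones" and 5.5 as the reference komi are assumptions:

```python
def effective_shift(handicap_stones, komi):
    """Ranks by which the handicap receiver's effective rating rises.

    handicap_stones: stones the receiver gets (1 = just taking black);
    komi: points the giver (white) receives. Reference: H1 with komi 5.5.
    """
    advantage = max(handicap_stones, 1) - 0.5   # stones of advantage over white
    return (5.5 - komi) / 11 + (advantage - 0.5)

effective_shift(4, 5.5)   # 3.0  (the "strong 3k becomes strong 1d" example)
effective_shift(4, 0.0)   # 3.5  (same handicap, no komi)
```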

Usually it is considered best to choose handicap and komi so that your opponent is effectively as strong as you are. However, it's possible to use different, even weirdly different, settings. E.g., you can give a stronger player handicap. The tables and formulas on this page still apply as long as you remember to use the effective ratings (or effective strengths, respectively).


The Math Behind, Made Easy

blubb: I like the idea of creating a "KGS Rating Math" page, and I am quite sure it can help the rating system be better understood by people who want to go beyond [ext] http://www.gokgs.com/help/ranks.html, but feel deterred by the somewhat cryptic explanation at [ext] http://www.gokgs.com/help/math.html. (Well, hehe, at least I hope we can supplement the usefulness of the latter. :)

The most basic way is to present tables of values like those above, even though they can only show some exemplary cases. Another approach is to simplify the math formulas as far as possible, which is what I had in mind when I thought about creating a page like this in the past. Now that yoyoma has taken the first step, I will contribute, too.

Strength of Players

First, one can define a "(linearized) strength" s of players:

             s :=  e ^(k*r) ,

where r is the common (logarithmized) rating. At KGS, k = 0.8 gives

             s  =  2.22 ^r .

Using this, the winning probability of player A against player B can be expressed as

                        s (A)
       P (A,B)  =  --------------- .
                    s (A) + s (B)

This equation should be easily understandable to many people. It says, roughly spoken:

If A is twice as strong as B, s/he is expected to win twice as often against B as B does against A.

(Important note: This refers to pure strengths. In non-even games, effective strengths have to be used here, as calculated from effective ratings.)
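To see that this is the same as the logistic formula from the introduction, note that s(A)/(s(A)+s(B)) = 1/(1 + e^(k*(rB-rA))). A quick numeric check of that identity, assuming k = 0.8:

```python
import math

def strength(r, k=0.8):
    """Linearized strength of a player with rating r."""
    return math.exp(k * r)

def p_win_strengths(ra, rb, k=0.8):
    """Win probability of A against B, expressed via strengths."""
    sa, sb = strength(ra, k), strength(rb, k)
    return sa / (sa + sb)

# Agrees with the logistic form 1 / (1 + exp(k*(rb - ra))):
p_win_strengths(2.5, 1.5)   # ~0.69 for a one-rank gap at k = 0.8
```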

Rating Calculation

When looking at the rating calculation procedure (which is a so-called maximum-likelihood estimation), it turns out that it can be transformed into a rather simple formula as well.

Let player A's record consist of N games with various opponents (B1, ..., Bx, ..., BN). (It doesn't matter whether the opponents are all different or whether, e.g., B4, B5 and B11 are the same person. They are simply numbered according to the order of the games.)

The players B1 to BN have the (linearized) strengths s(B1) to s(BN), respectively.

Let qx indicate the result of game no. x as follows (ties aren't taken into account here):

           qx  :=  1    , if A won the game,
                   0    , if A lost the game.

Similarly, we write rx for the winning probability of player A in game no. x:

                                     s (A)
           rx  :=  P (A,Bx)  =  ---------------- .
                                 s (A) + s (Bx)

Now, what does it mean to calculate the strength of player A? Because rx depends on A's strength, we can regard it as a function of s(A); in other words, rx = rx (s(A)).

The correct strength value then is the one which ensures that the equation

      avg (rx)  =  avg (qx)

is true.

That is:

The correct strength makes the average winning probability value equal to the overall winning ratio.

Game Result Weighting

In fact, the above paragraph doesn't cover how ratings are really calculated. If B8, for instance, is an opponent who has not been playing for a long time, it is rather uncertain what the result of game no. 8 actually tells us about A's strength, because B8's strength itself is very uncertain.

That's why so-called "weights" wx are assigned to the games. They are intended to weigh each particular game result according to its "meaningfulness". A weight is a number between 0 (games which are not evaluable at all) and 1 (the most evaluable games).

So, including weights, the correct strength value ensures that

 avg (wx * rx)  =  avg (wx * qx)

is true.

Taking wx, rx and qx as vectors, this can be written as

                   _    _   _ 
             0  =  w * (r - q) .

Currently, a game's weight depends on

  • the time since the game was finished
  • the opponent's current rank confidence.
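The weighted condition avg(wx * rx) = avg(wx * qx) can be solved numerically: the left side grows monotonically with s(A), so a simple bisection finds the strength at which the weighted average win probability equals the weighted win ratio. A sketch under the simplifications of this page (opponents' strengths taken as fixed and known, no confidence modelling):

```python
import math

def estimate_strength(games, lo=1e-6, hi=1e6, iters=200):
    """Find s(A) such that sum(w * (r - q)) = 0.

    games: list of (weight, opponent_strength, result) tuples,
    where result q is 1 for a win of A and 0 for a loss.
    """
    def imbalance(s):
        # weighted expected wins minus weighted actual wins
        return sum(w * (s / (s + sb) - q) for w, sb, q in games)

    for _ in range(iters):
        mid = math.sqrt(lo * hi)       # bisect on a log scale
        if imbalance(mid) < 0:
            lo = mid                   # expected wins fall short: A is stronger
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Two wins and one loss against a strength-1.0 opponent:
estimate_strength([(1.0, 1.0, 1), (1.0, 1.0, 1), (1.0, 1.0, 0)])   # ~2.0
```

The result ~2.0 illustrates the "twice as strong wins twice as often" reading: a 2-1 record against an equal-weight opponent implies double their strength.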



By the way, whilst the common kyu/dan ranks get harder to increase the better you are, the linearized strength grows more in proportion to the effort you spend on improving. However, there's a big drawback: it's irksomely difficult to tell the appropriate number of handicap stones between two players from their linearized strengths alone.



General leap response formulas

Suppose

  • a constant playing rate
  • a certain, constant mix of opponent levels.

Before the leap,

  • your performance was constant over a long time
  • your (weighted) win ratio was stable at   q0
  • your strength was in its equilibrium at   s(0).

But suddenly,

  • your (weighted) win ratio switches to   q1
  • your strength starts moving like   s(t).

Please note that the following formulas are continuous approximations; they fit better the higher the (constant) playing rate is.


Unchanged opponent strength

If the mix of opponent strengths remains the same as before, and you manage to keep the new win ratio, your recognized strength gradually changes according to


           s (t)    1 + qW * E(t)
           ----  =  -------------
           s (0)    1 + qL * E(t)
           where
                t                                   time passed since your sudden improvement
               qW    =   q1   /   q0               quotient of win ratios after and before the leap
               qL    = (1-q1) / (1-q0)             quotient of loss ratios after and before the leap
               E(t)  = 2^(t/t_half) - 1            aging factor


Example:

Calculating your rank development with k = 1.3, a half life of 30 days, an old win ratio of q0 = 0.5 and a new win ratio of q1 = 1.0, we get

 qW    =    1.0   /  0.5    =  2.0
 qL    =    0.0   /  0.5    =  0.0

and thereby

                          1 + 2.0 * E(t)
 r(t) - r(0) =  (1/k) ln( -------------- )
                          1 + 0.0 * E(t)
             =  ln(1 + 2.0 * (2^(t/(30d)) - 1)) / 1.3

which simplifies to

 r(t) - r(0) =  ln(2^(1 + t/(30d)) - 1) / 1.3



History of major rating algorithm changes

(We could fill in dates on these and a summary of the changes)

  • pre 2.6.4
  • 2.6.4
  • 3.0


Related pages

KGSRatingDiscussion RankWorldwideComparison


This is a copy of the living page "KGS Rating Math" at Sensei's Library.
(OC) 2007 the Authors, published under the OpenContent License V1.0.