KGSRating Math/ Discussion

Path: <= Rank =>
Sub-page of KGSRatingMath
  • [001] Chris Hayashida: I'm not sure that I read that the same way. The note in the change log made me think that k was changed after analyzing handicap games, but I don't think that it is variable now, only that it's been changed from 0.8.
    • blubb: Right, I have adapted the wording a little.
  • I agree with Chris. Furthermore, I think much of the discussion below is incorrect. The math page says "Now, we can treat RankA as a variable, come up with a graph of probability of all game results (prob) vs. RankA, and solve to find the rank for A that maximizes prob. This will be the rank assigned to A." This suggests that RankA is a float, what is usually called "rating". Also you can view a "rank graph" online in "User/View Info" which clearly represents a float. There are other indications that RankA is an integer, but they seem unconvincing to me.
    • blubb: Admittedly, the distinction between "rank" (integer) and "rating" (real) is homemade, in order to provide a clearer language. If you don't like to deal with integer ranks, just think about what we call "rating" here.

blubb: yoyoma, I have put some structure in the page. I hope that's ok to you. Probably the table headings aren't optimally worded for being headlines, but I didn't want to touch your text.

yoyoma: Ah great, I thought about doing the same thing! :) Well I was dreading updating this with exponential decays when I suddenly realized you can make all these calculations using some simple calculus. Next to add: a section about the myth that if you win 55% of your even games you will promote. NOT TRUE!!

BTW wms plans to add a "trend" feature to the rating system. This would attempt to see that someone is starting to win games at a high rate, and try to promote them faster. I didn't quite catch how he said it would work.

Mef: Just a semantics issue - but wouldn't things like 2.9 and 2.5 be ratings and not ranks? I thought ranks were always the solid number, 2d. This may just be picking nits, but I like things to be clear.

yoyoma: Yes. I made a little effort to be correct in that regard, but it wasn't my top concern. If you see any rank/rating mistakes please feel free to correct them. Probably just using "rating" everywhere would be easiest. I guess maybe the term "promote" is a little ambiguous there too because it implies rank... But maybe we can let that slide. :)

Mef: Hehe, ok if I see a mistake I'll try to correct it. Now I'm curious about that winning percentage thing. Would it be different to win 4 games against 2.5's than to win 2 games against 2.9's and 2 games against 2.1's? Because if this is so, then it may still be possible to promote with a lower percentage. Perhaps you just need a >50% winning percentage at your own rating.

yoyoma: Yes it would be different, I haven't figured out the math for playing games against people with different ratings. I will venture a guess that playing half your games with 2.1's and half with 2.9's will still require a win percentage close to 59%.

yoyoma: Ok I did the math, if you play half your games with 2.1's, and half with 2.9's, you need to win 59.63% to make 3d. This is slightly less than 59.87% with all 2.5's. :-))
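The two percentages above can be reproduced with a short sketch. It assumes the win-probability formula quoted further down this page, P = 1 / (1 + e^(k*(RankB - RankA))), with k = 0.8 (the older constant mentioned at the top of the page), equal weight for every game, and the idea that the maximum-likelihood rating sits where actual wins equal expected wins. The function names are made up for illustration and are not part of KGS.

```python
import math

K = 0.8  # assumption: the older KGS constant mentioned at the top of this page


def win_prob(rank_a, rank_b, k=K):
    """Probability that A beats B in an even game under the logistic model."""
    return 1.0 / (1.0 + math.exp(k * (rank_b - rank_a)))


def required_win_rate(target, opponents, k=K):
    """Win rate a player must sustain so the best-fit rating equals `target`.

    Assumes games are spread evenly over `opponents` and weighted equally.
    """
    return sum(win_prob(target, o, k) for o in opponents) / len(opponents)


all_average = required_win_rate(3.0, [2.5])    # every opponent is a 2.5
mixed = required_win_rate(3.0, [2.1, 2.9])     # half 2.1's, half 2.9's

print(f"all 2.5's:         {all_average:.2%}")
print(f"half 2.1/half 2.9: {mixed:.2%}")
```

Under these assumptions the mixed schedule needs a slightly lower win rate because the logistic curve is concave above 50%, so the average of the two probabilities falls a little below the probability at the average opponent.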

Mef: Well, now that I know this trick, I'll be 3d in no time (=

blubb:

About "promoting": Maybe it would be helpful to specify (explicitly):

  • Winning suddenly 58% of your games, without handicap, with 5.5 (or 6.5?) komi, against an opponent with the same rating, increases your rating steadily, and after a while you will get promoted to the next rank.

yoyoma: Right, but typically you would expect something more like Mef's example above. If you're 2d and wanting to play other 2d's, you will mostly just take whoever comes and play them. So they would range from 2.00 to 2.99 with close to uniform random distribution. Again I don't know what impact that has, but I guess it's still close to the 59% requirement.

  • Winning suddenly 58% of your games without handicap, with 5.5 (or 6.5?) komi, against an average opponent of your rank, increases your rating up to slightly below the next rank (as is shown in the KGS rating graph), but you won't promote to the next rank.

About the win ratio: The percentage of won games alone doesn't tell so much. If you, for an (admittedly, highly artificial) example, play 30% of your games against an effectively 5 ranks stronger player and win all of them, while losing the remaining 70% against effectively even opponents, you're supposed to get promoted rather quickly.

yoyoma: Yes, if your current rating is 2d and you handicap yourself as 7d and win 50%, your rating will go up. :-)

blubb: (-:

Concerning the introductory phrases: I can imagine it would be useful to apply the effective ratings paragraph to your tables, too. (Although it's not worded to be an introduction, yet.) It doesn't affect so much of your text - mainly the points 3 (which should be updated, btw), 4 and possibly 5. I am not sure whether you would like that, because you might want to keep it as straightforward as possible. However, I think it can be done in a non-confusing way. For now, I'll put it up in the middle.

yoyoma: I replied to a few examples people put; I'm sure that Mef and blubb have a firm grasp on the math so this is more like violent agreement than anything else. :-) These "yes but what if..." cases are excluded from the tables in the simplifying assumptions section. I think we could add these examples to the main page by using footnotes to the assumptions section, which show the impact of that assumption (ie the one that assumes all games are against 2.5's -- average 2d's).

Mef: Indeed, I completely agree with your numbers so far (can't say I've gone through as much work as you have and have completely double checked all of them, but I didn't see anything that looked wrong). In fact, I think that the numbers seem pretty logical, since one would expect a 3d to be able to beat the average 2d in at least 60% of their games; it would seem rather odd if only 51% were the necessary demonstration of skill for promotion.

blubb: I think the tables could gain additional value if even a first-time reader understood that the main part of assumption no. 4 is playing the games against average opponents of his/her rank, while the "average 2d" assumption is just one example of it.

etrynus: Here is a question - say you have been playing with a -4.5 rank, and say that many months ago you played a game with an escaper at a handicapped rank of -10 (since you weren't as strong back then). Then, if the escaper escapes enough games so that you get credit for that win, can that somehow decrease your current rank?

yoyoma: Generally winning a game will not decrease your rank. One exception is if you've lost all your ranked games and then win your first ranked game. For example, you lost your first game to an 11k, KGS makes you "12k?". Then you win your second game against a 29k. KGS makes you "20k?" now. Based on losing to 11k and beating a 29k this is very logical, but still your rank went down after a win.
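A rough sketch of why "20k?" is the logical outcome in that example: with exactly one loss to an 11k and one win against a 29k, a plain maximum-likelihood fit under the logistic model lands on the midpoint. The sketch assumes k = 0.85 (the kyu-range value quoted in the time-settings section below), equal weights, and no other corrections; the real KGS procedure is not fully published, and how it arrives at "12k?" after a single loss is not modelled here. Kyu ranks are written as negative numbers so that a larger value means a stronger player.

```python
import math

K = 0.85  # assumption: kyu-range constant quoted elsewhere on this page


def log_likelihood(rank, results, k=K):
    """Sum of log-probabilities of the observed results at a candidate rank.

    `results` is a list of (opponent_rank, won) pairs.
    """
    total = 0.0
    for opp, won in results:
        p_win = 1.0 / (1.0 + math.exp(k * (opp - rank)))
        total += math.log(p_win if won else 1.0 - p_win)
    return total


# 11k and 29k expressed as -11 and -29 (hypothetical encoding for this sketch).
games = [(-11.0, False), (-29.0, True)]

# Coarse grid search from 30k to 5k in steps of 0.01 rank.
candidates = [i / 100.0 for i in range(-3000, -499)]
best = max(candidates, key=lambda r: log_likelihood(r, games))

print(f"best-fit rank: {-best:.2f}k")
```

Because the two games are mirror images of each other, the likelihood is symmetric around the midpoint of the two opponents, so the result does not depend on the exact value of k.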

Calvin: Maybe I'm missing it in the details, but where's the algorithm that gets that '?' to go away? It seems like I (user "ticktock") should have played enough rated games by now, but still I have the '?'. Could I be damaging my chances of getting a stable rating by accepting games from players with ranks that are too different?

yoyoma: The "?" goes away when the confidence in your rating reaches a certain minimum, but the exact algorithm for this is not published. Yes, playing games that are not handicapped correctly will not increase the confidence of your rating as quickly as correctly handicapped games. But probably the main issue is that 4 of your 5 rated games were against ?-mark players. If your opponents' ratings are not high confidence then of course yours cannot be either. You also had some rated games back in May and earlier but those are not weighted as much since they are so old.

hammarbach: I am not a mathematics whiz, but I am a semi-regular player on KGS, and that is the problem I have. The 15 day half life sucks. I'm a father and husband and a guy with a full time job; I don't have loads of online time. Sometimes I get to play five times a week. Other times I get to play once every week to two weeks. Needless to say, I have been having trouble reaching the "29 kyu" escape velocity. a) Why have a half life at all? Or b) Why not set it to something higher, like the 45 days the dan level players enjoy?


blubb: On 2006-09-22, 218.35.194.19 wrote the following on the main page, within the section that is signed by me. Since in my view, some points require clarification anyway, I've moved it here for now:

How weighting affects your rating.
If we assume the same number of games are played each day, then each day is considered to be one unit of spam. Saying that units of spam have a half life means that each unit of spam is worth a certain number of units of importance. It is therefore possible to determine how important any particular unit of spam is, or was, considering the sandwich.
Half life means games played in the first month will have approximately 100% weight, games in the second month approximately 50% weight, etc.
blubb: Where is the herein assumed half life time of 30 days stated?
     30 x 100   = 3000,  * 100/5906.25 = 50.8%, / 30 = 1.7% per game
     30 x 50    = 1500,  * 100/5906.25 = 25.4%, / 30 = 0.85% per game
     30 x 25    = 750,   * 100/5906.25 = 12.7%, / 30 = 0.42% per game
     30 x 12.5  = 375,   * 100/5906.25 = 6.4%
     30 x 6.25  = 187.5, * 100/5906.25 = 3.2%
     30 x 3.125 = 93.75, * 100/5906.25 = 1.6%
An interesting thing to note is that each game played 3 months ago affects your rating by less than half of one percent. This might imply that six months is too long a period to consider the results of games, and that games older than three months should not be considered - the total weight of all games played in the three to six months ago range is about 11%, while the weight of games in the one month range is more than 50%.
blubb: Such a conclusion looks questionable to me. First off, reality often doesn't even roughly meet the assumption that "the same number of games are played each day". Vacation, exams, family, business ... - a player's game frequency can be affected by various circumstances, inducing arbitrary fluctuations. In idle times, the total relative weight of older games can be much higher.
Moreover, people take breaks. Consider e.g. someone who plays more or less continuously for, say, 2 years and then stops for 2 months. Your suggestion would lead to no more than a single month's games left in the record, with little weight altogether compared to new games. Such a player's rank would depend inappropriately heavily on new games. If the break lasted 3 months, the system's knowledge about that player would even be lost completely.
Of course, more exact math is needed if the half life actually causes games to decay day by day. This is just an approximation.
blubb: If anything at all, I'd rather ask for a longer span of consideration - theoretically, it should be infinite (still with a sensibly chosen exponential weight decay, of course). The six months limit is just a reasonable cutoff, to confine the size of the database to deal with.
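For anyone who wants to check the weight table in this thread, here is a small sketch under the same simplifications the anonymous contributor made: a step-wise halving once per month (rather than a true day-by-day decay), 30 games per month, and a six-month window. The function name and parameters are illustrative only.

```python
def month_weights(months=6, games_per_month=30, halving=0.5):
    """Return (total raw weight, list of per-month shares in percent).

    Month 0 is the most recent month; each older month counts half as much.
    """
    raw = [games_per_month * 100 * halving ** m for m in range(months)]
    total = sum(raw)
    shares = [100 * w / total for w in raw]
    return total, shares


total, shares = month_weights()
for month, share in enumerate(shares, start=1):
    print(f"month {month}: {share:5.1f}% of total, {share / 30:.2f}% per game")
print(f"months 4-6 combined: {sum(shares[3:]):.1f}%")
```

Changing `games_per_month` to a list of uneven counts would be the natural next step for testing the objection about irregular playing schedules, but that is beyond what the table assumes.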

Time settings

Question: Are there any plans to introduce a second K-factor 'K2' which adapts the K-factor in the 'ELO' formula P{a wins} = 1 / (1 + e^(k*(Rank{b} - Rank{a}))) according to the length of the time settings? (If that's already the case, then forget about most of the entry below.)

IMO, analogous to the rank (where k varies):
k=0.85 for 30k-5k
k=???? for 4k-1d
k=1.30 for 2d+

it should also vary with the length of the game,
i.e. k(total) = k(rank)*k(time),
whereby - exact modalities to be discussed - shorter games get smaller k(time) values.
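To make the proposal concrete, here is a sketch of how a combined factor would change the predicted result. The value k(rank) = 0.85 comes from the list above; the k(time) value of 0.5 for blitz is purely hypothetical, chosen only to show the direction of the effect, and nothing here reflects how KGS actually behaves.

```python
import math


def win_prob(rank_a, rank_b, k_rank, k_time=1.0):
    """Logistic win probability with the proposed k(total) = k(rank) * k(time)."""
    k_total = k_rank * k_time
    return 1.0 / (1.0 + math.exp(k_total * (rank_b - rank_a)))


K_RANK_KYU = 0.85    # from the list above (30k-5k)
BLITZ_K_TIME = 0.5   # hypothetical value, not proposed by anyone in this thread

# A is one full rank stronger than B.
normal = win_prob(1.0, 0.0, K_RANK_KYU)
blitz = win_prob(1.0, 0.0, K_RANK_KYU, BLITZ_K_TIME)

print(f"normal time settings: {normal:.1%}")
print(f"blitz (k_time = 0.5): {blitz:.1%}")
```

A smaller k(time) pulls the expected result towards 50%, so an upset in a fast game would count as less surprising and would move the ratings less than the same upset in a slow game.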

What is my incentive? I accepted ultra-short games (0 minutes, 10s per move) and lost on time several times (also by netlag in the beginning, when I didn't know what it was, and of course also by some bad moves due to lack of time and/or bad time management).

I know that I shouldn't have played with these crazy time settings in the first place.
This may sound like the whining of a bad loser to you, and someone might want to shrug it off with a comment like 'just play faster and/or better'.
However, the term 'whining' is not appropriate here - as I want to solve the problem.
Also I accept my losses and the fact that I am not the fastest player or the one with the best time management.

I play very seldom AND I want to use my standard account, i.e. not create new ones.
AND I want to play even games which are intellectually challenging for me.

However, now I got a kyu-account and even with the tilde ~ behind it.
This implies that even slightly stronger players (than that kyu account) do not accept my challenges for a game and those few who accept tell me 'Shouldn't you play with weaker players?'.

Hence, what can I do?

  • Do I have to open a new account for each time setting which I want to use it for?
    • e.g. one for games 10s/move, one for games up to 12 minutes total and one for serious games 30-60 minutes? The few serious games I play on KGS are under league accounts.
  • wait half a year and then only play serious, long games?
  • ask people I know face-to-face, to whom I give 5-6 stones OTB (esp. in fast games, but with Fischer or absolute time) yet who have the same KGS ranking, to accept 5-H games on KGS, and win those? Would this speed up the progress? By how much?

(the one game I played that way was lost by me through netlag in a won position some moves before counting - sounds like moaning again - I know)

What could KGS do?

  • adapt the k-factor also to time settings.

(tderz)


KGSRating Math/ Discussion last edited by tderz on October 3, 2007 - 23:24