13. Rating, Ranking, Player's Strength
Preface
Here we collect ideas for improvements and feature requests concerning the KGS ratings (ranking, player's strength, Kyu, Dan).
If you are a first time user:
If you just used KGS for the first time, these wishlist pages may be quite confusing to you. Your experience, however, is very valuable and you can help to improve KGS by adding it to KGS First Time User Experience.
Unsorted ideas and requests
Games vs robots: should not be factored into the "~" rank designator
- Stormer: Games vs robots should not be factored into the ~ rank designator. Saw a guy the other day with 90% of his games vs gnugo and he had a ~ rank that was probably not deserved.+
- Ectospheno: I don't see how the stronger opponent being a robot negates the fact that he only plays stronger opponents. The fact of the matter is that he isn't playing lower ranked people and I think that more than justifies the ~.
- Stormer: 2 players, both beginners, around 18kyu. Let's say both are learning to play 19x19, and are practicing vs gnugo a lot. Player1 is generally a helpful person and plays 9x9 in the beginners room with weaker players fairly often. Player2 is nervous playing humans and just trying to get the hang of playing vs gnugo. Both will have ~. Are these the people you want to avoid giving teaching games to?
- Ectospheno: They don't have to play ranked games against gnugo. If they just want practice and are still nervous they can play free games against the bots and never worry about the pesky ~ symbol. I maintain that if they opt to play ranked games then they have an obligation to play ranked games against weaker opponents as well. Otherwise they shouldn't be playing ranked games. I recognize that intelligent people can disagree though -- just my opinion.
- Stormer: Well, we certainly can disagree, but I thought the underlying philosophy was that if you receive a teaching game, you should also teach others. I don't think playing gnugo counts as being taught; furthermore, it doesn't take up any human's free time. Also, it is difficult for many players without a solid rank to get rated games, and many are now encouraged to play rated games vs bots in order to get a solid ranking. I guess I just don't think that playing gnugo should have any impact on ~, only games vs humans.
- Ansgar?: I agree with Stormer here, playing against robots doesn't take anybody's time to teach/play weaker players and thus shouldn't give a ~.
- dalf?: Both points of view are interesting, but another point should be considered: the more you play against a program, the stronger you become against it. Not (only) because you are making progress at playing go, but because you are learning its weaknesses. Hence, a player repeatedly playing against Gnugo is likely to get a distorted rank (and to learn bad habits). For instance, I am myself ranked one stone lower than gnugo, but I can beat it pretty consistently.
- RiffRaff: This has been bothering me recently, also. When I first signed up with KGS, I had no idea what my rank should be so I ended up playing a number of games against various bots to try and narrow down what my rank should be (rather than inflicting unbalanced games on actual people). Once I established a fairly confident non-? rank, I started playing humans but now it seems that I'll be saddled with the ~ designator for some time to come due to the fact that almost every one of my games against the bots was with black. There's already a designator on those accounts indicating that they're bot accounts, so it seems like it would be easy (and what's more, the right thing) to ignore those games in determining what accounts deserve a ~ designator. On the flip side of this (which I don't think any of the above discussion has touched on), it'd also be currently possible for someone to get rid of a (deserved) ~ designator by playing games against weaker bots. Games versus bots shouldn't count either way, since there's obviously no "teaching" involved on either side.
- I am perfectly happy to play games as W with ~ players, and guess I am not completely alone, so don't worry too much ;-)
Ranks: Why must kgs stop at 9 dan?
- Why must kgs stop at 9 dan? Can't there be an 11 dan that is 2 stones stronger than a 9 dan?
- LithiumTwo: Most people (if not all) who are above 9d on KGS seem to be pros, and these can already get special ping ranks.
Ranks: Why must ranks on kgs stop at 30k?
- Why must ranks on kgs stop at 30k? The ranking system has gotten tougher. Even looking at the old [[KGSRankHistogram]], you can see that there's a spike at 30k. Is it possibly bigger today? There's also been a growing number of weak bots that hang out in the beginners room. It's tough for new players to set handicap against them because they all show up as 30k even though there's a wide range of strength differences.
- Maybe track ranks over the larger range but display them with a cap? That could at least allow a "fair" handicap by default
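The "track a larger range, display with a cap" idea can be sketched roughly as follows. This is only an illustration, not how KGS stores ranks: the integer `level` scale (1k = 0, 1d = 1, 30k = -29), the function names, and the `max_stones` default are all assumptions made for the example.

```python
# Hypothetical internal scale: one step per stone, 1k = 0, 1d = 1, 30k = -29.
MIN_DISPLAY_LEVEL = -29  # shown as 30k
MAX_DISPLAY_LEVEL = 9    # shown as 9d


def format_rank(level: int) -> str:
    """Turn an internal level into a rank string such as '12k' or '3d'."""
    return f"{level}d" if level >= 1 else f"{1 - level}k"


def display_rank(level: int) -> str:
    """Rank shown to users: the internal level clamped to 30k..9d."""
    clamped = max(MIN_DISPLAY_LEVEL, min(MAX_DISPLAY_LEVEL, level))
    return format_rank(clamped)


def default_handicap(level_a: int, level_b: int, max_stones: int = 9) -> int:
    """Default handicap from the uncapped levels (max_stones is an assumed limit)."""
    return min(abs(level_a - level_b), max_stones)
```

With this, two accounts that both display as 30k (say, internal levels -34 and -29) would still get a default handicap of 5 stones.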
Rating: 9x9 and 13x13 games rated (REJECTED)
- marc: make 9x9 and 13x13 games rated! this is very important for beginners. for stronger players who are not-so-used to these board sizes, weight the rating variation as much as you like. also, perhaps allow only rated games for these board sizes to players up to 20k +
- wms: Not practical, for the same reason ultra blitz is no longer rated. 9x9 and 13x13 are different enough that they don't belong in the same rank system as 19x19, and I have no plans on setting up multiple rank systems on KGS.
- Jonathan Cano: Drat! I was hoping to achieve a 9d rating for 3x3 Go!
- joelr: Not with the right komi.
Rating: komi and handicap changes don't influence the resulting rating changes
- blubb: I am not sure if it's a bug: when starting a game, neither the komi (according to the chosen ruleset) nor the resulting rating value of the game seems to be adjusted to changes made by the challenging player in the game setup window. Recently, a 6k? was suggested (by kgs default) to take white against me (8k) with 2 handicap stones. However, he took black himself with 2 handicap stones, since he thought he was around 10k. He won the game and became 4k?, which seems much too high a jump for his actual win, even in "?" rank mode (but appropriate if the game was scored with the default settings, which were not applied). I couldn't verify this with more accurately ranked players, though, because the influence of a single game's outcome on the rank is much lower then.
Rating: differentiate between resignation and loss by some points
- Jraitsev: I am not sure if this makes any difference in computing the rank or not, but oftentimes players who lose by a sizeable margin click 'undo' right after they see by how much they lost and then resign. Again, if resignation and losing by 1/10/50/100 points bear the same weight in scoring, it makes no difference; however, if this is not the case, someone may use this to abuse the system. Perhaps KGS should differentiate between resignation and loss by some points.
- blubb: As far as I know, kgs doesn't distinguish between different margins or ways of winning, and I don't think it should. In traditional go, a win is just a win and a loss just a loss, regardless of the point difference, and therefore playing strategy is intended not to maximize the expected point difference but to maximize the probability of winning.
Too Great Influence of a Player's Game History
Current situation (2006-11-27):
- A player's entire game history over the past 180 days is reevaluated to determine his current rating.
- The weight of a game drops exponentially with time. A related parameter is called "halflife" and currently set to be 45 days.
- The more games a player played before, the harder it is for him to change his current rating.
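A minimal sketch of the time weighting described above, assuming the weight halves every 45 days and that games older than 180 days are ignored entirely (the hard cutoff is an assumption, as are the function names). The `newest_game_share` helper is only a crude illustration of why a long history dampens rating changes; it is not the actual KGS algorithm.

```python
HALF_LIFE_DAYS = 45.0   # "halflife" parameter quoted above
WINDOW_DAYS = 180.0     # evaluation window quoted above


def game_weight(age_days: float) -> float:
    """Weight of a game played `age_days` ago (assumed: zero outside the window)."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days > WINDOW_DAYS:
        return 0.0
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def newest_game_share(ages_days: list[float]) -> float:
    """Fraction of the total weight carried by the most recent game."""
    weights = [game_weight(age) for age in ages_days]
    total = sum(weights)
    return max(weights) / total if total > 0 else 0.0
```

For example, a game played today has weight 1.0, one played 45 days ago 0.5, and one played 90 days ago 0.25. A player with a single game gives it the full share of 1.0, while the newest of four games played on the same day carries only 0.25 of the total.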
Criticism:
- The influence of a player's game history is too great. Frequent players even say: by far too great.
- Ratings get stuck too easily and then do not represent a player's current playing strength.
- Players unintentionally have to play as sandbaggers or have to create new accounts just to overcome this design problem of the rating system. (KGS Plus members cannot even easily create new accounts.)
- The rating system is unfair towards players playing more frequently than others. Some people think that a rating system would be fair about this aspect if at any time every two players with the same current winning achievement had the same chances to change in rating.
- The rating system is designed assuming that fast increments in playing strength do not occur - but everybody knows that they can occur. This makes the rating system unfair towards temporarily fast improving players. A rating system should model all players equally well - not only those with rather stable playing strengths. Otherwise the rating system does not model reality but expects the players to behave according to the system's own design.
Suggestions for a possible solution:
- cocoon: Replace the dependency on the number of days since a game was played with a dependency on the number of games played after that game.
- Anonymous, Harleqin: Consider both the number of days since a game was played and the number of games played after it. For a player at the current moment, choose whichever of the two gives the faster rating change.
- Harleqin: There should be the additional factor e^(-(ln2/X)*games) in the weight, where X would be an analogue of the halflife by time and 'games' the number of this player's games since.
- Anonymous: Decrease the dependency on the number of games a player has played in the past. Decrease the period of including old games. Decrease the halflife.
- Iago: Value the last 25% of a player's played games more.
- Anonymous, Phelan: Allow a player under a particular user name to reset his rating by starting afresh as "?".
- GammaTau?: How long it takes for old games to expire should depend not on time alone but also on how well the old data fits the new data. The weight of old data should be determined by comparing it with recent data. If old data and new data give a different representation of player's strength, the old data should not be used to measure player's strength. On the other hand, if the old data and new data give the same representation of strength, it's perfectly fine to use it.
- Iago: Limit the number of games that are reevaluated for a player's current rating.
- wms: General requirement for every possible solution: The rating system must remain stable globally.
- HonFu: Maybe I have an idea to at least solve one of the mentioned rating problems without destabilizing the global rank situation.
As I have written about in "How to get along with KGS rating math", the main problem of ranks "getting stuck" occurs when players switch their playing rhythm, for example from playing many blitz games a day to playing the occasional slow game. This happens, because only the number of wins and losses is evaluated. Since KGS works with TIME (180 days), not games, this seems a bit inconsistent.
My attempt for a solution: Do not take the results of every single game into account, but take the average result of every single play day into account!
This could work as follows: If someone plays nine games a day, then not every single result would be counted, but the system would calculate his day average. And only this day average result would be counted in the future.
Like this it would not matter at all, if you play one game a day or two games or a few dozens. The slowness or speed (as you like) of KGS rank movement would be the same for everybody, not depending on the actual number of games. The time-based evaluation system, which I think is a good idea to begin with, would be able to do its job even better.
- RobertJasiek, bitti: Allow (much) faster rating changes (at least for the dans).
- RobertJasiek: Do not use the history of old games at all. If there is no majority for this, then at least weigh the old games much less and avoid the direct time dependency so that playing frequently in the past is not a barrier for improving in the present.
- RobertJasiek: The rating system must model reality; in particular it must assess temporarily fast improving players correctly. The rating system must avoid the contrary: that players would slow down their improvement just to please the rating system's assumptions.
- RobertJasiek, bitti: The rating system should be (much) more transparent and predictable on the surface also for ordinary players.
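Two of the suggestions above are concrete enough to sketch: Harleqin's extra factor e^(-(ln2/X)*games) and HonFu's per-day averaging. The code below is only an illustration under stated assumptions: the 45-day halflife comes from the "current situation" list, while the value of X (`half_life_games=30.0`), the win = 1.0 / loss = 0.0 scoring, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def time_weight(age_days: float, half_life_days: float = 45.0) -> float:
    """Existing time-based decay: halves every `half_life_days`."""
    return math.exp(-(math.log(2) / half_life_days) * age_days)


def games_weight(games_since: int, half_life_games: float) -> float:
    """Harleqin's factor e^(-(ln2/X)*games), with X = half_life_games."""
    return math.exp(-(math.log(2) / half_life_games) * games_since)


def combined_weight(age_days: float, games_since: int,
                    half_life_days: float = 45.0,
                    half_life_games: float = 30.0) -> float:
    """Weight of one game when both decays are multiplied together."""
    return (time_weight(age_days, half_life_days)
            * games_weight(games_since, half_life_games))


def daily_averages(results):
    """HonFu's idea: collapse each play day into one average result.

    `results` is an iterable of (day, score) pairs, score 1.0 for a win
    and 0.0 for a loss; returns {day: mean score of that day}.
    """
    by_day = defaultdict(list)
    for day, score in results:
        by_day[day].append(score)
    return {day: sum(scores) / len(scores) for day, scores in by_day.items()}
```

Under these assumptions a game that is 45 days old and has been followed by 30 further games would keep a quarter of its original weight, and a day with two wins and one loss would count as a single result of about 0.67, no matter how many games were played.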
Discussion:
Links:
Anchors
Real World Anchoring
- aokun: This is not so much a software request as a process request. Could we have more aggressive or widespread anchoring of the ratings to the outside world? I know there is a problem of which outside ratings to pick, which country, which association, but the Rank - worldwide comparison chart indicates that KGS rankings are significantly different -- always tougher -- than any other system. It would be nice if KGS were more in the middle of the pack. For some people, ratings are no biggie, just a way to get a good game. For some of us, um, well, me ... they are a crutch of our self-esteem and the better they are the better we sleep. It is a bit of a blow after going to the Manhattan club and having an elderly Japanese man tell you you are 2-dan to go on KGS and get creamed repeatedly by 9ks who seem not to need to think about their moves. I guess what I'd like is for the world to readjust to the old Japanese guy to make me feel better. Just an impartial suggestion for the benefit of all.
"Anchors" that keep ranks from reflecting the true strength difference between two people; sandbaggers
- Cheyenne: One of the frustrating things with the current ranking system is that I believe that there is a problem with "anchors" that keep ranks from reflecting the true strength difference between two people. Some of the factors that I think contribute to some rank freeze are:
- You have a small group of players who play against each other; they are all advancing fairly equally within the group, so their ranks stay flat. However, when they do on occasion play someone outside of their group, the game sometimes ends up very uneven.
- Sandbaggers, I believe, also keep the overall rank system locked.
Trickle-down effect: game result should influence the weaker player's rank more; regular realignment
- hgmichna: Make the known ranks of some strong players "trickle down".
- If some strong players with known ranks are used as anchors, then the rating system should make sure that the ranks "trickle down" the chain. This means that the result of a game between a stronger player and a weaker player should change the weaker player's rank more than the stronger player's rank. This would make sure that the ranks remain consistent. Since there are many more weaker players than stronger ones, the asymmetry would have to be quite drastic. An extreme method would be that the result of such a game has no effect on the stronger player's rank at all, but that would take it too far.
- The alternative would be global realignment of all ranks. This could be done weekly or monthly. Take all games that players within a certain rank range (for example, 3d to 1d) played against stronger players and find out the win/loss ratio. If more games were lost than won, lower the rank of all these players accordingly. Conversely, if more games were won than lost, raise the ranks of all these players. Shift all ranks below this range by the same amount. Now take the next lower range of ranks and repeat the same procedure. Adjustment within the rank range is not necessary, as the ranking system already evens out any discrepancies there.
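The realignment procedure could look roughly like the sketch below. The proposal does not say how large the adjustment "accordingly" should be, so a fixed `step` per band is assumed here; the numeric rating scale, the `(weaker, stronger, weaker_won)` game tuples, and the requirement that bands are listed strongest first and cover a contiguous range are likewise assumptions for illustration.

```python
from typing import Dict, List, Tuple

# (weaker player, stronger player, True if the weaker player won)
Game = Tuple[str, str, bool]


def realign(ratings: Dict[str, float], games: List[Game],
            bands: List[Tuple[float, float]],
            step: float = 0.1) -> Dict[str, float]:
    """Shift rank bands top-down based on results against stronger players.

    `bands` are (low, high) rating ranges, strongest band first. Each
    band's shift is also applied to everything below it, which is why
    the offset accumulates from band to band.
    """
    adjusted = dict(ratings)
    offset = 0.0
    for low, high in bands:
        members = {p for p, r in ratings.items() if low <= r <= high}
        wins = sum(1 for weaker, _, won in games if weaker in members and won)
        losses = sum(1 for weaker, _, won in games
                     if weaker in members and not won)
        if wins > losses:
            offset += step      # band did better than expected: raise it
        elif losses > wins:
            offset -= step      # band did worse than expected: lower it
        for player in members:
            adjusted[player] = ratings[player] + offset
    # Players below the lowest band inherit the accumulated shift.
    lowest = min(low for low, _ in bands)
    for player, rating in ratings.items():
        if rating < lowest:
            adjusted[player] = rating + offset
    return adjusted
```

Band membership is decided from the original ratings, so a player is adjusted exactly once per run; a real implementation would also have to decide how handicap games enter the win/loss count.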
Drift
Minimize rank drift during a hiatus
Hiatus shifts the rank
- meldroc?: I too would like to see ways to eliminate rank drift. After a month-long hiatus on my part, my rank drifted from 24k to 18k, now I have a difficult time getting games with players of my true skill level.
Credit only for win/loss against the rank of their opponent *at the time*
- Anonymous: Eliminate 'rank drift', give a person credit only for win/loss against the rank of their opponent *at the time*. If their opponent advances to 7p, they shouldn't get credit for beating a 7p person if they beat them when they were 30k. Instead weight the rated games based on how long ago they occurred, this also has the added benefit of preventing people from getting 'stuck' at a rank (where they played so much as their old rank that the system doesn't recognize that they have progressed).
Allow a player to reset his/her rank to what it was prior to the drift.
- Anonymous: The fact that 'rank drift' occurs means that the rating system is flawed. Allow a player to reset his/her own rank to what it was prior to the drift.
Automatic Pairing (Automatch)
Give admins option to ban users using automatch
- Admins should be able to ban from automatch players who don't follow the automatch rules (escaping too many times vs weaker players) or who cause too many forfeited games without playing a move. They could still play normal custom ranked games without needing to be deranked.
Rating: Separate ratings for different rulesets, time systems, different board sizes (REJECTED)
- ([automated pairing] ...). This requires separate ratings for different rulesets, of course, and I would suggest "blitz", "timed", and "untimed" leagues for each board size. Note that this also handles requests for rated games on other board sizes. I know this is not really going to happen, but this is a wishlist, and statisticians agree with me :-)
- (wms in another part of the page:) REJECTED
Long time games available
- wysek: I'd like to have long time games available to choose and there should be some info, what "medium game speed" means.
- glue: there is some info, check the automatch help page, it explains there is a tooltip ;)
- wysek: thanks, didn't think of checking the hint/tooltip, but the first issue remains - I suggest changing 'medium' to 20mins and adding 'long' with 40mins. It's just a bit odd to have such a limited choice, I think.
Repeatedly the same opponent, option to discard a match
- Fishbulb: I've run into the problem of automatch pairing me with the same person repeatedly, and one player not wanting to play (I've been on both sides of this) and leaving instead. I know it supposedly doesn't count if you just resign since it's less than 8 moves, but since white has to choose whether or not to save the game in the games-list, it stays in my history as a loss. I'd like to see an option to 'disband' or 'annul' a match, so that there is no confusion in this scenario. +
Auto forfeit a game if either player doesn't play within the first 2 minutes
- Cheyenne: Ran into a little problem.. had an automatch game start, but the opponent wasn't "there", didn't move (they had to move first), didn't respond to a "hello".. So... my suggestion/request... If either player doesn't do their first move within say 2 minutes there is an auto forfeit of the game. If someone is going to select automatch then walk away from their keyboard (or whatever) then they are just wasting their opponent's time.
Option in the automatch dialog to exclude escapers
- I'd like to have an option in the automatch dialog to exclude escapers. It is annoying to have your opponent automatically determined only to find he is an escaper at the end of the game. Automatch gives escapers an ideal way to find opponents they couldn't find as easily otherwise. So an option "Don't match with escapers" could help a great deal in preventing that.
Automatch: add 'Rengo'
- Under 'Play Go | Auto match preferences | Game type' add 'Rengo'. +
Confirmation message "... opponent for you. Do you...?"
- Viltti: Before the automatch pairing there should be a confirmation message. (There is an automatch opponent for you, do you wanna play? 'countdown 10, 9, 8, ...) If you or your opponent had just forgotten an open offer, there wouldn't be accidental games where the opponent might not even be there. If one player declines, the system offers the declined player another pairing when available, and for the one who declines or doesn't answer, the automatch offer is withdrawn! Anonymity would still remain, so you couldn't pick your opponent.
The "?" Ranks
Change the "9d?" to something like "9d+"
- Hu: Change the "9d?" rating that gets awarded players who win a lot of games to something like "9d+". +++++-
- Harleqin: AFAIK the "9d?" is not different from other "xx?" ranks - these players have not lost enough games that they could be given a "secure" rank. If they want a "secure" rank, they should play against handicap.
Rewrite the ratings algorithm for "?" ranked players
- Rewrite the ratings algorithm for ? ranked players, or perhaps cap their ratings until the ? is gone in order to avoid the accelerated drift ? ranked people can experience.
Revert [xx?] ratings to [?] if their last rated game is unfinished
- Revert [xx?] ratings to [?] if their last rated game is unfinished.
Do not use a game against a xx? player to recompute one's rank
- Cheyenne: Do not use a game against a xx? player to recompute one's rank. As a rated player my rank should not be altered if I win or lose a game against a xx? player. If players know that their own rank will not be affected by playing an unknown I suspect they would be more willing to play such games (you would probably see fewer "no ?" requests). This will make KGS a more welcoming place for people who just joined and are trying to establish their initial rank.
- If this is already in place -- maybe make it a little more visible in the online help.
- wms: Cheyenne, in general I cannot have a game affect one player but not another. It would make the whole rank system unstable. But why shouldn't playing a "?" player affect your rank? If your rank is solid, then it will affect your rank only a tiny bit, but even better, as the player's rank becomes known it affects your rank as if you had played their solid rank. For example, if you play a "4k?", and lose, your rank will change very very little at first because your rank will be much more confident than theirs; but if, a week later, this "4k?" has played (and won) a lot more, and becomes "1k", then your loss will have a greater effect, but it will affect you as a loss against a 1k, not as a loss against a 4k. So in general, there is no reason to treat "?" players specially; the data returned has little effect when their rank is unknown, but when it becomes known, it is as valid as any other data. As for more info, the page in the help on the rating system says everything about the rank system - the algorithm is there.
- AlamBios?: A response to the comments of “wms” – While wms’ reasoning may be valid, the reality is that most, if not all, of the KGS players do not want to play with players of “?” ranking. I am not a computer programmer, so I cannot argue about the stability of the ranking system. However, from the point of view of simple logic, I cannot see why an “opponent dependent” ranking system cannot be established and be a stable one. For example, when a player’s (called the player “P”) opponent (called “O?”) is of “?” ranking, the system uses a “zero” value (or the like) for O? when calculating the game for P. When P’s opponent is of solid ranking, the system uses the existing calculation method. In this way, the “?” players will get their ranking if they play with solid rank players, and the solid rank players will not have to worry that their ranking will be adversely affected by playing with players of “unknown” strength. Wms may be right in saying that their concern is unnecessary, however, this is the reality. I do believe an “opponent dependent” ranking system will make KGS a much “more welcoming place” for players who would like to join the KGS family.
- Remillard - Let's just get rid of the question mark altogether. If something needs to be published, put a statistical correlation number on their user page.
Don't show temporary ranks ("xx?") (misleading)
- Don't show temporary ranks ("xx?"). They are most of the time wrong and misleading, in particular for beginners (sensen) -
"?" ratings being placed at the bottom of the rating list makes new players feel unwanted
- Ian Davis: I am fed up with the ? ratings being placed at the bottom of the rating list. This just makes new players feel unwanted. It makes it more difficult for them to get games. It encourages existing players to accept the notion that ? players are probably sandbaggers with wildly inaccurate ranks. I'd like it abolished. ++
- Neil: But the ratings are wildly inaccurate. Not only can the initial data points be wrong, but the ? ratings often suffer from gross inflation. Besides, how will sorting probably-inaccurate ratings with the more trusted values solve the social problems associated with dishonest players?
- Of course initially the rank may be slightly inaccurate, that is neither here nor there. Demoting players to the bottom of the list does them a discourtesy, it hinders them getting a solid rank - which is just poor hospitality. Keeping the system this way maintains the impression that it is okay to choose not to play somebody with an unsolid rank because (a) they are too weak for you so the game is boring or (b) they are too strong for you and you will get crushed. Neither (a) nor (b) shows any regard for the love of the game. If you can give me evidence that this creates a positive atmosphere on the server I will back down. At the minute though I am fed up it is still in place. To me it shows people are playing only to get stronger ratings.
- BrendenT: I agree that the ? could be considered "poor hospitality." I think we should not concern ourselves overly much with dishonest players. They will find a way to cheat no matter what we do. I think it would be ok to allow guests and new players to set a rank for themselves. This should be shown as tentative ? but allow the games to be sorted in their proper place. I personally check out the history of any new player I play (a very nice feature of KGS!) and so I'm pretty aware when someone might be sandbagging or inflating their rank. It's no big deal really if you are just a little proactive.
- tapir: I guess I made the same proposal somewhere else. Their rank may be inaccurate (and everyone will see the ?) but why should they be sorted into a segregated category?
Include the "xxk?" players at their current rank
The same proposal as above.
- tapir: I don't want the "?" abolished, but I want the rank sorting (most important on Open Games) to include them at their current rank not at the bottom.
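A sketch of what "sort them at their current rank" could mean for a games list, assuming rank strings of the form "12k", "12k?", "3d" or a bare "?" for accounts with no rank at all. The parsing rules, the tie-break (solid rank before provisional rank at the same level), and the function names are assumptions for illustration, not KGS client behaviour.

```python
import re

_RANK = re.compile(r"^(\d+)([kd])(\?)?$")


def rank_sort_key(rank: str):
    """Sort key: strongest first, 'xx?' ranks placed at their shown level."""
    if rank == "?":
        return (1, 0, 0)                  # no rank at all: stays at the bottom
    match = _RANK.match(rank)
    if match is None:
        raise ValueError(f"unrecognised rank: {rank!r}")
    number, unit, provisional = match.groups()
    # One step per stone: 1k = 0, 30k = -29, 1d = 1, 9d = 9.
    level = int(number) if unit == "d" else 1 - int(number)
    return (0, -level, 1 if provisional else 0)


def sort_by_rank(ranks):
    """Return the ranks ordered from strongest to weakest."""
    return sorted(ranks, key=rank_sort_key)
```

For instance, `sort_by_rank(["12k?", "3d", "?", "12k", "1k?", "30k", "2d?"])` gives `["3d", "2d?", "1k?", "12k", "12k?", "30k", "?"]`: the provisional ranks are mixed in where their current value puts them, and the "?" stays visible.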
Add a < and a > flag to the rank
- Cheyenne: This might actually be more of a question -- but.. here goes.. Add a < and a > flag to the rank. A < indicates that the player has lost more than 50% of their correctly handicapped games, while a > indicates that they have won more than 50% of their games. The idea behind this is to maybe help identify overranked and underranked players.
- (anonymous:) In practice this is just going to produce another reason to discriminate against your opponent. Wow, that guy's on some winning streak, no way I'm accepting his challenge. Oh no, he has a ?, get out of here. Heh, this guy is losing like crazy, you'll do.
- PAG?: I like Cheyenne's idea. Why not have a "12k+" to indicate that a player is "12k?" but probably stronger, and "12k -" meaning "12k? but likely weaker"? This would help adjust the handicap when trying to help newcomers get a firm rank.
- Can't say I understand this idea.
- Fuchsnoir: Consider someone playing 3 games against a solid 12K. If he wins all three games, you still don't know how strong he is, but it's probably stronger than 12k. So he gets a 12k+. If he loses all three games, he gets a 12k-. This is how I understand it at least. Enter the statisticians to tell me how wrong I am.
- Xela: I'm not a statistician, but here's how I see it: '12k+' is the same as '11k?' (or maybe '10k?' or similar), and '12k-' is approximately the same as '13k?'. The system already does what you want, so there's no need to change that part of it.
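Cheyenne's flag rule is simple enough to write down directly. The sketch below assumes the caller has already counted wins and losses in correctly handicapped games; the function name and the choice to show no flag at exactly 50% (or with no games) are assumptions.

```python
def rank_flag(wins: int, losses: int) -> str:
    """Return '>' if more than 50% of games were won, '<' if more than 50% lost."""
    total = wins + losses
    if total == 0:
        return ""            # no data: no flag
    if 2 * wins > total:
        return ">"
    if 2 * losses > total:
        return "<"
    return ""                # exactly 50%: no flag
```

In Fuchsnoir's example, three wins out of three against a solid 12k would give `rank_flag(3, 0) == ">"`, and three losses `rank_flag(0, 3) == "<"`. As Xela points out, the existing "?" estimate may already carry the same information.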
Rating: rated games up to 9 stones handicap?
- Any chance that you will let us play rated games up to handicap 9? Have you looked at whether they are statistically consistent with lower-handicap games like you did for ultra-blitz vs. normal time?
Get rid of the "?" or show it only in the game window
- kokiri: get rid of the "?", or at least make it only appear on the game window like the "~" symbol. The worst thing about KGS is the unwillingness of people to play against ?-rated players - presumably because it affects their rating less. Frankly this is pretty rude, and reflects badly on the otherwise very friendly atmosphere. When you start a new account it is practically impossible to get a game. Making it harder to see who is a ?-ranked player would stop this, if there's a better way of encouraging people to play ?ers then fine, but for now I can't think of one.
- xela I agree. I have played 57 ranked games so far on KGS (over a period of about 18 months)--but my rank still has a "?" because I don't play often enough (however often that is). It often takes 15 minutes or so to find someone of equal strength who is willing to play with me. I don't think that I am either a beginner or a sandbagger, and it is a very frustrating situation.
- Edward Hammerbeck I concur with xela. Same situation with me. I have played a lot of ranked games, but I have not played often enough to lose my ?. As xela says, I am neither a beginner nor a sandbagger, but who would want to play me when winning against me barely counts as a win? I agree that there should be some way to identify users without a statistically significant number of games with which to generate a reasonably accurate rank. I just disagree with there being a time element associated with the algorithm that decides whether or not to assign a question mark. 2005.06.23
- Remillard I also agree. While I can see some point to the complaint of people not enjoying playing people of un-statistically sound strength, I can see a far bigger problem of the ? accounts getting declined time after time after time after time. I would far prefer some sort of minor info on a user's page about the "solidity" of the strength value. If someone REALLY cares, they can take the time to look it up on their game opponent in the dialog box, but getting rid of the displayed ? would kill the knee-jerk response of declining all ? people. Another possibility: lowering the threshold for ? so that it disappears faster. I don't mind that the correlation has some effect on the calculations for rank, but I do think that publishing the correlation as some sort of off/on threshold is just not a good thing.
- samax? I want to weigh in against the "?" as well. I am in the same camp (and know other players who are there as well), and the "?" seems to be a totally negative feature - i.e. it only serves to make those of us who have one suffer, and does not really provide much for players w/o a "?". I have a friend (shauke on kgs) who usually plays 3 or 4 games per week, and if he skips a week, his "?" returns, and he again finds himself getting turned down for games.
"?": have bots get the job done?
- Leira: Since it's relatively difficult for new players to get a rank (or even for weak ranked players to get a solid rank) because other players usually refrain from playing with them, why not have bots get the job done? There could be a room where some bots (with some constant rank) permanently post games that can be accepted only by people with non-solid ranks, or maybe the other way around.
The "~" Marks
Get rid of the ~ mark (tilde); "Nice-Guy"-mark
- Cheyenne: Get rid of the ~ tag (tilde) and replace it with a mark based on ratings given by users. When a game is finished, allow the players to optionally indicate if the game was helpful to them (put the question right on the same pop up that shows the "game done" and final score). Each player gets to make the selection and it is stored as part of the game record (doesn't have to be included in the SGF file). There would be two flags (one for each player). Then when processing one's rank, use the flag to determine how many games one has that the opponent marked as helpful. If a certain percentage of games are marked as helpful then give that person a "gold star" next to their name. +-
- wms: While this gold star system might be nice, I don't see it replacing the ~ because it is fundamentally different. If it were added, instead it would have to sit alongside the ~. The ~ was added because many strong players who would play weaker players complained that it was too hard for them to determine whether or not the weaker players they played were returning the favor (by playing yet weaker players). The gold star scheme seems more like a "who's a nice guy" thing, it isn't tied to playing weaker players, which was the reason for adding the ~.
- Cheyenne: Also keep the idea of having "gold star" as a separate request (apart from the ~ issue) +
- [2510] Sebastian: How about combining both ideas? Give special stars and lemons +:
- "gave me nice feedback" from weaker players; +
- "was polite" from any players; +
- "escaped" from any registered players, or so ... +
- "garrulous" - which would help address blubb's concern about silly chat in games[2130].
- mgoetze: Quoth KGS Plans: Add icons next to names in name list. What do those icons indicate? Many different ideas, not sure yet what ones will actually be there.
Do not include games against robots in the '~' calculation
- hboehm: Do not include games against robots in the '~' calculation. I play quite often against bots who are a few stones stronger than me - when playing against humans the ratio of games between stronger players and weaker ones is quite okay. Nevertheless I have a '~' in my rank. +++
Display a metric on how close (or far), one is from getting the "~"
- bocephus: For those who are interested in avoiding the '~', give some mechanism to display a metric on how close (or far), one is from getting this award. Also, maybe technical detail in one place (i.e., free/rated, game size, [?]/[x?] players) on how the adjustments are made. [All game types and sizes apply and you can also reduce tilde scoring by playing newly registered players or provisionally ranked players.]
- Rakshasa: This sounds like a feature to make it easier for greedy players, why not also implement an escaper meter? ;)
If the game is not recorded (resign as first move), do not reduce "stigma counter"
- Bass: If the game is not recorded (resign as first move), do not reduce "stigma counter"
- Reuven: Could you explain?
- Bass: It has happened at least once, that a player with a "~" rank requests games against weaker players, not to play with them, but to resign without making a move. This should not help them get rid of the "~". (actually, I'm not sure it does..)
- tderz Losing games on purpose should also render these players weaker! Perhaps so much that they decline in rank and end up playing even games with those they purportedly avoided playing to begin with. Actually I was trying to find here the answer to the question 'What percentage of my games do I have to play against weaker players to get rid of the tilde '~' symbol?'. Do free games count as well?
No mark for playing behavior
- HonFu I don't like the ~ at all. It does not seem right that players get a sign which marks their playing behavior, because it is becoming too much of a tool for stigmatizing others. If it is that important to have a system which makes players play weaker opponents as well as stronger ones (I think it is, not only for rank accuracy), why then wouldn't we make it obligatory? Just insert a feature into the programming which doesn't allow you to play upwards if you do not play enough downward games first. The ~ could disappear, with no more stigmatizing, which would add to the peace on the server, since chatrooms and infolines are full of "no ~s, and don't even ask" and similar requests.
- Wrenn I hope it is not bad taste to post a counter to this argument here, but I request that the tilde stays. I understand why people could get annoyed, but there is an inherent flaw in this. People who have tildes do not play weaker players. If a player is upset by having a tilde, this shows they have some negative reaction to being labeled as one who only plays stronger players. If one has received the tilde by accident, then they can work to remove it, and there is no problem. This tends to be someone who got it by bad luck and unfortunate statistics. The fact that they feel like having it is bad shows that they do not like excluding weaker players, and will have no trouble dealing with playing a few more weaker players a week. If someone has it, and doesn't care they have it, then there is no problem. It is just a marker of someone who wishes to improve quickly. If someone has it and complains, but does not change their behavior, then it seems that it is suited well for them. They have shown they do not like being labeled as "playing no weaker players", not because they think excluding all weaker players is bad, but because they think the labeling is bad. I will get to that point later. If someone who has it does not care, then they realize they have been making a conscious choice not to play weaker players, and they should not care that stronger players have the same right to consciously choose not to play them. There is a mutual understanding. When a player gets it, and is upset, and works to get rid of it, it has clued them in to an unfortunate pattern they happened to get into. If a player gets it, complains, and does not seek to change it, there is a high level of hypocrisy. They have been engaging in a behavior (playing only to get stronger by playing stronger players), and are denying others the same right (the stronger players they wish to play may in fact also wish only to better themselves, and will not deign to play anyone weaker).
This is a form of rude behavior, and like other rude behaviors that exist on the server (escaping, swearing, sandbagging), it should have a punishment that fits the crime (you lose rank for trying to escape rank loss, you get silenced by means of banning or booting for swearing, and you get your rank turned off and/or booted for lying about your rank). A fitting punishment for deliberately shunning a group of people, while asking to be exempt from others' shunning, seems to be notifying others of that behavior. That way, they get a taste of their own medicine. If they do not like being deliberately excluded for an arbitrary mark, then they understand that others feel the same way (the rank being an arbitrary mark when it comes to it), and maybe will not deliberately exclude those others. If they exclude others and expect to be excluded themselves there is no conflict, and if they exclude others by accident, then they will stop as soon as that handy little mark appears. In short, if you have it, you can either dislike it, or you can not care. If you do not care, there is no problem with it existing. If you do care, you can either seek to change it, or you can not seek to change it. If you seek to change it, there is no problem. If you do not seek to change it, it is a sign of deliberate and knowing exclusion of other players. They wish others to treat them in a way they refuse to treat others. The tilde tells others this, and allows those others to treat the offender the way he treats others.
Games with opponents, each set the other as 'Buddy', should not influence the "~"-calculation
- Hagios? I think that players with each other set as 'Buddy' should not contribute either way towards having a ~. I have a very close friend with whom I play often online as well as in person. I am presently rated 1 stone stronger than him on KGS and therefore, even in spite of the fact that we both tutor weaker players, I tend to cause him to gain a ~ from our games together.
Players should be able to check if they have a "~"
- Sampi Players should be able to check if they have a ~, because there is no easy way to check this yourself. +
Throw away the "~" without replacement
- Harleqin: Throw away the "~" without replacement. It is just a useless stigma, and players paranoid about the behaviour of their opponents can get the right impression better from the games list. +
User Info: Rating Graph
Each game in the rating graph as little dot
(a part of this discussion was commented out '%')
- (Sebastian:) Display each lost and won game (or only rated ones) in the rating graph as little dots. The height is a function of the rating of the opponent, handicap and komi. Color codes win/loss (e.g. White = won, red = lost). This would show at a quick glance if the rating has been earned actively or passively (see mgoetze's example "if you only play one rated game" above) ++++
- (%) Someone changed this so that it didn't make sense anymore. You can't color a day for lost or won - you only can do that with games. Also, I proposed deliberately that the dots should be small, and offset from the graph - there's plenty of black space around.
- (%) TJ: This isn't possible/feasible. The x axis is by time, in units of days. That means, every 24 hours, your current rank is plotted as a point and a line is drawn between it and the last point. It could be possible for the x axis to be games, but that would make for a different sort of graph entirely; I don't think such a graph would be as useful, since it would do little to show your progress over time, but would rather show the internal workings of the ratings system and weights of won/lost games. Perhaps the OTHER suggestion was that days in which you win/lose at least one game be plotted differently than days in which you didn't play?
- (%) ethanb: I think his idea wasn't to replace the rank graph, but to add additional info to it in the form of a dot graph plotted alongside the line that's already there. It would effectively combine the rank graph and games list into one chart.
- (%) TJ: Pretty much impossible given problems of scale. A chart alongside or contained within the current chart would retain a day as the unit of the x-axis. Given the largest possible size the graph is likely to be shown in, how many pixels width your monitor has is how many games you could possibly fit between one day and another one; the big problem is that the smallest you limit the graph to (and you'd have to limit it) would dictate the number of pixel/games you could show per day. This problem expands when you consider a person may play many more than one game every hour, each game requiring its own space between day and day to be plotted within. Visualize it (read ahead!), it's not do-able.:)
- (%) ethanb: Why separate them lengthwise? It's not like you'd need to know what hour a certain game was played at. Just show the dots at the same x-coordinate as the daily update. All you're looking for from this is the distribution of opponent's ranks and whether you won or lost against them.
- (%) cheyenne: Just to throw in my 2 cents worth... one should not become that obsessed with one's rank. A point here, a point there. At a kyu level, your game playing isn't really steady on a game by game basis. Ever notice how close dan games are (when they are played out the whole way), most of the time it's within only a few stones. Ever notice by how much a kyu player can win or lose a game by, 30, 40 points one day 2 or 3 the next and back to 30 or 40 again? -- but anyway Personally I would just focus on my play, and ask myself -- am I enjoying the games that I am playing? The rank graph should be used to view one's long term progress and not as a measure of how well I slept the night before ... end of my 2 cents ...
Rank graph: different display of questionable rank (with "?")
- (Sebastian:) Display rank curve differently where questionable (rank with question mark). (This doesn't have to be as fancy as in some other servers which show error margins. Just using a different color should suffice.) +++
Rank graph: a line showing rating adjustments
- Dan: The rating system has gone through enough upheavals that it's a little hard to tell when someone's rating has changed through their results or through a rating system redesign. Perhaps on the rating graph the line could be made discontinuous at the points in time at which the rating system itself changes. Then you'd see a line segment, another line segment 1 rank higher, and another line segment 1 rank lower (for example), instead of one line that jumps up and down.
Limit the maximum rank of players to 9d
- KoReNJe?: Limit the maximum rank of players to 9d, so instead of seeing some 10d+ on the graph, I'd like the server to lower ranks of the players that played with that 9d. so the maximum rank would be 10d on the graph :D
User info: rank graph truncated; viewer decides period of the graph
- Bjoern: Ahhhhhh!!!! Where did the old rank graphs go? I would love to see the rank graph not to be erasing after 13 months... A rank graph is a historical document that should last for the future... +
- Bjoern: Let the viewer decide in the settings or at the web page with radio buttons, for how long he wants to see the rank graph. E.g. a month, 3 months, 6 months, 1 year, 2, 3, 5, 10 years +
User info: rank field in decimal form (e.g. 12.8 kyu)
- Jared: Rank field in User Info should be in decimal form. Alternatively, both decimal form and truncated form could be displayed.
- To me this just shows an unhealthy obsession with ranks. This will do nothing to address the number of people on KGS wishing to play opponents within a strict grade boundary. I can see no obvious benefit from such a change. WMS has already caved into enough moans over ratings.
- A compromise is, showing your own decimal form but you can only see the opponent's truncated form.
User info: see the most recent bit of my rank graph
- Raist?: I'd love to see the most recent bit of my rank graph - it's always hard to see the most recent week and the trend, if any. It would be useful for the rank graph to show an extra month with no data, so it's easier to look at the data. Also - it should be pretty easy to calculate the number of wins needed (at the same level) to progress to the next rank, or at least to see the recent win/loss ratio - to help get a sense of recent trajectory.
User info: average time spent per move in rated games
- Hu: I'd like to see in User Infos an average time spent per move in rated games. This would make it easy to distinguish those who have earned their rating by blitz and ultrablitz from those who have been more thoughtful. The average time is easily computed if the database remembers the number of moves played and the time spent moving. The server can easily track the time since it keeps the time for both players.
- uxs: Why would that be useful?
- BlueWyvern: I don't particularly like this idea. I never play under blitz settings, but I almost always play at a fairly brisk pace, especially if my opponent is playing extra slow and I have already read out a response to their move before they play it. The speed I play at is frankly my prerogative, and if my opponent is satisfied with the time settings, I don't think it's anyone's business how fast I play.
- Reuven: I play both.. It can be a problem for those who play blitz mostly and a couple of really long games - Getting blitz games'd become impossible for them.
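The computation Hu describes could be sketched roughly as below. This is only an illustration: the record layout (`moves_played`, `seconds_spent`) and the function name are assumptions made for the example, not anything known about the KGS database.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RatedGame:
    # Both fields are hypothetical; they stand for what the server
    # would have to remember about one player's side of a rated game.
    moves_played: int
    seconds_spent: float


def average_seconds_per_move(games: Iterable[RatedGame]) -> Optional[float]:
    """Return total thinking time divided by total moves, or None if no moves."""
    total_moves = 0
    total_seconds = 0.0
    for game in games:
        total_moves += game.moves_played
        total_seconds += game.seconds_spent
    if total_moves == 0:
        return None  # no rated moves yet, so no average to show
    return total_seconds / total_moves
```

Summing over all moves (rather than averaging per-game averages) means a couple of very long games do not dominate the figure for someone who mostly plays blitz, which is related to the concern Reuven raises.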
User info, statistics: win/lose; rated/all games
- caraoke?: I would like to see total win/lose numbers with options to filter rated/all games and their percentages.
Option for giving control
- Erwin?: I would like to be able to give (manage) the control to (among) every kibitzer.
Want to see all my friends ..
- Erwin?: which games they are watching and playing, and I want to challenge a player directly! I would also like statistics! Both have been standard on IGS from the beginning.
Averaging function
- Would be nice to have a possibility to see ranking graph averaged. It would be easier to see slight changes in overall play strength in long term.
Other Board Sizes
Allow rated games of any size (REJECTED)
Let 27k (and weaker) players play 13x13 and 9x9 rated
- BramGo: Is it possible to let 27k(and weaker) players play 13x13 and 9x9 RANKED? This would encourage beginners to learn the basic tactics first, instead of hopping to 19x19 right away. (Since beginners only know about tactics not about Strategy or Wholeboard judgement, it may probably be even more reliable to base ranking on 13x13 games.) On top of that a lot of beginners asked me before: "why can't we play 9x9 ranked?" I think we shouldn't pressure them to play 19x19 if they are not ready for it. Of course I understand that we can't allow all players (for instance 10k players) to play 9x9 ranked. But for absolute beginners (27k+), why not?
A different rating for various board sizes and time limits
- Anonymous: A different rating for various board sizes and time limits might be interesting. E.g. a blitz rating, a 9x9 rating, etc...
Game Result Weighting
- blubb: (restored, because not obsolete at all) Weight ratings according to the average time per move rather than to the total playing time of a game, and use a continuous function to do so.
- wms: This simply makes no statistical sense, blubb. Either games are slow enough to predict the strength of a player in "normal" speed games, or they aren't. If they are, they should be counted. If they aren't, they shouldn't. I just don't see why I would add inaccurate information to the rating system at any weight. Making it weighted less doesn't make it any more accurate, just possibly less damaging - but leaving it out completely will be even less damaging, so that's what I intend to do.
- blubb: I agree for the case that your sample is infinite. Then an element (that is, a game) either contributes useful data to the evaluation (correlation being positive) and can be included with full weight, or it doesn't contribute useful data and should not get any weight at all (zero or negative correlation). However, ratings are calculated from a finite set of results. I don't think this is the right place to go into details of theory, and I am not quite familiar with English prob&stat vocabulary either. So, I'll give an (artificial) example ...
└─► (Topic to be discussed at KGS Issue - Game Result Weighting)
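To make blubb's request concrete, here is one possible shape for a continuous weight based on average time per move. It is purely a sketch: the saturating formula and the `half_weight_seconds` parameter are assumptions chosen for illustration, and nothing here reflects how KGS actually weights games.

```python
def game_weight(seconds_per_move: float, half_weight_seconds: float = 10.0) -> float:
    """Continuous weight in [0, 1) that grows with average time per move.

    half_weight_seconds (hypothetical tuning parameter) is the pace at
    which a game counts half as much as an arbitrarily slow game.
    """
    if seconds_per_move <= 0:
        return 0.0
    return seconds_per_move / (seconds_per_move + half_weight_seconds)
```

With such a function there is no hard cutoff between "counted" and "not counted": ultrablitz games get a weight near zero, and slow games approach full weight. wms's objection above is precisely that any nonzero weight for unreliable games still adds inaccurate information.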
Different ratings for normal and blitz games
- kevinwm: I have seen some chess sites provide different ratings for normal games and blitz games. The idea is, when the time is limited enough, the game becomes inherently different. I don't have strong feelings on this - it's just an idea.
Winning margin, "even" result
- There is a rule of thumb that basically states that if the game is "even" according to the handicaps, then the end score should be within 10 points (in either direction). (Yes, I know that for weaker kyu players this will not always hold true, but as one gets stronger it does.) So in determining the ranking, when taking a look at the rated games, factor in the score difference. If someone is consistently winning by more than 10 points, or if most of the wins are by the opponent resigning, then the player is probably stronger than they are ranked. If they are consistently losing by more than 10 points, or if most of the losses are by them resigning, then they are probably weaker than they are ranked.
- DrStraw: This is absolutely not true. If I am 20 points ahead I will ease off so as to ensure the win, thus winning by only 10 points maybe. On the other hand, if it is a close game I will make every effort to find a winning sequence and may end up winning by more than 10 points. Are you telling me that I should be penalized for playing so well that I can take it easy towards the end? As a general rule, only weaker players win by very large margins (yes, I know there are exceptions). Perhaps we should say that a win by more than 20 points counts for less, as the player clearly did not have enough feel for the game to ease off towards the end so as to guarantee a win. Of course, this does not make sense, but it makes as much sense as counting the winning margin when computing ratings.
- Cheyenne: If you would play the same person say 3 or 4 times and each time you beat that person by a large margin (say +10 points) I would say that you are stronger than that person. In playing that person you should probably give them an extra stone. I think that that type of information should be somehow captured in the rating algorithm.
- Mef: There are several problems with trying to take into account score for ratings. The first is the one DrStraw mentioned. Or how about the reverse of that situation, instead of realizing you're ahead by a lot, you count the board and realize you are behind by a little. So instead of accepting the loss you take a risk and it fails, so you lose by 20 points instead of playing it safe and taking the 4 point loss, should you be punished for taking a risk? Or how about TheCaptain's games, where his games are more likely to be won by one liberty instead of one point. The score might be a difference of 30 points or more, but anyone who watched the game can see the players are of comparable strength. Not to mention winning games by resignation would be impossible to factor in for this, since some players will resign if they see they are more than a few points behind going into yose whereas others will continue to play if the opponent has killed most of the board...
- nachtrabe: I'm not even sure that the theory about it ending close is true for strong players/pros in general. How many professional games end by resignation? Of those, how many of them are within ten points versus more than ten points? What about games between players like Takemiya and Cho Chikun, where the entire game can come down to whether Cho can find a way to live inside of Takemiya's moyo. This is further complicated by how would you factor it into the rating calculator: Are you going to punish someone for resigning when they are five points behind and see no way to catch up? Or are you going to let someone who is 200 points behind resign in order to avoid the penalty? One or the other has to be allowed.
Bots and Ranking
Make bots unable to play ranked games
- Blake: Make bots unable to play ranked games. I understand the logic behind it--it allows bots' strength to be measured against human competition--but it skews the ranking system, which should be primarily to allow players to find human opponents. Bots don't play well. They also don't play in a human way. They have exploitable weaknesses which can allow someone to beat them simply by knowing their weaknesses--inflating the person's rank and distorting the ranking pool around the bot levels.
- Perhaps the most elegant solution to this problem would be to implement a mechanism by which bots are the only party rated when a bot plays a game, so that a new gametype arises--B, to go along with T/F/R. The bot's rank would be adjusted by the people it defeats and who defeat it, but the humans' ranks would not be affected.
- Phelan: I understand your idea, but I don't think it would work... I think the reason most people play bots is so they can get a ranking, if they don't have one... The bot makers wouldn't mind, since they would still get the ratings, but how about the beginners and question marked players? they would then have almost no incentive to play bots. And if no-one(or almost) wants to play bots, then it wouldn't be as good for the bot makers as well... If your solution was to be implemented, perhaps it should only reduce the weighting given to bot games.
- Blake: Playing bots doesn't give people a proper ranking appropriate for playing human players. It gives them a ranking only appropriate for playing bots. If that is the only argument, then why not let a player set a ? rank when he first creates his account? For example, log in, set your account to 10k? (or whatever), and go on about your business. It's just as meaningful. As for the bot-makers, well--I understand that people playing them is valuable data, but the purpose of a go server is primarily to enable people to play go, and that concern should be paramount.
- Phelan: While it's true that playing bots to get a rank doesn't give a meaningful one, it's also true that many people won't play unranked([?] and xxk?) players. By removing the weighting for those games altogether, it would make it much harder for unranked players to get a rank. After they get their bot rank, they can quite happily be beaten by properly ranked players until they themselves have a proper rank, so the system is currently working.
- nando: I agree with Blake to a certain extent, there's an issue here. But the proposed solution seems a bit excessive in my opinion. I recently discussed a proposal of mine with Jyem on KGS, but he discarded it, not because he had a strong opinion about it, but because he was almost positive that wms would immediately reject it.
- The idea in a few words: normal accounts cannot get a rated game with a bot as soon as they have a stable rank (no ?) and played more than a certain ratio (I was thinking of 25% for a starter, can be tuned later if necessary) of their games (rated or not) with bots. Ranked bots are getting way enough rated games anyway, so that wouldn't be a problem for bot programmers (like myself), they would still have their measure instrument. Bots would keep being useful "ranking machines" and it would certainly help preventing issues like the skewing (or even abuse) of the ranking system, the meaningless (and pathetic) pool of GNU Go players (you know, those guys who play exclusively with bots), etc.
- blubb: Due to daftbot (aka DrunkenGnu), I have contributed a bit to the situation Blake is complaining about, so I'd like to answer this, for what it's worth. I suppose that the potentially flawed ranks gained from bots still serve as a better estimate than arbitrary self-assigned ranks ever would. Towards true newbies, the defects of algorithmic play usually are not all that obvious. More experienced players who keep on exploiting bot weaknesses with fresh accounts in order to achieve an "impressing" nominal rank, might then simply assign the ranks (or rank estimates) of their dreams to themselves, which probably would be even farther off.
Bots introduce some extra circular dominance to the one that is already there (between humans), but I think that is nothing the rating system, as a whole, couldn't deal with. Human players who stick to one or a few opponents raise a similar issue. Solidly rated bot-beaters who play 100% with the type of bot they are familiar with, are less of a burden for other human players' rank accuracy than bot-beaters who also frequently play humans (say, 75%, as suggested above). In my view, the problem boils down to the unfortunate fact that bots don't report reverse-sandbagging opponents who deliberately lose rated games.
- Anonymous: If the rating system worked properly, robots wouldn't inflate the ratings of humans. If the rating system worked properly, a robot would have a low rank, since it was easy to exploit. If a human beat a robot with a low rank, then his rank couldn't be inflated, because he was expected to win.
- Anonymous: I take the opposite stance as Blake, above. Double the number of bots available to play. It's currently too hard to grab a bot for a new game. If a human player exploits a weakness in a bot, then the bot's ranking should plummet. Allow no time control for playing against a bot. Encourage beginners to get a permanent ranking based on games against bots, provided that bots have true rankings.
Blake's opening statement about bots implies that we admit bots on the server to help the bots get their ranking. We should also make the bots help human players to get their ranking.
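For reference, nando's proposed rule from the discussion above (no rated bot games once an account has a stable rank and more than a certain share of its games are against bots) could be expressed as a simple check. The function and parameter names are hypothetical, and the 25% default is just nando's suggested starting value.

```python
def may_play_rated_bot_game(
    has_solid_rank: bool,
    bot_games: int,
    total_games: int,
    max_bot_ratio: float = 0.25,
) -> bool:
    """Return True if the account may still start a rated game against a bot.

    Counts all games, rated or not, as in nando's description.
    """
    if not has_solid_rank:
        return True  # "?" players may always use bots to get a rank
    if total_games == 0:
        return True  # nothing played yet, so no ratio to exceed
    return bot_games / total_games <= max_bot_ratio
```

Under this sketch, free games against bots would remain available to everyone; only the rated option is withheld.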
Rating: Miscellaneous
General aims for the rating system
- RobertJasiek: Set general aims to be fulfilled by the rating system like "more wins than losses of newly played games means a rating increment", "the rating shift for winning/losing a game is public before, during, and after each game", etc.
Rating quality
Confidence Error
- Anonymous: A more sophisticated rating system. In particular, use variable (i.e. per user) standard deviation in the ratings model, so a user would have a rating and a 'consistency' (e.g. 1.2d +/- 0.5, or 18.5k +/- 5.0... show maybe 2 standard deviations after the +/-). This could also replace non-solid ratings (e.g. '14k?'... when to apply it and what it means).
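The display format suggested here (rating plus two standard deviations) could look like the sketch below. How a per-user standard deviation would actually be estimated is left open; the function name and arguments are assumptions for illustration only.

```python
def format_rating(value: float, unit: str, std_dev: float) -> str:
    """Format a rating as e.g. '1.2d +/- 0.5', showing two standard deviations.

    value   -- decimal rank within its unit (hypothetical representation)
    unit    -- 'k' for kyu or 'd' for dan
    std_dev -- per-user standard deviation of the rating estimate
    """
    if unit not in ("k", "d"):
        raise ValueError("unit must be 'k' or 'd'")
    return f"{value:.1f}{unit} +/- {2 * std_dev:.1f}"
```

For example, `format_rating(1.2, "d", 0.25)` gives `1.2d +/- 0.5` and `format_rating(18.5, "k", 2.5)` gives `18.5k +/- 5.0`, matching the two examples in the request.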
Undo: option to change game type to "free"; automatic change
- Beolach?: I'd like to suggest that when one player in a rated game requests an undo, the other player has in addition to the two options to allow or deny the undo a third option to allow the undo, but change the game from a rated game to a free game. Another, stricter option people might want to discuss would be to make it automatically change rated games to free anytime there's an undo.
Statistics: count of players of rank x at time y
- It would be nice to see statistics about how many people of a particular rank are playing at any given time. Maybe a graph? ++++
Help files: show rank comparison (eg: AGA - BGA - KGS - etc.)
- Show equivalent ranks in other systems, eg: AGA, BGA, etc, as well as KGS rank. -?
- mgoetze: I don't believe there is a simple formula for this, and if it were implemented it would be more misleading than anything else. +
Make games against bots not count towards ~ calculation
- Dan: Make games against bots not count towards ~ calculation.
Different levels for promotion and demotion:
Resignation: ask the winning player whether his opponent was equal in strength or not
- Tempus? How about an option on resignation, asking the winning player whether his opponent was equal in strength or not equal (a.k.a. much weaker). I realize there is a lot of potential for abuse in this, and some people may complain at clicking an extra button, but perhaps it could help equalize the rankings somewhat. This might also be useful when dealing with escapers who leave because they're losing very very badly, since their rank would decrease faster from the poor ratings. That's a touch mean, though. +
- tapir: This may be a good idea for allowing ? players to get a rank.
Allow a beginner to reset his/her own ranking
- Allow a beginner to reset his/her own ranking, when the estimated rank is out of whack, or where the beginner's rank has been inflated due to drift.
Ranking for Newcomers
Don't give an estimated rank to someone who has never won a game (? DONE ?)
- (Marathon:) Can the system be changed to not give an estimated rank to someone who has never won a game? I recently played someone who was rated 14k?, not realizing how far off a rank with a question mark could be. It turned out he was a lot weaker than 14K. In fact, he was a beginner. The game didn't go well -- he resigned, and hasn't been on KGS since. He might never visit KGS again. I looked at his history, and he had no wins. How does the system justify giving even a rank with a question mark to someone who has not even a single win?
Until this is fixed, we need to be aware of the problem, and try to avoid it. In my case, it was a free game. So, I could have (and should have) stopped the game, and talked with my opponent.
What could be done if it were a ranked game? Should the stronger player tell the other that he/she should resign, and then have a talk? Should there be a means that allow both players to agree to abandon a ranked game (without leaving an "escape" mark for either player) if one of the players is new and has an uncertain rank? (Marathon, July 31, 2009)
- tapir: Afaik, this is not the case (anymore). The solution is of course to talk to ? players before starting the game.