Bill: The simplicity of the concept is attractive. People have gotten upset when they have won a game, only to see their rating decline. As we have seen with various online ratings systems, mathematical sophistication has produced neither stability nor reliability. Besides, even simple ratings systems tend to be self-correcting.
The details can be improved. The points at stake for winning or losing are unrealistic. More importantly, it should be harder for a rating to increase, the higher it is. In the New Mexico ratings system of the 1970s, for instance, the range for each rank (half rank, actually) increased exponentially as it got higher.
Andreas Teckentrup: Is the author of this page actually in contact with WMS, the creator of KGS, or is this just a suggestion put here? Please clarify. Implementing this could also mean a lot of work and, well, it's his go server.
Harleqin: I agree. The page should be renamed to LobosProposalForKGSRating or, even better, moved to a discussion thread. The proposed system is also seriously flawed, in that it does not cope with deflation due to strength increase of the players.
Bill: I agree that the title is misleading and should be changed.
As for deflation, my experience with the New Mexico ratings system indicates that it is not a serious problem. It took 2 years for our ratings to get about 1/2 stone stronger than the ratings in California. Nobody complained about the 1/2 stone promotion. ;-)
Online servers, such as KGS, where pro strength players play sufficient games, can easily adjust for deflation. For instance, at regular intervals the ratings can be adjusted so that the average of the top few active ratings is at or above a constant. Because of deflation, these adjustments will be upwards, which is more psychologically appealing than downward adjustments.
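A minimal Python sketch of that periodic adjustment follows. The function names, the choice of `top_n`, and the `target` constant are assumptions for illustration; neither KGS nor the New Mexico system is known to work this way.

```python
from statistics import mean


def deflation_shift(ratings, active, top_n=5, target=2700.0):
    """Return the uniform upward shift that brings the average of the
    top_n active ratings up to `target` (0.0 if it is already there)."""
    top = sorted(
        (r for name, r in ratings.items() if name in active),
        reverse=True,
    )[:top_n]
    if not top:
        return 0.0
    # Never shift downward: the average only has to be at or above target.
    return max(0.0, target - mean(top))


def apply_shift(ratings, shift):
    """Add the same shift to every rating, active or not."""
    return {name: r + shift for name, r in ratings.items()}
```

Because every rating moves by the same amount, rating differences between players are untouched; only the anchor at the top of the scale is restored.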
One reason that deflation was not a real problem, despite the fact that, as a group, our players were young and improving, was the smaller ranges between low level ranks. Even spacing of ranks, as in lobo's proposal, makes it too hard to advance if wins and losses from games are constant.
Looks possible on paper, but have you tested it to see if it actually works in practice? I am happy enough with the current system on KGS. Unless I can be shown that any other works demonstrably better I don't want to change. -Ian
Andy Pierce: On the timescale of use of internet servers (a few years), I think the case can be made that nobody ever gets weaker. People learn, they read, they gain experience, they get stronger. So, why do you lose points when you lose a game? A related issue is the kind of scenario where a player maybe has an extra glass of wine, loses a couple of games, goes "on tilt" as the poker people say, and goes on to lose a whole stack of blitz games in the same sitting. This person's rating might drop a whole stone or more, and it will take what seems like forever for it to go back up. Particularly if the scale for rank advancement is exponential, how about using a system similar to the Nihon Ki-in's new promotion system, where to be promoted to 5k you simply need to win some number of even-strength games against other 5ks? If you have too much wine and drop a pile of games, you don't advance, but it doesn't hurt you either. A factor could also be added so that you have to win against a wide diversity of 5k players, and so don't get as many points for beating up on the same person (or program) over and over again. Such a system would also take care of the Fear of Losing problem, and people could just get on with enjoying playing their games again.
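One way to picture the diversity factor in this suggestion is the Python sketch below. The function names, the `decay` weighting, and the `required` threshold are all assumptions made for illustration; none of them come from the Nihon Ki-in rules or from KGS.

```python
def promotion_credit(wins_by_opponent, decay=0.5):
    """Total promotion credit from wins over same-rank opponents.

    wins_by_opponent maps an opponent's name to the number of wins
    against them. The first win over an opponent counts 1.0, the second
    `decay`, the third `decay ** 2`, and so on (assumed weighting).
    Losses are simply not counted.
    """
    return sum(
        sum(decay ** i for i in range(wins))
        for wins in wins_by_opponent.values()
    )


def is_promoted(wins_by_opponent, required=5.0, decay=0.5):
    """True once the accumulated credit reaches the assumed threshold."""
    return promotion_credit(wins_by_opponent, decay) >= required
```

With these assumed values, five wins over five different opponents earn promotion, while twenty wins over the same opponent never do, since the credit from one opponent stays below 2.0.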
Bill: IMX with the New Mexico system, it worked quite well, especially once we corrected for the observed deflation with regard to California. The system was designed so that the players could figure their own ratings without relying upon computers. It was robust, easy to explain, and psychologically acceptable. You win, you advance, you lose, you go back. Nothing mysterious.
I have observed various complaints about online ratings systems over the years. Some are psychological, as when your rating does not track your results. More serious, in my mind, are the instabilities that show up from time to time, when ratings are drastically redone. Also serious is the general failure of ratings to predict proper handicaps. I think that this is the result of devising rating systems solely to predict the results of even games.
In regard to the Nihon Ki-in system, it is similar to what they had before, and to what associated amateur clubs, such as the one in Honolulu, used. It is also similar to what the American Contract Bridge League uses. There are two problems associated with ratings that only increase with wins. First, wins do not gain as much as when you can lose points. That actually slows down the advancement of rapidly improving players. Second, rank inflation is rampant. The ACBL has had to repeatedly introduce new requirements for advancement to keep their ranks meaningful. As for go, how many modern 9-dans are good enough to have been 8-dans 100 years ago? 50 years ago?
RobFerguson: People DO get weaker. I play go while I'm stressed out about work, and I might take a break from go from time to time and get weaker after the adjustment. Regardless, the KGS system is seriously flawed...
I've won 12 games in a row before and not been promoted. In general it's designed to make losing especially harsh. As a result, people tend to have multiple accounts on KGS or turn their rank off, and the system becomes seriously FUBAR. With the way the system currently works, when you turn off your rank it removes your games from the rating pool!
The proposal is fairly similar to the TOM ranking method. Unfortunately, with WMS's new job I doubt he'll be interested in trying to change the ranking system, but having a discussion about it and proposing a complete system is a way to get attention. The only people who like the KGS ranking system now are those who play infrequently or abuse it.
Hopefully the squeaky wheel will be greased :P
xela: Actually, those of us who play infrequently don't all like the current system. If you haven't played a lot of games then your rating tends to fluctuate widely and unpredictably. For me this is the worst property of the current system.
I don't understand why go players generally seem resistant to some variation of the Elo Rating idea. It's easy to implement, easy to understand (users can predict how their rating will be affected by their results) and seems to be working just fine for chess players as far as I know.
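For reference, the basic Elo update can be sketched in a few lines of Python. The 400-point scale and `k=32` are the conventional chess values and are assumed here; they are not taken from this discussion or from any go server.

```python
def elo_expected(r_a, r_b):
    """Expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a, r_b, score_a, k=32.0):
    """Return the new (r_a, r_b) after one game.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw or jigo.
    """
    delta = k * (score_a - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta
```

For example, `elo_update(1500, 1500, 1.0)` moves the winner to 1516 and the loser to 1484. A player can work out the same numbers by hand before the game, which is the predictability being argued for.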
Velobici: Elo developed his rating system before computers were available for ratings calculations. Elo made a number of approximations in his system to simplify the calculations required. At the time, one could make an argument in favor of these simplifications as necessary trade-offs: accuracy vs. tractability. Such is no longer the case...at this time Elo contains unnecessary inaccuracies.
It's worthwhile restating Elo's comments on rating systems:
Often people who are not familiar with the nature and limitations of statistical methods tend to expect too much of the rating system. Ratings provide merely a comparison of performances, no more and no less. The measurement of the performance of an individual is always made relative to the performance of his competitors and both the performance of the player and of his opponents are subject to much the same random fluctuations. The measurement of the rating of an individual might well be compared with the measurement of the position of a cork bobbing up and down on the surface of agitated water with a yard stick tied to a rope and which is swaying in the wind.
xela: Yes, I agree with Elo's comments. Ratings are never going to be entirely accurate, especially for amateur go players playing on a server. That is why I would prefer an imperfect but simple system to something theoretically correct that gives unpredictable results. (Besides, I don't believe that a theoretically correct system actually exists--but that would be a whole separate discussion!)
Bill: The idea that go strength can be reduced to a single number is a fiction. The idea of an accurate rating is also a fiction. The ability to calculate ratings with more precision than before may be worth nothing.
Flower: Aye, it is quite impossible to reduce skill to a single number within a rating system :) I like the Bayesian approach of rating systems like Glicko, as they try to assign an interval instead of a single rating. The addition of the 'Volatility' especially seems promising for detecting sudden skill changes, or for keeping the 'Rating Deviation' high for people with significant skill fluctuations :)
Uberdude: All of the numbers are multiples of 10, so you should divide them all by 10 to make it simpler.
Cyrion: Interesting new ranking method; however, the system for new players seems to have some problems (if I understood it well). What happens if a 9d in real life comes and plays against a 30k? Is he considered a 30k, and does he therefore win 1000 points by beating the other 30k? If that is the case, then by winning all of his first 10 games he will only achieve a rank of about 20k. After that he will keep on stealing everyone's points on his way to 9d, and that will take a lot of time (more than 100 games, I guess). It might be that I did not understand the system; if so, I apologise. However, if I did, I would advise that when you register as a new player, you choose your level, like on IGS, and get the corresponding amount of points.
tapir: Would not all be fine with the KGS ratings if players feeling seriously misrated had the option to switch into the ?-mode, as if they had not played for some time? That would correct their rating (hopefully upwards) without deflating someone else in the meantime.
gryn: Not to sound like an idiot, but could you please give examples of using the chart? It's rather confusing to me; I can interpret it several ways. One simple question you could answer is: are all games (except perhaps new user games) an equal exchange? That is, does the chart tell us a single number, and the two players are playing in order to exchange that amount amongst themselves? Or do you find a value for yourself, and a value for your opponent, and you win or lose your value, and he his?
lobo: First of all, I'd like to point out that I'm not in any way 'backed up' by WMS on this. This page is just my proposed alternative, which I wanted to polish in the coming week and then present to a broader public. I now see that checking 'minor edit' wasn't enough to hide it from your eyes ;) But now that we're all here, let's try to modify it together through constructive discussion. I'll try to address some of the issues you've pointed out:
If a name change is in order, please do so. I'm fairly new to SL and don't know all the policies and rules, so a little guidance from somebody more experienced wouldn't hurt.
Rank deflation: As somebody pointed out, it shouldn't be a problem, and if we left things as they are now we could address it by periodic 'correction'. However, I originally intended to have a maximum amount of points a player can gather. Thus a 9d who wins lots of games stops 'advancing' and in return pushes those who lose to him down the ladder. Thanks to fairly quick propagation of promotions/demotions (it can't be too fast or ranks will change on a daily basis), such strong players (KGS has a handful of those) would serve as a measure for ordering the rest. This brings us to another problem that could arise.
Lack of points: Having a closed population of players that have a 'fixed' amount of points, and allowing promotions based on taking those points from others, would be a disaster. Progress of one would mean a demotion of others, and we all know that doesn't exactly reflect reality. But on a go server our population isn't closed: we gain members constantly and those people are our source of points. Also, thanks to multiplication, one new member injects enough points to cover a couple of promotions. That would of course create inflation, but that topic has already been addressed.
Theory vs Practice: No, I didn't test it, and in fact I don't think it can be tested without a very advanced computer model that would take way more time and resources than actually implementing it. There are, however, multiple go servers that have a points-based system and they seem to be doing well. Actually, if something as wrong as the current KGS system can 'work' and make people actually say things like 'I am happy enough with the current system on KGS', then my confidence in this proposal is quite big :)
Same amount of points for win/loss: I don't have all the answers, and if along the way we figure that needs to be changed we can do so, still maintaining simplicity. However, I know this: people do get weaker, whether it's because they use KGS around screaming children, they had a bad day, or they just try something new. The new system doesn't try to save the world and make us all happy; its main purpose is to reflect your CURRENT playing ability in the best possible way, nothing more.
Nihon Ki-in system: Bill has already pointed out its flaws. I'd only like to add that every ranking system tries to achieve basically the same principle: you win against 5d, you are 6d; you still win against 6d, you are 7d. What differentiates those systems is that we have downward regulation of ranks, while the Japanese system is more of an 'I feel like a 3k today, I'll try to achieve this rank on the exam'. Multiply this attitude a couple of times and we have an army of wannabe 3k players who are in fact 6k, but if they play each other for the right to be 3k it will look like they really are, when in fact they're not.
Elo: I don't know the system but I agree that this is just an approximation, once again I don't want to save the world with this and make the-only-correct-ranking-system-ever. I just want to play go with people that have a similar strength as me.
Uberdude: They are multiplied by 10 for the sole purpose of getting rid of fractions in the Mother of all tables. If we divide them by 10, we have to do the same with the points gained to maintain the same pace of progress.
Example for gryn: Right now I'm biased towards the winner getting +X points and the loser getting -X points (X being the number stated in the table).
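Read that way, each game is an equal exchange of a single amount X. A tiny Python sketch of the bookkeeping follows. The function name is hypothetical, and `stake` stands in for the X looked up in the table, which is not reproduced here.

```python
def settle_game(points, winner, loser, stake):
    """Return a new points table with `stake` moved from loser to winner.

    `stake` is the single value X for this pairing; how X is looked up
    from the two players' ranks is left out (assumed to happen elsewhere).
    """
    updated = dict(points)
    updated[winner] += stake
    updated[loser] -= stake
    return updated
```

The total number of points in the table is the same before and after the game, so new points only enter the system with new members.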
Implementation: If what you say is true and WMS can't be bothered to code this I could try to do it myself, but I would have to know what the interface for current KGS ranking functions looks like.
PetriP: This adding and subtracting of points just does not work. It is like Elo, just less accurate. So first simulate it with some kind of population with known win/loss probabilities. It is in use on some go servers, and usually the ranks on those servers give a weaker hint of an opponent's strength than a KGS rank does.
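A simulation of the kind suggested here need not be elaborate. The Python sketch below is one rough starting point. It assumes a flat `stake` per game, random pairing, and a logistic win model on a 400-point scale; none of these come from the proposal, whose table would replace the flat stake.

```python
import random


def simulate(true_strength, games=20000, stake=10, seed=0):
    """Play random pairings and return the final points table.

    true_strength maps each player to a hidden strength value. The
    win probability is an assumed logistic function of the strength
    difference; the proposal's own table is not modelled.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    players = list(true_strength)
    points = {p: 0 for p in players}
    for _ in range(games):
        a, b = rng.sample(players, 2)
        diff = true_strength[b] - true_strength[a]
        p_a_wins = 1.0 / (1.0 + 10 ** (diff / 400.0))
        winner, loser = (a, b) if rng.random() < p_a_wins else (b, a)
        points[winner] += stake
        points[loser] -= stake
    return points
```

Comparing the ordering of the final points with the hidden strengths, for different stake rules, gives a first indication of how well a points scheme tracks real strength.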
Nick: The existing KGS system is not broken. It is not perfect, but nor is it broken. A great deal more knowledge, thought and experience has gone into the existing KGS system than into this "new proposal". In my opinion, this "new proposal" is not really worth anybody's time. (BTW I am not connected with KGS except as a relatively new user, but I have taken the time to understand a few ranking systems, including Elo, and have observed the problems they have met with over a period of about 30 years).