How To Align Rating Systems

Reflame: I think there is an easy and obvious way to fix the differences among rating systems - am I wrong that this would work fine?

What I suggest is that each rating system (AGA, European, IGS, KGS etc.) would have a mechanism for staying close to the other systems. If, for example, an average player ranked between 10 and 15 kyu in AGA were stronger than a player with the same rank in most other systems, then AGA could try to shift the ranks in the correct direction: whenever someone in this range in AGA loses a rated game, his rank will be moved down a bit less, say 80 percent of what is normal. And when he wins, his rank will be moved up a bit more than usual (say 120 percent).

Example: If 12 kyu AGA corresponds to 10 kyu European, and if you need, for example, 4 lost games against a 13k to get from 11k to 13k, then with this 20 percent correction you will need 5 lost games (instead of 4) to become a 13 kyu, because you are in the interval where your association knows that players are underrated and so it tries to increase the rank of players. If it were the other way round and 10 kyu AGA corresponded to 12 kyu in most others, then AGA would try to decrease the ranks of all players in this interval. (Maybe 120 percent is too much and 101 percent would do...)
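The arithmetic of this example can be sketched roughly as follows. This is only an illustration: the half-rank step per loss is an assumption chosen so that 4 losses cost 2 ranks, and `games_to_shift` is a hypothetical helper, not part of any association's rating code.

```python
from fractions import Fraction
from math import ceil

def games_to_shift(rank_gap, normal_step, factor):
    """Consecutive results needed to move `rank_gap` ranks when each
    result moves the rank by `normal_step * factor`."""
    step = Fraction(normal_step) * Fraction(factor)
    return ceil(Fraction(rank_gap) / step)

# Assumed: a loss normally costs half a rank, so 11k -> 13k takes 4 losses.
normal_step = Fraction(1, 2)

uncorrected = games_to_shift(2, normal_step, 1)
corrected = games_to_shift(2, normal_step, Fraction(80, 100))  # losses count 80 percent
print(uncorrected, corrected)
```

Exact fractions are used instead of floats so that the 80 percent factor does not introduce rounding noise into the count.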

If there are too few real-world matches between people from different rating systems (for example AGA and European), this can be solved nicely through internet servers: every KGS (IGS etc.) user would be allowed to manually fill in his real-world rank (for example: I am 8 kyu in AGA), and the go server would be able to display statistics. The European association would therefore have an easy way to compare its ranks with those of people from (for example) AGA. It seems clear to me that all technical problems with this are easy to solve.
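One possible form of such a statistic, as a minimal sketch: for each server rank, average the difference between the self-reported association rank and the server rank. The sample data is made up, and `offset_by_server_rank` is a hypothetical name; no go server is known to expose this.

```python
from collections import defaultdict
from statistics import mean

def offset_by_server_rank(reports):
    """reports: iterable of (server_kyu, reported_kyu) pairs.
    Returns {server_kyu: mean of (reported_kyu - server_kyu)}."""
    diffs = defaultdict(list)
    for server_kyu, reported_kyu in reports:
        diffs[server_kyu].append(reported_kyu - server_kyu)
    return {rank: mean(values) for rank, values in sorted(diffs.items())}

# Made-up sample: (rank on the server, self-reported association rank), both in kyu.
reports = [(6, 8), (6, 7), (6, 9), (10, 12), (10, 11)]
offsets = offset_by_server_rank(reports)
print(offsets)
```

A positive offset means that players at that server rank report a larger kyu number (a weaker rank) in their association than they hold on the server.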

I think both internet servers and real-world associations should do this - and if one or two start, the others will soon do it as well. Or does it trouble players so little that it is not worth the labour?

DrStraw: This argument breaks down immediately when you realize that a rank within the AGA is not even consistent. Different parts of the country have local tournaments and may be consistent within themselves but there are few tournaments which attract people from all regions and so it is hard to gain uniformity. Imagine how this will work if you try to apply it to all the different national organizations. Plus, whose system do you try to standardize on?

Velobici: When the Western Go Ladder was in progress it became evident that West Coast AGA ranks are several stones higher than East Coast AGA ranks in the 10k - 6k band. Unfortunately, the results of the 14 rounds that took place are no longer available on the Internet.

Malweth: So... "How to become Shodan:" Travel from East Coast to West Coast to gain 2-3 stones rank? I really have to get to congress one of these years!

Velobici: So it seems. In 2004 the West Coast folks seemed to have noticeably different AGA ranks. Will see if that remains true this summer at the 2006 US Go Congress.

IanDavis: Note that the European ranks have the same problem, but this is compounded by national organisations using different promotion systems. We even have GoR, which should help align all the ranks.

Alex: Asking all the different Go associations in the world to calibrate their rankings is like asking every country to adopt Esperanto as its official language to facilitate international communication. Sounds like a great idea in theory... in practice, simply not going to happen.

Velobici: Hmmm...national languages are intimately connected to national literatures and define sets of associations based upon word choice...consider sultry in English with its associations regarding the weather, women and sex. But ratings are not that important to how a nation defines its Go ranks compared to its language. The European Go Database is already accomplishing this task among the European countries based upon the GoR scores assigned to various people. From the EGD, we can see that France, Germany and the United Kingdom all match fairly well. Indeed, one might argue that the EGD Statistics page shows that ranks have been unified across Europe for all countries with more than a couple hundred players.

Hyperpapeterie: Yes, changing a national language creates greater difficulties, but it also has greater benefits: the ease of communication provided by a universal language would be much greater than the benefits provided by a uniform go ranking system. So I think it is somewhat the same problem, merely on a smaller scale. There are a great many associations with particular ranks: how seriously do people take being Shodan vs. being 1kyu? It marks a very important milestone in people's development, and changing systems would shift those milestones. You might argue that systems are arbitrary and change over time, both of which are true, but there's a difference between an unintended change which happens slowly (i.e. rank inflation) and a change mandated from above, just as there's a difference between the evolution of a language and the imposition of a language. Finally, I think the proposal underestimates the extent to which it would be a point of contention whose rankings were used as a baseline. How would you go about telling the Japanese/Chinese that their ranks are wrong and that they must adopt the other's?
Velobici: Please differentiate between rank and rating. Ranks would be defined by organization. There is no need for the Chinese, Japanese, Koreans or the AGA to change their ranks. 1 kyu players remain 1 kyu. Shodan players remain shodan. Ratings would be numerically assigned based upon games played against other rated players. The numerically assigned ratings do not have to include, and probably should not include, the words kyu or dan...just digits. This allows for a worldwide system of ratings, just as there is in chess, while retaining the existing national systems of ranks. (Please see the GoR column of this table for an example from the EGD.)
Hyperpapeterie: Of course they are distinct, but I find it hard to imagine that establishing an official, uniform rating system would have no implications regarding rank, or even that it is not intended to have implications for rank. Perhaps the most productive thing for me to do is ask you, or anyone else who favors this idea, exactly what the system is supposed to accomplish if it is not meant to have any effect on rank?
Velobici: What is a uniform rating system supposed to accomplish? Whenever one travels and plays go/weiqi/baduk, confusion immediately ensues regarding the appropriate handicap. Currently, a member of the AGA in Europe finds himself playing with too few handicap stones, while in Japan he is playing with too many. This is due to the attempt to use rank in place of rating. While traveling go players are more common than was the case 40 years ago, most often people from different areas play each other on the Internet using one of the go servers. When a person visits a different go server, the same problem arises as if they had travelled to a different continent...finding the appropriate handicap or appropriate set of opponents. If a particular organization wishes to have 1 dan players with the same rating as 5 dan players somewhere else, well, that's their privilege and I am indifferent to the declared or assigned rank.
I find the question as to what is the purpose of a uniform rating system to be baffling...Yoda Norimoto is a 9p, but so is Yi Changho. Yoda has a positive record against Yi. Which is stronger? Rui Naiwei also has a positive record against Yi. How do the three compare? But Yi is not necessarily the benchmark player...Gu Li seems quite strong, is he perhaps stronger than all three? This chart gives me a better answer than "Rui is stronger...naw, Yoda is...What? Yi stronger than both!" Of course, I also want confidence intervals on those ratings :). Much of human activity is measured and the measurements are used to compare abilities. George Sheehan once said, when asked "How are you doing, George?": "I don't know...let me go run first, then I'll know." One measures performance against one's own previous results as well as against others. While I will never beat Yi in a game, I believe that I can improve. Measuring and comparing that improvement with previous results is a basic activity of all competition. Why wouldn't one want a universal system? Is it preferable that each locality define its own system of measures (ratings) in go? If so, what about time? Get rid of timezones. What about distance? Measure the local leader's arm. No need for temperature scales. Protocols and standards are an essential part of non-local communication. Your question is strange to me.

Ed?: Asking all the members to also agree, and use their 'correct' rank is even less likely..., nevertheless I believe the bulk result would be of benefit - I guess the 'strongest' player in each system cannot be >10 dan as a natural bound?

AndreasTeckentrup: The strongest player cannot be stronger than 9d by unwritten agreements. Actually, the differences between the strongest amateur ranks are very much like the differences between the rating systems: in AGA ratings and Japanese club ranks, it is possible to reach 9d, while in Europe and Korea only the strongest of amateurs can reach 7d. In Europe I play as 1d, and in Japan as 4d, but the handicap I receive from the strongest available players is the same, 5 stones.

RayTomes: I disagree with Alex that it is like trying to adopt Esperanto. There is no big learning time involved with go handicaps as there is with languages. However, people have made valid points about inconsistencies with ratings even within a country. It is true that adjusting all the ranking systems of countries is a futile way to go. But internet go servers really can have consistency internally. The whole world does interact. So the way to go is to try and get the various servers to move their rating systems towards each other, and the countries will eventually follow suit.

The big issue then is keeping the ratings stable over long periods of time. I am not convinced that anyone has achieved this to date. The most promising way is to recognize that there are some go barriers that people tend to pause at on their way up, such as 10k and the more noticeable 5k barrier. I found that I was able to follow these barriers over many years in New Zealand go ratings and adjust for the slight drifts in ratings. This can probably be done as an automatic process if ratings of active players are made into a histogram and compared to past histograms, aligning the peaks (barriers) by mathematical means.
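A rough sketch of how such an automatic alignment might look: slide the current histogram over a past one and keep the shift at which the peaks overlap best. The overlap score used here (a simple cross-correlation) is an assumption, not necessarily the method used for the New Zealand ratings, and the bin counts are made up.

```python
def best_shift(old_hist, new_hist, max_shift):
    """Return the shift s (in bins) maximising sum(old[i] * new[i + s]).
    A positive s means the peaks in new_hist sit s bins further right."""
    def overlap(s):
        return sum(
            old_hist[i] * new_hist[i + s]
            for i in range(len(old_hist))
            if 0 <= i + s < len(new_hist)
        )
    return max(range(-max_shift, max_shift + 1), key=overlap)

# Made-up player counts per rating bin, with two peaks ("barriers").
old = [1, 5, 2, 1, 6, 2, 1, 0]
new = [0, 1, 5, 2, 1, 6, 2, 1]   # same shape, drifted one bin to the right

drift = best_shift(old, new, max_shift=2)
print(drift)
```

The ratings would then be corrected by the detected drift so that the barriers land in the same bins as before.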

I have given a lot of thought to rating systems over many years (I am old!) and I see some flaws in the KGS rating system at times. Ratings memory should be based on games played, not on a time interval. If a player gets in 100 games in a month then last month is no longer relevant. If a player only has 2 or 3 games a month then many months are needed for a proper rating. Understanding the variation of win probability with both handicap-error and handicap level is necessary (an extra stone handicap at 8d is a huge effect, but not much at 10k). If someone wants to write code for a decent rating system, then I can offer the correct calculations.
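The idea of a memory measured in games rather than in time can be sketched with a fixed-length window. The window of 100 games is only an assumption for illustration, and `rating_memory` is a hypothetical helper; this is not how KGS or any other server is known to work.

```python
from collections import deque

def rating_memory(results, window=100):
    """Keep only the most recent `window` game results,
    however long they took to play."""
    return deque(results, maxlen=window)

# Results are numbered in playing order; the dates do not matter.
busy = rating_memory(range(150))   # 150 games in one month: the oldest 50 are forgotten
casual = rating_memory(range(6))   # 6 games over several months: all are kept

print(len(busy), len(casual))
```

A rating computed only from this window automatically forgets last month for the busy player while still using many months of games for the casual one.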

If any common amateur scale is adopted, then the strongest players around the world will be well above 9 dan, so that should not be a consideration. The alternative is to go to very tough ranks like Korean ones.

Tapir: The notion that you can be a master / black belt when some people still can beat you on 9 stones is strange. There should not be 10 dan, or 9 / 8 dan players among amateurs.

How To Align Rating Systems last edited by tapir on January 19, 2012 - 13:30
