RobertJasiek: "SOS is one of the few tie breakers recommended by both the American Go Association (ext 1) and the European Go Federation (ext 2). The EGF prefers the SOS-1 and SOS-2 variants,": Making statements like "the AGA recommends" or "the EGF recommends" is very strong. But what are the facts? In the case of the AGA, we have a webpage on its site. Is this the opinion of the AGA or only of the page's author? In the case of the EGF, there is at least an official tournament ruleset. However, the context is cited arbitrarily: What the EGF prefers the most is not cited. When I cited it, it was deleted. Is the purpose of the parent page to misinform the readers? There is no need to make strong statements with improper citations. Why not be neutral like: "Some associations or federations have SOS or variants thereof among their recommended tiebreakers." This would be correct, not overly strong, and relevant for the page's topic.
Herman Hiddema: The AGA lists this as "The Official AGA Tournament Guide", and refers back to it in the official policies and procedures for running AGA tournaments (PDF). The preference of the EGF not to break ties has been given a place of prominence, with the "when you absolutely have to break ties" linking back to an explanation that this is not necessary. The only EGF preference not cited then is Direct Comparison, which I have moved up to the top on the tie breaker page. The AGA does not recommend direct comparison over SOS. Also, there is no need to mention every other tie breaker on every specific tie breaker page. The purpose of the page is to explain to the average reader what SOS is, without going into too much detail or mathematical specifics. The current version is not overly strong, IMO.
RobertJasiek: I think we are approaching each other. Maybe my latest edits for greater neutrality and less details find also your agreement?
Herman Hiddema: Yes, I think we are approaching each other. I have made further edits. I have reintroduced the "if you absolutely have to" phrasing, because both the EGF and the AGA specifically recommend against breaking ties. I think stressing this with somewhat stronger wording is appropriate. I have also changed the wording to "organisations like the EGF and AGA", because this gives the reader an indication of the size/importance of the organisations in question. I do not know whether any programs specifically advise for or against SOS, many seem to be content to provide lots of options (eg PyTD provides 11 tie breaking options), but many do use SOS as their default option, which I have therefore mentioned. I have also added the purpose of SOS to the introduction, because this is important information for those curious about what SOS is. I have added a disclaimer that it is vulnerable to noise (and may therefore not always fulfill its purpose)
PS: I have unindented your reply, I think it improves readability if we stay at different levels of indentation, giving a direct visual indication of who said what.
RobertJasiek: The WME by Bass contains a lot of opinions not shared by everybody. Before everybody agrees on a neutral point of view, the varying opinions still need to be discussed and synthesized. E.g., I disagree already with the first statement "is a good primary tiebreaker". In my opinion, it is a bad tiebreaker (although there are yet much worse tiebreakers like SODOS).
More information can be found in the EGF Tournament System Rules.
xela: There has already been a lot of arguing about SOS; it seems that it's a topic on which it's impossible to reach a unanimous opinion. It seems to me that presenting a list of arguments for and arguments against, without advocating any of those arguments, is a reasonable compromise. I agree that the sentence "SOS is a good primary tiebreaker" is controversial and does not belong on the main page. However, I'd like to see the rest of the lists below restored to their position on the main page.
RobertJasiek: It is possible to reach a NPOV by listing both advantages and disadvantages. However, in that case prejudiced arguments should be avoided or every single argument would have to be listed in every prejudiced subview. Of course, controversial general statements should be avoided altogether on the main page. Furthermore, a list of arguments should not be arbitrarily selective (and thus prejudiced) but include all important arguments. If the WME list of arguments were to be restored, I would also want my earlier list of arguments to be restored on the main page and would want to add counter-arguments for the WME arguments. E.g., "SOS reflects the strength of opposition rather well" is met by "SOS reflects the strength of opposition rather badly" (or better yet: reasons are supplied as well). Such an editing of the main page would necessarily make it too long and not really useful. So I think that creating a real NPOV would be by far better. E.g., "SOS is a tiebreaker. In particular, if some tiebreakers are used at all, it could be the primary tiebreaker. Etc." Wikipedia is a good model for what NPOVs look like.
Bass: NPOV does not mean "leave out facts that someone could possibly consider opinions". One can replace the "good tiebreaker" with "the best tiebreaker" or with "the least bad tiebreaker available until someone finally implements MLES", depending on just how unclear one wants to be. The bit about not using tie breakers at all was already in my original text.
Also, if you argue that SOS does not reflect the strength of opposition exactly as well as MMS reflects the strength of a player, then you are simply mistaken.
RobertJasiek: "could consider opinions": Let me examine how far your text contains opinions (rather than something that somebody might misinterpret as opinions):
"replacing good by the best": You argue as if SOS were, among the tiebreakers, necessarily the best one (provided we use some at all). Not so. There are other tiebreakers worthy of consideration, like Iterative Direct Comparison or Board Points. It is not at all a priori clear why SOS should be the best (apart from the absolute question of whether it is on the good or bad end among tiebreakers). Saying "the best" instead of "good" has two problems: 1) It contains opinion at all. 2) It creates an essentially unprovable statement. Saying "good" has "only" (1) as a problem.
"you are simply mistaken": We are not here to exchange opinions but facts and reasons. We get nowhere if I reply: "no, you are".
"if you argue that SOS does not reflect the strength of opposition exactly as well as MMS reflects the strength of a player": If I assume that with "the strength of opposition" you mean "the SOS of a player", then your phrase becomes: "if you argue that SOS does not reflect the SOS of a player..." This self-reference, however, is meaningless. To reply, I need to know what exactly you want to express with "the strength of opposition". Please define it!
Dieter: Robert, that is not fair! (without defining fair) We are discussing here whether the clearly defined and measurable SOS is a good representation of the intuitive and immeasurable concept of "strength". Unless you want to hear:
SOS represents scores of opponents, by adding them. Whether SOS properly represents strength, depends on the pairing system. In Swiss pairing it will be widely off mark. In MacMahon it will be closer, since "score" there involves "set rank" which itself is a representation of strength, again debatable but especially in the higher ranks supported by rating systems with proven consistency.
RobertJasiek: Now you make an understatement; in your previous editing you have listed more aspects we are discussing. Besides my previous reply was about yet another aspect: Identification of opinion. Of course, we can discuss every of the aspects (and others) and then add the conclusion for every aspect on the parent page. But I think we should maintain some structure in our discussion, or it will lead nowhere.
Herman Hiddema: I think you can argue that SOS is a good tiebreaker, because:
- It is recommended by the EGF
- It is recommended by the AGA ( http://www.usgo.org/resources/howtd.html#tiebreak)
- It is widely used, an indication that tournament directors find it a good tiebreaker.
I also think that "strength of opposition" is not an ambiguous term; I think there would be little confusion among players about what "strength" and "opposition" mean in terms of a go tournament. Also, I think that SOS does indeed reflect strength of opposition quite well, especially in McMahon tournaments. It is not perfect of course, but no tiebreaker is.
- That there are specific advantages does not make a tiebreaker "good" yet. One has to compare them with the disadvantages. Weighting then forms personal opinions.
- Being recommended by a federation or association is an advantage, but one should look more closely at how and why the recommendation is given: The EGF Tournament System Rules specify this a bit more specifically (see there) but do not make an absolute statement "good tiebreaker". At least the EGF qualifies it as a tiebreaker whose usage is valid, in contrast to specifically not recommended tiebreakers.
- I see many tournament directors (of not so big and important tournaments) who use things (like pairing programs with their default settings) with little or no thought, as long as it works somehow (i.e., a pairing is made and players are sorted at all). This is not a proof of quality of SOS! Becoming aware of SOS and its background requires a learning curve. I remember the days when I was in the 5-15 kyu range: Then for me SOS values were just some numbers; it was funny to look at them, but I had no idea what they were. The above-average tournament director will be able to explain the SOS definition, but knowing or even evaluating its characteristics and comparing it with other possible tiebreakers (or sharing places) is another level.
- "strength of opposition" has been used with different meanings for years on rec.games.go. Rating, rank, tournament score, current tournament score, or others.
- Why do you think that SOS reflects strength of opposition quite well?
- Well, what would you define as "good"?
- I think there is very little discussion that the preferred method of breaking ties is to play more rounds or playoffs. Most tournaments, however, do not have the time for this. So usually, more rounds/playoffs are not under consideration as tiebreakers due to external constraints. The other tiebreakers listed in the EGF Tournament System Rules are:
- Direct comparison
- Number of board wins (only applicable in team tournaments)
- SOS, SOS-1, SOS-2
- Rating/Previous order (related concepts, rating is a possible previous order)
Direct comparison is nice, but is in my experience rarely useful for bands larger than 2 players (with 3 players, circular wins happen regularly, and with 4 or more, they have rarely all played each other). So for individual tournaments, if you are going to break ties (which we assume you are, otherwise discussing SOS would be pointless), the primary EGF recommendation that is useful in all cases is SOS or a variant.
- (renumbering to line up with numbers of RJ's post)
- Many inexperienced tournament directors do indeed use SOS because it is the 'default' tiebreaker in their pairing program. But this in itself does not disqualify it from being good, does it? In fact, many experienced tournament directors also use it.
- There are certainly multiple valid definitions of "strength of opposition", but in the context of a McMahon tournament, the McMahon Score of an opponent is actually a combination of both prior rank and tournament performance, and is generally a reasonably accurate assessment of a player's skill. If it were not, we shouldn't be using the McMahon system. And if it is, then SOS reflects "strength of opposition".
- An example: In a 5 round McMahon tournament with lower bar at 20 kyu, the SOS of a 1 dan could be 110. The SOS of a 5 kyu could be 90. I think there is little doubt that 90 SOS reflects the fact that the 5 kyu had significantly weaker opponents than the 1 dan. As such, SOS reflects "strength of opposition". This example is not useful in the context of tie breaking, but it does show that generally, higher SOS is related to stronger opponents. The question is whether a 1 point difference in SOS is still significant. I think the significance in such a case is low, but not non-existent. I would say that it has a higher chance of being reliable than random lottery, and the EGF Tournament System Rules agree, it seems.
- A "good" tiebreaker (under the assumption that some shall be used) is significant (in the mathematical sense), contains only useful information (no noise), and retrieves all the available information of its own nature. (This is a rash description, maybe I could improve it with more time.)
- In MacMahon / Swiss, Direct Comparison can be applied only in some cases, but when distinction of the first player is the aim, then exactly 2 tied players is not rare.
- Applicable in all cases I do not equate with useful in all cases, as you suggest.
- "strength": One just has to be very careful to know exactly which type one is speaking of.
- The EGF tournament rules have not precisely defined the degree of significance of SOS. For a more detailed theoretical discussion of significance, we need maths. Rash arguments like "higher chance of being more reliable than random lottery" won't do.
unkx80: Let's try this. I think Bass's so-called opinions hide a lot of facts, so perhaps these can be used.
RobertJasiek: The degree of neutrality of your text goes in the right direction. I do not agree on all factual aspects of your rephrased statements though. (If necessary, I can correct that later.)
Bass: To Robert: I'll try to keep this as short as possible, and list only the points where I think we disagree:
And of course we should not forget:
which I'd like to transform into a challenge: please give a single (one only!) example of a better tie breaker that can be used in a perfectly ordinary one-weekend McMahon tournament in February 2008.
If you are not comfortable with having "better tie breaker" mean "a tie breaker that correctly guesses the result of a rematch more often", you are very welcome to supply your own definition.
Bass: Regarding points 4,5 and 6.
Let's make a hypothesis that it is possible to construct a tournament result so that two players are tied with the best MMS, SOS chooses player 1 as the winner and direct comparison chooses player 2.
Then, if those results would show that player 1 were actually more likely to score better were the tournament further continued, then wouldn't you agree that in this case SOS is a better tie breaker than a coin toss (which has a 50% chance of getting it right), which in turn is better than direct comparison?
Because constructing tournament results is such a boring task, let's use the results from a perfectly ordinary 5 round McMahon tournament (yes, it took a while to find such a clear-cut example.)
I cannot give you the exact likelihoods because I lack the software, but it will take very, very much convincing before I believe that Terwey Matthias's score (which, in addition to the "all-crucial" direct encounter win, includes a loss to a player with a lower MMS and two games against players who have scored 3 MMS worse) indicates a "better ability to collect MMS" than Lu Ji's wins against all the players at the next best MMS and a loss only to a player with the top score.
- Would I want to consider "more likely scores as better in a hypothetically continued tournament if the more likely is given due to information collected (also) before the tournament"? No, because I think that, once a tournament has started and the players are set into its system, only the results shown during the tournament should matter. Otherwise we should not call the thing "tournament" but "end of a series of evaluated events". For the latter, one could (if one trusts in its meaning and precision, what I do not do) define a deadline and award prizes to players simply in order of their rating at that very moment. (There are yet more complicated methods like introducing weights depending on time and / or number of following games, but such then enables "everybody" to become the winner if only the weights fit just him well accidentally.)
- Would I want to consider "more likely scores as better in a hypothetically continued tournament if the more likely is given due to information collected during the tournament"? Yes (if ties need to be broken at all). Several tiebreakers fall into this category, e.g., direct comparison and SOS. (This is not a sufficient criterion for a tiebreaker to be good though.)
- Is SOS better than Coin for the purpose of assessing more likely scores as better in a hypothetically continued tournament if the more likely is given due to information collected during the tournament? Yes, if this is considered on average for an infinite number of tournament instances modelled for the same date. (Compare the law of large numbers.) No, practically speaking, because one never has an infinite number of tournaments, not to mention on the same date with the same players.
- Is Coin better than direct comparison? Yes, if being applicable the more often is the only criterion of evaluation. No, if the degree of meaning when both tiebreakers are applicable is the only criterion of evaluation. (I would not want to suggest to search for greater meaning in averages of Coin compared to averages of direct comparison, when also weighing its applicability, because such a weight is chosen and interpreted somewhat arbitrarily necessarily. Counting frequencies of applicability is a weight but does not weigh degree of meaning yet, so finding a more appropriate weight becomes guesswork.)
- How to evaluate Terwey's achievement in the example tournament depends on the aims of evaluation. Implicitly you suggest aims that favour SOS. But why don't you also discuss the disadvantages of SOS in all detail? If the aim is to approach the knockout system model by means of MacMahon, then direct comparison is suitable because neither it nor the aim continue to consider the players already eliminated from the consideration of winning the tournament (because of having won too few games and so got too low MMS). If the aim is to evaluate each player's achievement in a context of everybody else's achievements, then direct comparison, SOS, and other tiebreakers are candidates for a tiebreaker to be used (but a lot of advantages and disadvantages will have to be studied, compared, and evaluated).
Bass: Your comment point 5 ends just a little bit early. The last sentence should read: "advantages and disadvantages will have to be studied, compared, evaluated, the study accepted by peers, and then conclusions can be drawn". The first 3 phases are already done (the summary of this study regarding SOS is in the bit you removed from the SOS page as an opinion), we are at the peer review phase, and since I am an amateur at this kind of collaboration, I already advanced to the conclusion phase without waiting for comments. Sorry about that. On point 4: I do not think many tournament organisers will want to include more meaning in their tie breaker, if that meaning sometimes is "reward the player who is most likely to lose".
unkx80: Regarding point 6. Typically our side will organize a one weekend tournament with 7 or 8 games, Swiss pairing. Prizes are sometimes given to the top 3 to 6 players, and I have seen as many as 10 prizes given out in the kids group (to make the kids happy). The typical distribution for 7 games is that, you have 0 or 1 players getting all 7 points, 2 or 3 players getting 6 points, and then a bunch of players getting 5 points. I see this kind of distribution for WAGC as well. Therefore, tie-breaking is always necessary. Following common practice, SOS followed by SOSOS/SODOS is used, but I am not going to argue whether this is a "good" tie-breaker.
RobertJasiek: Prizes need tiebreakers only if the prizes are indivisible. In Germany it is common to issue top prizes (often money) for the top places and access to a book table for choosing 1 book for the field players with many wins. Mostly the order of choosing a book hardly matters; it is very much like having divisible prizes.
SOS is a good primary tiebreaker. Use it when you must break ties but do not have the possibility of arranging a rematch.
Arguments for using SOS as a tie breaker:
Arguments against using SOS as a tie breaker:
RobertJasiek 2003-10-04: For a single player, greater SOS for him than smaller SOS for him could be interpreted as greater strength of his opponents during the tournament. For any two players, a meaningful comparison is hardly possible because it is unclear
Some argue that SOS would be fair on average over many tournaments, but this is refuted by the law of large numbers. It requires an infinite number of tournaments to allow that conclusion, while no player can ever play an infinite number of tournaments. Even worse, specific titles are issued only once per year, and tournament conditions and a player's development change.
To summarize, SOS can be used to help a program to make its pairings but SOS ought never to be used as a tie-breaker in the final tournament results ordering, where it behaves like a random variable.
I agree with some of Robert's points but not all of them. It is good if the organiser uses SOS when pairing to attempt to make sure that all players on the same score have equally difficult opponents. IF this happens then SOS is less useful as a tiebreaker. However, this is rarely done in my experience. SOS does not behave like a random variable. It is never worse than a random variable for resolving ties and sometimes much better.
SOS is a rough measure of how strong your opponents have been. Normally if you lose early in a tournament you get to play others who have also lost early, who will typically turn out to be weaker (and finish with fewer points) than others who win their first few games. Losing early is sometimes known as "the Swiss gambit" as without tiebreakers it can be advantageous to lose early, play weaker opponents and reach the final rounds fresher than other players who have won consistently and thus played harder opponents.
In general, SOS performs poorly at the extreme ends of a tournament. It is close to random in deciding first place. In the middle of the tournament it works quite well at ordering players who have finished on the same number of wins.
Normally there is no need to order players in the middle of a tournament, and tiebreakers are most needed to decide first place. :(
RobertJasiek 2003-10-08: We agree that SOS can be used as a means to assist a program to make pairings during the tournament, although there is no consensus which pairing strategy should be considered the best. - You claim that SOS does not behave like a random variable. I disagree. It behaves differently from coin tossing, sure. However, there is no general description yet that would explain that difference, there are only empirical tests. With more empirical tests observations could differ. To get a general statement about the difference between SOS and coin tossing, one needs more than empirical tests: One needs probability theory that is not only some theory but also explains what one observes. - You claim that SOS is never worse than a random variable. I think you mean "coin tossing" as a concept for a random variable. I disagree: Suppose a 1 round Swiss tournament with 2 participants A and B of known equal strength and the game result A beats B. Then we always, i.e. also on average over many tournaments, have SOS(A) = 0 and SOS(B) = 1. For coin tossing it could be either Coin(A) = 1 and Coin(B) = 0 or Coin(A) = 0 and Coin(B) = 1, each with 50% probability. On average we get Coin(A) = 0.5 and Coin(B) = 0.5. Since we assumed equal strengths of both players, the tie-breaker Coin is much better than SOS in this tournament. Hence SOS can be much worse than coin tossing. - Concerning the early rounds of a tournament, one cannot assess a tie-breaker fairly since it is random how strong every player's opponent (among the pool of available opponents) is. The WAGC is a good (since extreme) example of that: There are, say, 3 very strong participants (Korea, China, Japan) and many weaker participants from intermediate to very weak. The 3 very strong players should win all their games until they meet each other. For the games before, SOS is not fair since very weak opponents will make fewer wins than intermediate opponents.
Even a modified SOS is not fair (only slightly better on average); e.g., SOS-1 that throws out the weakest opponent or SOS-round1 that throws out the first round. Such a modification does not solve the fundamental problem that even after eliminating extremes the remaining probabilistic expectations for one's opponents are not the same. - You claim that tiebreakers are most needed to decide the first place. For which purposes? To distinguish final results that cannot be distinguished sufficiently meaningfully? The number of wins determines the strongest players, then tiebreakers make some of them luckier by giving titles, prizes, and honour only to them, while Go was supposed to be a game of greater skill and not of greater luck.
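The two variants named above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the sample scores are hypothetical, "weakest opponent" is taken to mean the opponent with the lowest final score, and the opponents are assumed to be listed in round order.

```python
def sos(opponent_scores):
    """Plain SOS: sum of the final scores of all opponents played."""
    return sum(opponent_scores)


def sos_minus_1(opponent_scores):
    """SOS-1 as described above: drop the lowest-scoring opponent."""
    if not opponent_scores:
        return 0
    return sum(opponent_scores) - min(opponent_scores)


def sos_without_round_1(opponent_scores):
    """SOS-round1 as described above: ignore the first-round opponent.

    Assumes the list is ordered by round.
    """
    return sum(opponent_scores[1:])


# Hypothetical final scores of one player's five opponents, in round order.
opponents = [3, 4, 1, 5, 4]

print(sos(opponents))                  # 17
print(sos_minus_1(opponents))          # 16 (the opponent with score 1 is dropped)
print(sos_without_round_1(opponents))  # 14 (the round-1 opponent with score 3 is dropped)
```

Both variants only remove one term from the sum; the remaining opponents' scores still enter unchanged, which is the point made above.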
LordOfPi: It seems there is a little mistake in that argumentation. If player A and player B are of equal strength then the average SOS over many such tournaments will converge to 0.5 for both because both will win half of the games and lose half of them.
RobertJasiek: You are referring to my trivial example tournament? There is not a mistake in it. The tournament is defined so that it is player A that wins. (Maybe we could say that player A knows how to beat player B's style, even though generally they are of equal strength.) We can invent another example: Like that example, but the win is given due to a 50% winning probability of the player A in each one-game tournament. Then after an infinite number of such tournaments, SOS converges to 0.5 for each player. With a still finite number of such tournaments, however, often either player A or player B would win more. Say, player A wins more. Then SOS(A)<=0.5 and SOS(B)>=0.5. The tie-breaker Coin is not the game result coin but is still 0.5 for each player; it is thrown only as a tournament final result list tie-breaker. Interpret that type of tournament as you like...
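The numbers in the two one-game examples can be reproduced with a small sketch. The function names, the seed, and the tournament counts are assumptions made for illustration; the only rule used is that in a 1-round, 2-player tournament each player's SOS equals the opponent's score.

```python
import random


def one_game_sos(winner):
    """SOS of A and B in a 1-round, 2-player tournament.

    Each player's SOS is just the opponent's score (1 for a win, 0 for a loss).
    """
    score = {"A": 0, "B": 0}
    score[winner] = 1
    return {"A": score["B"], "B": score["A"]}


# First example: the tournament is defined so that A beats B.
print(one_game_sos("A"))  # {'A': 0, 'B': 1}


def average_sos(n_tournaments, p_a_wins=0.5, seed=0):
    """Average SOS over n one-game tournaments where A wins with probability p_a_wins."""
    rng = random.Random(seed)
    total = {"A": 0, "B": 0}
    for _ in range(n_tournaments):
        winner = "A" if rng.random() < p_a_wins else "B"
        result = one_game_sos(winner)
        total["A"] += result["A"]
        total["B"] += result["B"]
    return {player: value / n_tournaments for player, value in total.items()}


# Second example: a finite number of tournaments with a 50% winning probability.
# The two averages always add up to 1, but they are exactly 0.5 each
# only if A and B happen to win the same number of games.
print(average_sos(11))
print(average_sos(100_001))
```

With an odd number of tournaments the averages can never be exactly 0.5, so one player always ends up with the lower average SOS; the expected value of Coin, by contrast, is 0.5 for each player by construction.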
tderz: Robert, common sense tells me that it makes no sense to discuss the best tie-breaker (SOS or coin) here, as there is none - A won ("game result A beats B").
kokiri: I find it slightly odd agreeing with Robert Jasiek, but here goes anyway: I can understand why people would think of using this as a tie breaker, but is there any real substance behind it?
mgoetze: What do you mean, substance? The idea is simply that when two people achieve the same result, the one who did it despite tougher opponents deserves a slightly better placing. But this is an idea and not a substance...
kokiri: ...so does SOS tell you who beat the tougher opponents? Really? I admit that I haven't thought it through completely, but I have my doubts. I'll give it some thought and get back to you...
mgoetze: No, it tells you who played against the tougher opponents. If you want to know who beat tougher opponents, you're looking for SODOS. However, SOS is usually preferred because most people also consider a loss to be somewhat worse if it was against a weaker player. ("Tougher" in this case is defined as "having done better in the tournament".)
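The difference between the two can be shown with a minimal sketch. The player names, results, and final scores below are hypothetical (a few players taken from an imagined larger field), and the function names are chosen only for this illustration.

```python
def sos(games, final_scores):
    """Sum of the final scores of every opponent played."""
    return sum(final_scores[opponent] for opponent, _won in games)


def sodos(games, final_scores):
    """Sum of the final scores of defeated opponents only."""
    return sum(final_scores[opponent] for opponent, won in games if won)


# Hypothetical final scores (number of wins) of the opponents involved.
final_scores = {"R": 3, "S": 1, "T": 0, "U": 2}

# Each game is (opponent, won?). P and Q both finish with 2 wins.
p_games = [("R", False), ("S", True), ("U", True)]  # P lost to R, beat S and U
q_games = [("U", False), ("S", True), ("T", True)]  # Q lost to U, beat S and T

print(sos(p_games, final_scores), sodos(p_games, final_scores))  # 6 3
print(sos(q_games, final_scores), sodos(q_games, final_scores))  # 3 1
```

Here P's loss to the 3-win player R still adds 3 to P's SOS but nothing to P's SODOS, which is the sense in which SOS looks at who was played and SODOS at who was beaten.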
kokiri: Sorry, yes, played. I'm still not convinced that, although intuitively it seems reasonable, this isn't a big red herring. Who would you rather play, a 5 dan who's lost 5 games vs 5 dans, or a 1 kyu who's beaten 5 1 kyus? SOS would rate the latter as a harder opponent, no?
IanDavis: For SOS to be used as a tiebreaker it must show who has played on average the more successful opponents. If A and B tie and A has played the players who finished 12th, 4th and 13th whilst B has played 4th, 5th and 6th, it should show B to have done better. At face value it seems to be doing this to me.
Dieter: There are several simultaneous arguments going on here:
I think the discussion could benefit from clarifying your standpoint and give arguments for these separate questions. Here are mine:
RobertJasiek: Dieter, thank you for structuring some of the issues a bit! Which is more appropriate? This has to be answered afresh for each basic tournament system, purpose of tiebreaker usage, and aims to be achieved by the tiebreaker. Does SOS achieve what it claims? No. I have given some reasons above and on rec.games.go, when calculating in general the parts of SOS that reflect a) earlier achievements of one's opponents versus b) later achievements of one's opponents after having played them. Strength versus achievement? Unfortunately, during recent years "strength" has often been used where "achievement" was meant. My phrase "tournament strength" was also only a tiny bit better (by discriminating against "long-term, general playing strength" at all); "achievement" is to be preferred. OTOH, "achievement" is a bit of a common word and could easily be confused with its meaning in ordinary usage.
IanDavis: At the last French Championship they compared SOS, SODOS and Performance Rating, they found correlation. This is then some evidence that SOS is a good tiebreaker.
further discussion deleted--Ian was asked for the (approximate) value of the correlation or more details or references, but didn't have the information available at the time.