Direct Comparison (iterative or otherwise) does not mix with MacMahon or Swiss scoring.
This is because the Direct Comparison breaks ties by emphasizing arbitrary data points, which are known to be likely measurement errors in these systems.
In Swiss and MacMahon every player is assigned a score, which is accumulated by winning. If players A and B are tied for first place, then from their identical score it immediately follows that if A has won against B, then A has lost against a player with a lower score.
This means that some kind of loop has occurred: for the sake of simplicity, let's assume that B has won against this other player, call him C. (If there is a tie and tied players have played each other, there will always be some kind of a loop involved, though it can be very indirect or even an abstract one. For disproving the tie breaker's compatibility, we only need one incompatible case though.)
So, we have the situation A>B, B>C and C>A. How is this a measurement error? Well, according to these tournament systems, every player's measured skill is representable by their score. Now, this score is a number, so A>B>C>A must contain a measurement error, because numbers do not go into such loops. (Strictly speaking, you could find another interpretation where A>B>C>A would not be a measurement error: you only have to believe that such loops can exist. But then you should not use the Swiss or MacMahon (or heaven forbid, the knock-out) systems, because they measure only a tiny fraction of all the possible games. Combining this with not believing that A>B and B>C implies A>C, you would find that the results show nothing but artifacts of the pairing.)
In this situation (A and B tied, A>B, B>C, C>A) the Direct Comparison rewards A>B, and completely ignores B>C and C>A. This is a basic statistical error of counting arbitrary data points twice, which is further aggravated by choosing these data points among those that are known to be involved in a measurement error, as shown above.
A similar argument applies to every situation where the Direct Comparison algorithms would break a tie.
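The loop described above can be made concrete with a small sketch. The four-player result list is a made-up example (player D and all game results are assumptions, not data from any real tournament), and `win_counts` and `loop_through` are hypothetical helper names. The sketch only shows that A and B end up tied on wins while the game A>B sits inside the loop A>B>C>A; it does not implement any official Direct Comparison algorithm.

```python
from collections import defaultdict, deque

def win_counts(results):
    """Count wins per player from a list of (winner, loser) pairs."""
    wins = defaultdict(int)
    for winner, loser in results:
        wins[winner] += 1
        wins[loser] += 0  # make sure players without wins are listed too
    return dict(wins)

def loop_through(results, winner, loser):
    """Find a shortest loop winner > loser > ... > winner, or None."""
    beats = defaultdict(list)
    for w, l in results:
        beats[w].append(l)
    # Breadth-first search from the loser back to the winner along "beat" edges.
    parent = {loser: None}
    queue = deque([loser])
    while queue:
        current = queue.popleft()
        if current == winner:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return [winner] + path[::-1]
        for beaten in beats[current]:
            if beaten not in parent:
                parent[beaten] = current
                queue.append(beaten)
    return None

# Hypothetical results: A and B finish tied on two wins each.
RESULTS = [("A", "B"), ("A", "D"), ("C", "A"),
           ("B", "C"), ("B", "D"), ("D", "C")]

scores = win_counts(RESULTS)
loop = loop_through(RESULTS, "A", "B")
print(scores)  # A and B both have 2 wins, C and D have 1
print(loop)    # the loop that contains the game A>B
```

In this example Direct Comparison would place A above B on the strength of the single game A>B, while the games B>C and C>A from the same loop play no part in the decision.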
Bass, 2009-12-30
This is a convincingly correct and well reasoned argument, entirely unlike the passage Recommended Usage of Tiebreakers contained in the EGF Rules.
What you call the "EGF Rules" does not exist under that name. There are the EGF rulesets "EGF General Tournament Rules" and "EGF Tournament System Rules". The latter speaks of DC and of recommended tiebreakers. The purpose of that ruleset is not to be a research study report. Therefore it is good that it does not include research etc. Your criticism that a research study is missing from it is improper. Research studies should be made outside, e.g., here on SL.
Yes I wrote EGF Rules instead of EGF Tournament System Rules. However the passage I mention includes something that looks very much like research. I am glad that you agree that this should not be there.
Sigh. I do not agree that what is in it should not be in it. I say that the ruleset is not a research study. BTW, have you noticed that that ruleset, which you criticise so much, also says the following?
"This general order of priority is recommended:
1. Playing more rounds.
2. Playing playoff rounds, possibly with short thinking times.
3. Having equal players.
4. Using tiebreakers."
I.e., the ruleset acknowledges that tiebreaking is something undesirable better avoided.
Yes, I read the page. I agree that it is better to have more rounds. I do not agree with the original and erroneous research contained in the rules.
I have now asked for this paragraph to be removed.
Whom?
The rules contain implications of earlier research / study but they do not contain the research itself.
The A>B>C>A loops are known for winning statistics, ratings, SOS, direct comparison (as you describe now) and probably for quite some other tiebreakers, too. One can also see them in a first criterion of various tournament systems like Swiss, McMahon, KO, Round-Robin. If one wants to avoid strength / performance loops, then one has to stop all go players' playing...! IOW, it is not a problem of a particular tiebreaker, not even a problem of only tiebreakers - rather it is a problem occurring in any competitive game or sport.
You are right that certain data points are measured twice while other data points are ignored. You are wrong that the selection of those data points was arbitrary; on the contrary, the selection is well-defined to include exactly the mutually tied players' games.
The measurement of occurring A>B>C>A loops is not an error but reality: Those loops do exist!
What is or is not a basic statistical error depends on how one defines "error".
That numbers of an ordered set of numbers are used as possible values for tournament placement criteria does not require us to overinterpret any such criterion (DC, as in your discussion example) as if we were allowed to imply non-existence of strength / performance loops. That different tiebreakers order players differently also reminds us not to overinterpret tiebreakers. (E.g., ROS versus IROS or SODOS versus SOLOS create different orders.) All directly or indirectly win-related tiebreakers like SOS or Direct Comparison rely on the assumption that a player's won games are more important than his lost games. These are some of the reasons why I recommend not using tiebreakers over using them (here I do not include more rounds among the tiebreakers).
Why do you call Direct Comparison inconsistent with Swiss or McMahon? That it has the same loop ignorance as Number of Wins Score or McMahon Score lets it be consistent rather than inconsistent. (Consistency does not make the loop aspect adorable though.)
We can consider tournaments measurements of wins despite the loop aspect and other aspects (like different players' different stamina under different thinking times, round start times, room temperatures etc.) we'd rather avoid if we could.
The measurement of occurring A>B>C>A loops is not an error but reality: Those loops do exist!
..
Why do you call Direct Comparison inconsistent with Swiss or McMahon?
Measured loops most certainly do exist. However, the crux of the matter is whether you believe loops to actually exist in whatever you are measuring. If you do, then you should never use a Swiss type system to decide a winner for a tournament: If you believe that A>B combined with B>C says nothing about A>C, then you should measure a much greater percentage of possible pairings than these systems do. Using Swiss would be much like using a (badly randomized) Monte Carlo algorithm and then only running it for five rounds or so.
If, on the other hand, you do believe that maybe A>B and B>C do make A>C a little more plausible, then you can use the Swiss type systems, or even the Knock Out. But then you should not use Direct Comparison as a tie breaker, because that always involves a loop, which goes against the required assumption. (Also, one wonders why the A>B part of the loop should be statistically more important than the B>C>A part, but luckily that is irrelevant here.)
I am not saying that Direct Comparison is a bad tie breaker, or that the MacMahon is a bad tournament system. What I am saying is that you should never use the two together.
Bass, 2009-12-30
Assumptions are indeed one of the weak points of tournament system theory. I postulated that several years ago but very little has changed. Much study remains to be done. Currently I (or anybody) could not tell you a reasonably complete set of assumptions under which Swiss/McMahon + Direct Comparison would be "good"/"bad" for the loop problem.
I am not sure if I understand your comment. Do you mean that no reasonable set of assumptions is known so that the Swiss/MM + DC combo would make sense?
What I have been trying to say is that such a set of assumptions is extremely unlikely to exist at all, which is the problem with recommending DC as a tie breaker to use in all kinds of tournaments, as the EGF does.
-Bass, 2009-12-30
For no tournament system (with or without tiebreakers) have the assumptions been stated reasonably clearly and completely yet.
Robert, please just remove the DC from the general EGF recommendations.
-Bass, 2009-12-31
The other, less recommended tiebreakers are: SOS-2, SOS-1, SOS, Rating, Previous Order, Lottery. DC is better than any of them. So the recommendation is good. Of course, DC in itself is not perfect. No tiebreaker is. But the recommendation is a relative one in comparison to the other tiebreakers.
We know the (dis)advantages of DC and of SOS-like tiebreakers pretty well, although each of us weighs the different characteristics differently (to the extent that not everybody agrees on some people's consideration of a (dis)advantage).
You have convinced me that your opinion differs but you have not convinced me to change my opinion. And vice versa:)
Bass has shown quite clearly that DC is mathematically inconsistent with McMahon/Swiss, whereas SOS is mathematically consistent with McMahon/Swiss.
You, on the other hand, have failed to provide any argument at all for the statement "DC is better than any of them".
Please don't let your personal opinion overrule what makes sense from a tournament theory perspective.
Sorry, but actually I've not seen any argument of Bass regarding SOS being compatible with McMahon in this thread.
And I wonder about all this fuss about arbitrary tie breakers if the single most important reason which makes using DC in McMahon and Swiss tournaments a little inconsistent isn't mentioned at all. That is, that you never can be sure that it will be applicable (all necessary games played), so you have to prepare a second backup tie breaker anyway.
Dear Tapir, try looking at Bass explains SOS
I try, but where?
Theory Behind SOS
The page Theory Behind SOS 1) pretends to argue like a mathematical proof but does not give one (e.g., averages or expectations are missing and characteristics of the global player field are confused with those of a single, particular player), 2) considers some theory but overlooks much other theory.
I don't suppose you have any references to that overlooked theory?
You would not listen to me when I argued for SOS. So I created the theory behind sos page, which explicitly lists all required assumptions for using SOS.
You said that generic evaluation of tie breakers was very difficult. I wrote a paper on how to do it.
Here I showed that a yes/no question exists, so that in either case using McMahon with DC is an error. This, it seems, did not convince you.
Instead you demand that I should list all assumptions required for using any tournament system whatsoever, or you keep holding the EGF recommendations hostage.
You should realize that there is a distinct danger of the EGF losing credibility as a provider of sensible rules if you keep doing that.
-Bass, 2010-01-01
Reply see further below.
Bass has not shown that DC is mathematically inconsistent with McMahon/Swiss. He has not even defined what "mathematically inconsistent" is for a pair of tournament system and tiebreaker. He has not given a mathematical proof. What he has given are some preliminary ideas of possibly how to prove something else: That globally for the whole player field of a tournament SOS in some sense (still to be defined) adds information on average of over a great (or infinite?) test series of tournaments.
DC is better than lottery / SOS-2 / SOS-1 / SOS because DC considers a player's achievements while lottery / SOS-2 / SOS-1 / SOS does not. The purpose of a tournament is to compare the achievements of the players themselves - not to compare the achievements of third persons.
DC is better than Previous Order because DC considers the much more recent data.
DC is better than Rating (here: rating just before the tournament start) especially because of various weaknesses of (here: EGF) ratings (discussed elsewhere).
As I have said earlier and we see here again, Robert is not interested in discussing tiebreakers, he merely repeats the same lies in parrot fashion.
isd, do you understand the difference between "lie" and "contribution to a discussion"? Please do not call contributors to a discussion liars and do not substitute personal attacks for contributions to the discussion!
I mean lie in the technical sense
Thank you for the explanation. In a technical sense, it is not called "you lie" though but "I think that your opinion is wrong / you have made a mistake / you err". It would help if you stated exactly where you think I might have erred and why because obviously I lack time to explain each statement carefully in detail.
The hours I currently spend per day are already too much time. You do not seem to realise just how much time careful and detailed answers consume.
My statement "What he has given are some preliminary ideas of possibly how to prove something else: That globally for the whole player field of a tournament SOS in some sense (still to be defined) adds information on average of over a great (or infinite?) test series of tournaments." refers to the Theory of SOS page.
This statement:
DC is better than lottery / SOS-2 / SOS-1 / SOS because DC considers a player's achievements while lottery / SOS-2 / SOS-1 / SOS does not. The purpose of a tournament is to compare the achievements of the players themselves - not to compare the achievements of third persons.
and this statement you made further down:
(with SOS)
- Earlier Wins are Rewarded More than Later Wins. I have postulated it but not proven formally yet. I have observed it for McMahon/Swiss tournaments. The aspect should be either proven or refuted. It should not be simply overlooked though.
are incompatible, they cannot both be true. Which one is false?
Being too busy these days, I have no time to decrypt your question. Please explain why incompatible!
If SOS does not consider a player's achievements, how can it reward earlier rounds more than later rounds? Wins are a player's achievements, and SOS rewards them, hence it considers them.
IMO it would be valid to say that SOS considers not only the player's achievements, but also those of others, but IMO it is not valid to say that SOS does not consider a player's achievements.
It is the Number of Wins Score / McMahon Score that rewards a player for his wins. SOS does not reward a player for his wins but for his opponents' wins, by definition. The reward for a player's opponents' wins is more likely to be greater if the player's wins are early. The relation player - timing of his wins - SOS size is indirect: The timing of his achievements (wins) influences the likelihood of his SOS size, although SOS does not reward his wins directly.
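The definition referred to above can be written down as a minimal sketch. It assumes a plain Swiss tournament in which a player's score is simply the number of wins (no McMahon starting scores, no byes or draws); the result list and the name `scores_and_sos` are hypothetical.

```python
from collections import defaultdict

def scores_and_sos(results):
    """Return (scores, sos) for a list of (winner, loser) pairs.

    Score is the number of wins; SOS is the sum of the opponents' scores.
    """
    wins = defaultdict(int)
    opponents = defaultdict(list)
    for winner, loser in results:
        wins[winner] += 1
        wins[loser] += 0  # register players who have not won yet
        opponents[winner].append(loser)
        opponents[loser].append(winner)
    # A player's own wins never enter his own SOS term directly;
    # only the final scores of the players he met do.
    sos = {p: sum(wins[o] for o in opponents[p]) for p in wins}
    return dict(wins), sos

# Hypothetical two-round example.
RESULTS = [("A", "B"), ("C", "D"),   # round 1
           ("A", "C"), ("D", "B")]   # round 2

scores, sos = scores_and_sos(RESULTS)
print(scores)
print(sos)
```

Here C and D are tied on one win each, and SOS separates them only through the scores of the players they happened to meet. Whether that counts as considering a player's own achievements is exactly the point in dispute; the sketch only shows what enters the sum.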
No, for as Herman says above, this is incorrect.
If you want to convince, then state reasons explicitly.
There is no point in trying to convince somebody who does not want to accept basic facts and outputs obviously false information in defense of his argument, often aggravating this by issuing the same falsehoods twice and then even dismissing statistical theory by claiming divine right for direct comparison.
Meta-discussion does not provide reasons relevant for the topic.
I'm not sure this is the best way for me to start contributing to Sensei's Library, as it is a very disputed topic. I'm involved in it, as I have written the implementation of Direct Confrontation in OpenGotha.
For me, a go tournament is not a statistical study. It is a competition. Tie breakers are there to produce a ranking, establishing winners list, to give titles and awards.
There have been several cases where usual tie breakers have produced a ranking that seems weird to participating players. European Go Championship 2007, French Go Championship in 2006 and in 2007. I'm sure there are many more of them.
In those cases, using Face to Face Result seems to be a better tie breaker, because everybody has seen that player A has played against player B and won. If they have the same MMS, players expect A to be ranked before B. This is a competition after all.
Face to Face Result alone has the drawback that it can be applied only when two players are tied. That's why we need a generalization of this tie breaker.
Direct Comparison as defined by EGF is such a generalization. It still has the drawback to be rarely applicable when more than 2 players are tied.
Direct Confrontation is another way to generalize Face to Face result. Its result is quite straightforward: it sorts ex-aequo players in such a way that a player is above the ones he has defeated, and uses lower priority tie breakers if several such sorts are possible. Of course the implementation is a bit complex, especially the way to handle cycles. But it is available in OpenGotha, and freely reusable in other software if they want it.
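The sorting idea described above could be sketched as follows. This is not OpenGotha's code: the function names, the example data and in particular the treatment of cycles are assumptions made for illustration. Here, players who beat each other in a loop are treated as still tied among themselves and are separated only by the lower priority tie breaker (`fallback`, higher is better).

```python
def reachable(beats, start):
    """All players that can be reached from start by following wins."""
    seen, stack = set(), [start]
    while stack:
        player = stack.pop()
        for beaten in beats.get(player, ()):
            if beaten not in seen:
                seen.add(beaten)
                stack.append(beaten)
    return seen

def direct_confrontation_order(tied, results, fallback):
    """Order tied players so that a player is above those he defeated.

    Assumption: inside a cycle the games cancel out and only the
    fallback tie breaker decides.
    """
    group = set(tied)
    beats = {}
    for winner, loser in results:
        if winner in group and loser in group:  # only games among tied players
            beats.setdefault(winner, set()).add(loser)
    reach = {p: reachable(beats, p) for p in group}

    def dominates(x, y):
        # x is strictly above y: x leads to y by wins, but not the other way round.
        return y in reach[x] and x not in reach[y]

    remaining, order = set(group), []
    while remaining:
        candidates = [p for p in remaining
                      if not any(dominates(q, p) for q in remaining)]
        # Best fallback value first; the name breaks exact ties deterministically.
        best = min(candidates, key=lambda p: (-fallback[p], p))
        order.append(best)
        remaining.remove(best)
    return order

# Hypothetical example: A>B>C>A is a cycle, and A also beat D.
RESULTS = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")]
FALLBACK = {"A": 10, "B": 12, "C": 11, "D": 13}  # e.g. SOS values
print(direct_confrontation_order(["A", "B", "C", "D"], RESULTS, FALLBACK))
```

In the example D has the best fallback value but still ends up last, because A defeated D and A, B and C form a loop that is resolved by the fallback alone.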
If you prefer to scientifically rank players according to statistics, do it plainly and wholly. I suggest using a Bayesian Elo Rating with all the tournament's games, and use it as ranking. See Rémi Coulom's work http://remi.coulom.free.fr/ Seriously, this may give a better result than our tinker things.
Is there anything wrong with breaking ties arbitrarily? Are there non-arbitrary tie breakers?
cheers tapir
If you wanted to, you could call MMS an arbitrary tiebreaker. It has an associated error attached to it - just as SOS (which is made up from MMS) does. As the number of rounds progresses, it seems that one player is stronger than another, but if we look (into the future or the past of the tournament) this may not be the case.
The idea behind a tiebreaker is important: some try to measure how well the player performed during the tournament, some look at how hard their opposition was, some just look for something that can be used to provide a distinct winner. Nigiri is an arbitrary tiebreaker. I don't think SOS is.
Bass, here is a selection of theory about SOS that you overlook / ignore and a webpage http://home.snafu.de/jasiek/SOSqual.html :
There are many aspects of theory. Most of them you have overlooked / ignored so far. The much greater difficulty, though, will be to relate all the aspects in a common theory with common assumptions made for tournaments (or: one particular tournament).
It is not so that I would not listen when you argue in favour of SOS. I am particularly aware of incomplete arguments then though because I do not want incomplete / false arguments to be misunderstood as fake justifications.
The Theory behind SOS page does not list all required assumptions. By far not. I have not seen any paper or article that would have come close. See the theory aspects above overlooked / ignored by you and imagine some assumptions for every aspect. Then you can get a rough idea that many assumptions are missing.
I will read your PDF later. If it is the old paper though, then it did not hold its promises.
Maybe I find time later to exhibit the most important gaps in your study "Here I showed that a yes/no question exists, so that in either case using McMahon with DC is an error.". To start with, define "inconsistent" aka "mixed" and define with which meaning you use ">".
Before we understand all tournament systems for the purpose of SOS or DC theory, we may start with an understanding of one particular system. I do not expect you to solve the former before solving a particular case of the latter.
The EGF had a time when SOS was the first recommended tiebreaker (even before board points in a team tournament...) while essentially no theory existed. Now the EGF has a time when DC is recommended before SOS while some first theory exists. If you call the latter "holding the EGF recommendations hostage", then why did you never call the former "holding the EGF recommendations hostage" despite entirely missing theory at that time?
The danger is not that the EGF would lose credibility as a provider of sensible rules when the EGF recommends DC at all and over SOS. The danger would be that the EGF would lose credibility as a provider of sensible rules when the EGF would recommend SOS over DC due to highly incomplete arguments.
So you are sticking with "I will not change my mind."
Very well. Since you are the Rules Commission, you are allowed to do so, and why not let's just accept that; nothing is stopping me from ignoring your recommendations and advising others to do so too.
Just one thing for you to think about: Are you quite certain that your volunteer work is having a net positive effect on the world of go?
-Bass, 2010-01-01
You are keeping your opinion, I am keeping my opinion, so if you complain about me keeping my opinion, you should also complain about yourself for keeping your opinion. IOW, keeping or changing some opinion is not a purpose in itself. Provide reasons that convince and I might change my mind. Do not provide reasons and chances of changing my mind are smaller.
Your last question: yes, very much.
Chess has had published theory about SOS for a long time.
Please provide references! I have not read the chess-related stuff yet.
You stated there was no theory. Why, if you have done no research?
Because I have not looked for SOS at chess sites - only for Buchholz.
That comment, like many others you have made, makes no sense at all. Oh well. Never mind, I give up.
The sense is that chess players do not speak of SOS but they call the sort of tiebreaker Buchholz or with similar names or they use variants. And the first comment I see is: "The major criticism of this system is that tie-break scores can be distorted by the set of opponents that each player plays (especially in early rounds)." So one of the variants is the Median-Buchholz System, in which the best and worst scores of a player's opponents are discarded, and the remaining scores summed.
Then I find "theory" (Citation: "The theory is to award the person who played the stronger opponents.", i.e., a wish is called theory.) that ends by hiding the most important definition in a footnote: It starts with Modified Median, which the footnote defines as "The lowest scoring opponent is disregarded only for ties between players with more wins than losses. For players tied with more losses than wins, the highest scoring opponent is disregarded. For players tied with an even number of wins and losses, both the highest and lowest scoring opponents are dropped."
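The variants quoted above can be compared in a short sketch. It follows only the wording quoted in this thread (Median-Buchholz drops the best and the worst opponent score; Modified Median drops according to the player's own win/loss balance); actual federation rules may differ in detail, and the function names and numbers are made up for illustration.

```python
def buchholz(opponent_scores):
    """Plain sum of opponents' scores (SOS)."""
    return sum(opponent_scores)

def median_buchholz(opponent_scores):
    """Discard the best and the worst opponent score, sum the rest."""
    ordered = sorted(opponent_scores)
    return sum(ordered[1:-1])

def modified_median(opponent_scores, wins, losses):
    """Modified Median as described in the quoted footnote."""
    ordered = sorted(opponent_scores)
    if wins > losses:
        kept = ordered[1:]      # drop the lowest scoring opponent
    elif wins < losses:
        kept = ordered[:-1]     # drop the highest scoring opponent
    else:
        kept = ordered[1:-1]    # drop both the highest and the lowest
    return sum(kept)

# Hypothetical opponent scores of one player after six rounds.
OPPONENTS = [4, 3, 3, 2, 1, 0]
print(buchholz(OPPONENTS))               # 13
print(median_buchholz(OPPONENTS))        # 9
print(modified_median(OPPONENTS, 4, 2))  # 13
print(modified_median(OPPONENTS, 2, 4))  # 9
print(modified_median(OPPONENTS, 3, 3))  # 9
```

The same list of opponents yields different values depending only on which scores are discarded, which illustrates why the choice of variant is itself an assumption that needs justification.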
It is not easy to find real theory on chess tiebreakers.
When googling for "sum of opponents' scores" tiebreaker Chess theory, Here I found "It went to the arcane Buchholz tiebreak system (sum of opponents' scores), where Russia took the medal by a single tiebreak point.".