DirectComparison/Discussion

Sub-page of DirectComparison

Warning: Heated go discussion ahead!

RobertJasiek: Since an edit war is going on, there is apparently a need to discuss each suggested (dis)advantage before it can be stated on the parent page.

Advantages and disadvantages as of 2008-02-26

Advantages of Direct Comparison

  • Based solely upon the players' performance. The pairings as determined by the tournament director does not impact this tie breaker.
  • If only two players are tied, it is a visible and simple tiebreaker to use
  • It is easy to calculate manually because it works with few, small numbers.
  • It can be considered a shortcut for a continued tournament in that only the still uneliminated players are still considered.
  • It uses only direct information like comparing a game of two players A and B instead of indirect information like comparing longer chains of games between players A, B, C, A. Therefore it concentrates on the more significant information.
  • Its numerical precision is also its significance: The smallest possible difference is meaningful.
  • It does not contain any noise of useless information.

Disadvantages of Direct Comparison

  • It is only applicable for small tied groups, where all players have played all others within the group.
  • It does not necessarily take all a player's played games into consideration.
  • It puts greater weight on some games.
  • It is less easily visible in case of ties between more than 2 players.

Advantages and disadvantages as of 2008-02-27

Advantages of Direct Comparison

  • Based solely upon the performance of the tied players. The pairings as assigned by the tournament director do not impact this tie breaker.
  • If only two players are tied, it is a visible and simple tiebreaker to use

Disadvantages of Direct Comparison

  • It is only applicable for small tied groups, where all players have played all others within the group.
  • It does not consider performance over the whole tournament.
  • It puts greater weight on some games.
  • It is less easily visible in case of ties between more than 2 players.

RobertJasiek: It has been pointed out that each (dis)advantage deserves justification of its own and that furthermore one could also compare them to those of other tiebreakers. Note, however, that a (dis)advantage can exist even if it is not also compared to other tiebreakers yet. Besides this is not the core comparison page.


Discussion of: Based solely upon the players' performance. The pairings as determined by the tournament director does not impact this tie breaker.

RobertJasiek:

  1. This addresses two aspects that should be discussed separately.
  2. "Based solely upon the players' performance." The performance here is the performance during the tournament. If the aim is to measure only this, then it is an advantage that only this is measured. If the aim were to measure also performance outside the tournament, then it would be a disadvantage. The very purpose of a tournament is to measure and compare performance just during it because otherwise we are speaking about different things like, e.g., "publication of a rating order on a particular date". Since the aim agrees with the intention of having a tournament, it is an advantage indeed that only the players' performance is measured.

Bass: The direct encounter does not measure the player's performance. The MMS already measured the player's performance and used up DC's data point when doing so. DC measures the effect of multiplying an arbitrary data sample. To say "it is based on the players' performance" is misleading, when it is actually based on an elementary statistical mistake.

RobertJasiek:

  1. Direct comparison measures part of the player's performance: The part performed against those opponents tied with him.
  2. Measuring that part of the player's performance after the first criterion MMS has already measured all the player's performance puts the weight 2 on (you call that: multiplies) that part of the player's performance.
  3. "arbitrary data sample": Quite the contrary, the part chosen for the weight 2 is not arbitrary but well-defined as those rounds in which the player played against the opponents tied with him.
  4. Which is the "elementary statistical mistake"? The choice of the aforementioned rounds with weight 2? Any choice of any rounds with a weight different from 1 (that would invalidate any tiebreaker, right)?
  5. It is open to discussion whether the basic nature of direct comparison, to put weight 2 on those rounds of a player in which he met his tied opponents, is an advantage or a disadvantage or neither. But where is the "elementary statistical mistake"? Making such a basic choice for or against a tiebreaker is not a statistical mistake but the choice and setting of an aim.

Bass: See (much) below for an explanation why a McMahon tournament's aim can never be chosen so that the DC becomes a meaningful tie breaker.

Bass: Also, "setting an aim" was not mentioned in the text that I removed from the page, and neither is it in the EGF's text, which is, no doubt, written by you. That text claims that the DC can be used as a shortcut to a rematch. This is an elementary statistics mistake. If you fail to see anything wrong in using the result of one coin toss of "heads" to decide that the result of 2 coin tosses must clearly be "heads twice", I do not think we can continue this discussion.

RobertJasiek:

  1. (I will reflect your statements much below later.)

After having read all your notes carefully, I could not find one explaining your statement "an explanation why a McMahon tournament's aim can never be chosen so that the DC becomes a meaningful tie breaker.". Please explain carefully! In particular, why do you think that the subtournament view is inappropriate?

Bass: For situations where the number of rounds is much smaller than the number of players, McMahon requires the assumption that if A beats B and B beats C, then A can be assumed to beat C. This is how the Swiss type systems can justify leaving some games unplayed. Therefore, any measured ring in a McMahon tournament must always be a measurement error. Assuming a sane and informed TD, there is no way he would assign more importance to a known measurement error than to the other, non-questionable results.

RobertJasiek:

  1. Why should MacMahon require this ABC assumption? If this assumption is used, then we would want to imply A>B, B>C, and A>C. Nevertheless, the actual game results could become A>B, B>C, and C>A. This shows that the assumption was badly chosen: one of its implications contradicts reality. (Likewise, the same behaviour can occur for ratings.)

Bass: Swiss also requires that results can be chained. Given N players and k rounds: If you do not use that assumption, then you only have the data about the played games, which is k*N/2 edges of the overall who-wins-who graph. As the whole who-wins-who graph has N*(N-1)/2 edges, the ratio of measured data to the whole shebang is k/(N-1), which becomes very small in big tournaments. Without the chaining assumption the results given by the swiss system should be discarded as representing an insignificantly small portion of the data.
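
(A minimal sketch of the counting argument above, in Python. The function name and the sample tournament sizes are hypothetical, and the sketch assumes an even number of players, no byes, and no repeated pairings, so that exactly k*N/2 distinct games are played.)

```python
def pairing_coverage(num_players: int, num_rounds: int) -> float:
    """Fraction of all possible pairings that are actually played.

    Assumes an even number of players, no byes and no repeated pairings.
    """
    games_played = num_rounds * num_players // 2               # k*N/2
    possible_pairings = num_players * (num_players - 1) // 2   # N*(N-1)/2
    return games_played / possible_pairings                    # equals k/(N-1)


# Illustrative sizes only: a full round robin, then larger fields.
for n, k in [(8, 7), (32, 5), (200, 5), (600, 10)]:
    print(f"N={n:4d}, k={k:2d}: coverage = {pairing_coverage(n, k):.3f}")
```

For a full round robin (k = N-1) the ratio is 1; with 200 players and 5 rounds it is 5/199, i.e. about 2.5% of the pairings.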

RobertJasiek: It is unfortunate that the ratio is small but the purpose of a Swiss type tournament is not to fill all edges. It is well known that with many participants the data approach insignificance in the global ratio sense.

Bass: Very well then. Just to be silly, I'll amend my argument thusly: If the McMahon system is used by a sane, knowledgeable tournament director who does believe that his system is measuring something, anything at all, then the DC tie breaker is not to be used.

Bass: In reality though, the swiss type systems do make the chainability assumption, and even the further assumption that anybody with a better score is likely to win against anybody with a lower score. A TD accepting these approximations can gain a whole lot of information from a few well-chosen data samples, which is why these tournament systems are in popular use.

RobertJasiek:

  1. Do you have evidence for Swiss go tournaments making those assumptions? In practice, Swiss is most often used without reflecting on assumptions at all. It is a different question whether one should make such or similar assumptions. I have already refuted the chainability assumption.

Bass: Here's the evidence: the swiss system makes statements about the relative order of players who haven't faced each other and bases these arguments on comparing single scalar number. You cannot make single scalar numbers go into a loop no matter how hard you try. This means chaining is assumed. You should not be ashamed of chaining, everybody assumes it. It is even everything the knockout system is all about.

Here we see Bass arguing that a single scalar number to represent a player's strength is assumed by the Swiss system and the SOS that he advocates. Please compare with Bass's comments on TieBreaker/TestingTiebreakers, where Bass argues against using such a scalar: "This is a biased test: it assumes there is a scalar quantity that describes a player's winning probability...".

Bass: Thank you for anonymously commending my ability to self-critique, but there is a distinction between "making an assumption in a tournament system" and "making an assumption for the purpose of measuring tournament systems". Any testing system that makes an assumption that would guarantee that the results are in accordance with my opinions would of course produce results that are agreeable to me. Such results might even convince somebody else. They would contain the flaw of not necessarily being related to the properties of the real world. If you have trouble believing this, you might want to compare with Bertrand's paradox, which is a simple example where the result of mathematical estimation reflects mostly the properties of the estimation method itself.

  1. "anybody with a better score is likely to win against anybody with a lower score": I do not think that this assumption is made or necessary. Instead something related can be considered assumptions: "It is fair to place players with more wins so far higher in the current tournament table. It is fair to pair players with so far the same number of wins against each other (as far as possible)."
  2. (Further discussion of Swiss should be discussed elsewhere, e.g., on a discussion subpage of a Swiss system page.)
  1. If the assumption could be used and it (together with all its implications) were always true, then it would serve as a justification for Swiss as you indicate. However, since it is not always true, this is not a valid justification for Swiss type systems. It is very tempting but only wishful thinking.
  2. Without going into every detail, the rest of your logic chain about MacMahon and statistics is pretty well constructed. However, since I do not agree with the initial assumption, I also do not need to call actual instances of game results not fitting the assumption "measurement errors". For me, that sounds like saying to C: "You were not allowed to win against A!" Absurd. It is not that player who is wrong but your initial assumption.
  3. Besides I do not consider a MacMahon tournament a statistics in itself because statistics are data about (here numerically represented) facts. (One might clone the facts to create a statistical model data set, but that is another thing.)

Bass: The subtournament view (use only the results of the tied players) is a way to do exactly that: ignore any good data and look for meaning in the known measurement error.

Bass: The correct way to deal with measurement errors is to measure again. If this is not possible, one should in order of precedence, 1) declare unresolvable, or 2) try to estimate the additional data by a statistically valid means. Cherry picking data points (not to mention, choosing the most suspect data) to be your estimate is not statistically valid.

Bass: The previous arguments can also be turned around: Anyone wanting to use DC as a tie breaker is making a serious mistake when choosing the McMahon system, because the McMahon system makes assumptions that cause the results of DC to become meaningless or worse.

Bass: Or in yet other words, the aim for the MMS/DC combo is "Measure first by MMS, then break ties by going against the justification for MMS". This is an internally inconsistent aim, and such aims need not be considered when evaluating tie breakers. (Any tournament system and tie breaker combination can be justified by saying "maybe that is exactly what the TD wants". I would rather concentrate on evaluating things within "what a knowledgeable TD might want".)

RobertJasiek: Which aim of the MacMahon system is it that you then see used contrarily for DC? How so?

  1. I have asked for a study of aims for years and still not found time myself to create a reasonably useful list of aims. Aims should come before (dis)advantages. That we are "already" discussing the latter is a bit out of order, but cannot be helped. We just have to live with the fact that research on aims hardly exists yet and that this may lead to more confusion than we would like.

Bass: I have done a study of aims of tournament systems because I could not think of another method to meaningfully evaluate tie breakers, but because it takes at least a month to get any single fact to stick on this wiki (there is always the helpful expert to revert my edits, hi Robert, hi velobici), it will probably take at least a year or two before we get anywhere near done.

RobertJasiek: Post such aims on your webpage (URL here) or on rec.games.go and I will read it.

  1. The EGF Tournament System Rules do not study aims systematically, either. The reason for that is mainly just to get a shorter rules text. Importance of aims has not been a reason for the missing coverage there.
  2. Possibly the EGF text's claim that DC was a shortcut to a rematch is not the best justification because it is born out of opinion (which is more widely spread than within the EGF Rules Commission though) much more than systematic study. Maybe we will come to the conclusion that the tournament system rather than the tiebreaker is a shortcut to rematch / knockout? Maybe not.
  3. Something is not an "elementary statistics mistake" because you repeat that many times. You do not explain this thoroughly yet. I am not even sure whether you criticize the weighting or see a different kind of problem here.

Bass: Hopefully my explanation above helps. If it did not, I'll try to figure out another way of explaining it.

  1. I understand your coin experiment and that one must not count one experiment as if it were two experiments. However, the weighting that comes from using its information for the second time is different from such abuse of "experiments": It is a new usage of (the same) information by viewing it in a different context. The first context is: Look for the highest MMS players. The second context is: Compare the then tied players as a group among each other; compare this subtournament; find out how well they have played against each other.

Bass: Why you should not choose that context is, hopefully, explained in my earlier comment.

  1. Now you may be against all tiebreakers because they view the game results of all or some players in more than one context. (A truly orthogonal and therefore not conflicting context is used by Coin Tossing or Player Age whereas DC, SOS, etc. would be prohibited.) But if you allow views from more than one context, then DC, SOS, etc. become eligible at least in principle.

Bass: In the McMahon system, the aim of MMS can be assumed to exist. Any tie breaker making no additional assumptions and having identical aim can be used without changing the context. On the Theory behind SOS page I have listed all the assumptions that are needed to use SOS in a manner that has identical aim with MMS.

  1. The aspect should be considered from another perspective as well though: "Based solely upon the players' performance." can mean different things: Measuring each individual player's performance or all participating players' performance in their mutual context. The stated aspect is too imprecise to distinguish between these two. So actually we do not need to distinguish that for the purpose of this aspect. We just have to be aware that it is rather unspecific. E.g., SOS shares the same advantage although it thereby collects also noise. To discuss such and why noise is not contained in direct comparison, one needs more precise aspects than "Based solely upon the players' performance.".

Bass: This point seemed rather content-free.

RobertJasiek: Please use reasons.

  1. "The pairings as determined by the tournament director does not impact this tie breaker.": The tiebreaker does not reflect that the TD has made the pairings, how he has made them, or in which order pairings are assigned to rounds, because the values calculated by the tiebreaker are invariant under these aspects. (With one exception: Player A may have player B as his Angstgegner (bogey opponent); so if the pairing pairs A against B at all, it is A's bad luck; A would have had an easier game if paired against C instead.) Since the purpose of a tournament is to measure the players' performance and is not to measure the TD's performance (although this may be praised, criticised, or evaluated regardless), we have the aspect as an advantage indeed.

Bass: Checking whether the pairing was fair is what any sane tournament director should do in the case of a tied result. DC is not impacted by it, and neither is DC impacted by doing anything else a tournament director would ever want a tie breaker to do. (such as "my tournament should measure something" and "I want my tournament results to approximate those of a round-robin tournament" or "in case of a tie, I want to reward the player most likely to win a rematch")

RobertJasiek:

  1. A TD should never start checking the fairness of pairings after the tournament because that could lead to biased manipulations of the final player order. Instead a TD should evaluate the fairness of pairings before the start of the tournament and, if necessary, during the tournament. (In hand-pairing, it is an art to make fair pairings for the last ca. 2 rounds.)
  2. DC is not (really) affected by a TD's wishes after the tournament start. However, the choice to use DC for the tournament at all before it starts is closely related to general aims (which might agree with the TD's wishes).

Discussion of: If only two players are tied, it is a visible and simple tiebreaker to use

RobertJasiek:

  1. It is an advantage that something is visible because then it can be accessed more easily.

Bass: Rock-Paper-Scissors is more visible and less based on a misconception.

RobertJasiek: I do not think that it is helpful to use many different terms for the same thing: Can we please continue to use Coin Tossing instead of Rock-Paper-Scissors? - I agree with your opinion here. I just think that we should keep discussion of comparison with other tiebreakers separate from discussion of DC itself. Then we have a greater chance to come to final conclusions because discussion has a clearer structure.

  1. It is an advantage that something is simple because then it can be understood more easily.

Bass: This is the crux of the matter. DC is easy to misunderstand. Because you refuse to comment on any referenced articles, here it is again: use of DC is based on a statistical misunderstanding. There are tournaments where Rock-Paper-Scissors gives more reliable results than DC.

RobertJasiek:

  1. Understanding DC is rather easy (if we ignore details about (non-)iterative for the moment). What is harder to understand about DC is matters of its quality.
  2. I do not refuse to comment on any referenced article, but I cannot take the time to comment on all of them and do not have time to verify constantly whether their contents might have changed. Also, it is not clear which parts of their contents you are referring to. Assuming all their contents makes discussion impossible because I would then have to do nothing else. It is simply much easier to see arguments stated here. We can see them directly and know exactly which ones you want to refer to.
  3. "statistical misunderstanding": See above.
  4. In which sense does Coin give more reliable results than DC?
  5. In which (types of) tournaments does Coin give more reliable results than DC?
  1. It is correct that the case of only two being tied lets direct comparison be visible because one sees the information as the game result between the two players.

Bass: It is exactly the case of two players being tied where it is difficult to see that there necessarily is some kind of a loop in the data if two players are tied with a similar SOS. In case of three players, the loop can sometimes be seen among the tied players themselves, which makes it easier to spot the elementary statistical mistake.

  1. It is correct that the case of only two being tied lets direct comparison be simple because it suffices to compare only the players in question, only two such players, and only their game result.

Bass: No, it bloody well does not suffice! If you want to measure the fairness of a coin by tossing it a thousand times, you do not toss it once, say "this will suffice" and multiply the result by a thousand! I've already explained this in 2 separate places on this wiki, but since you argue above that linking to those places is not good enough, here you go again. If two players, called A and B, are tied for the first place, and A has beaten B in their mutual encounter, then because they have the same MMS, A has lost to somebody else. Since A and B are at the top of the tournament, this means A has lost to a player weaker than B. Therefore, the tie should be broken in favor of B, who has only lost to a player at the top score. (And no, you should not use this one as your tie breaker either.)

RobertJasiek:

  1. As for weight on DC, see farther above.
  2. As a general note, which BTW is one of the basic criticisms of SODOS and - less so - SOS, there is no a priori reason to consider only either of a player's won games or lost games more important than the other of these two.
  3. Unfortunately, on the contrary, your argument implicitly relies on the arbitrary assumption that a player's lost games shall be more important than his won games.
  4. One can make a different but similar general note: To wonder why either of a player's games against the opponents tied with him or his games against the remaining opponents shall be more important than the other of these two. For the purpose of evaluating only one player's performance, this is an arbitrary choice indeed. However, the reason used for DC is a different one: It is a group of players that is considered together and among which each player's performance within it shall be compared. DC evaluates the subtournament of only the mutually tied players at the top within the tournament field of all its participants. The group of players, its choice, and the performance of its players against each other is not arbitrary but reflects exactly those players being tied for the competition of the top place(s).
  1. To a lesser degree, the advantage occurs also for ties between more players. We may say though that for exactly two tied players the tiebreaker is the most visible and the simplest.

Bass: What is clearly visible is the bit about the tiebreaker that makes it less than trivial to see the tie breaker's flaws. If you can convince the players to believe sorting by their age is somehow an acceptable tie breaker, then by all means, use that. However, not a single informed go player will ever accept that silliness, and neither will they accept the argument of DC being anything like a substitute for a rematch.

RobertJasiek: I do not participate in sociological speculation here.

IanDavis: What I meant by this point was that it was easy to see (visible) who won the tournament. You don't have to use your brain to add up a lot of SOS SOSOS etc. For only two players it is very simple to perform. For over two players it is difficult to perform because the EGF definition doesn't make sense to even a native speaker of the English language. In this sense, as an arbitrary rule to produce a winner, it is excellent, and can be used again and again without accidentally getting wrong who won the tournament on tiebreak.

Discussion of: It is easy to calculate manually because it works with few, small numbers.

RobertJasiek:

  1. It is correct that the involved numbers are small: If the tiebreaker cannot be applied, the value is 0. If N players are tied, then the maximal possible value is N-1. In particular in comparison to other tiebreakers (like, e.g., SOS in MacMahon calculated relatively to the top bar) the numbers for direct comparison are relatively small or very small. In practice, often not more than two or three players are tied so that the values of direct comparison are only 1 or 2 - very small numbers indeed.
    - Herman Hiddema: Actually, I would say that the number is (N^2-N)/2, because each of the N players played against the other N-1 (if they did not, the tie breaker is inapplicable). For SOS, the number is N*R (R=number of rounds). As R must be at least N-1 (otherwise, there were not enough rounds for the N players to play all the others), N*R will always be bigger, with Direct Comparison approaching N*R/2 as N grows toward R.
  2. It is correct that only few numbers need to be calculated: Only for tied players. For them only such related to their mutual games.
  3. Common sense justifies the experience that manual calculation with few and small numbers is easy.
  4. This is an advantage because players and TDs prefer few and small over many and big numbers, prefer the availability of manual calculation so that one does not depend on whether a PC is available and working, and prefer the availability of easy manual calculation because it allows an easy verification whether a PC has done the calculations correctly. In particular, the players do not need to trust the TDs blindly but can easily verify correctness by their own manual double checking.
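
(A minimal sketch of the calculation discussed in this list, in Python. The function name and the data layout, a dict from an unordered pair of players to the winner, are hypothetical. The sketch assumes no drawn games and uses the non-iterative form only. Following point 1, every value is 0 when the tiebreaker cannot be applied.)

```python
from itertools import combinations


def direct_comparison(tied, results):
    """Wins of each tied player against the other tied players.

    `results` maps frozenset({a, b}) to the winner of the game a vs b.
    If any two tied players have not met, the tiebreaker is treated as
    inapplicable and every value is 0.
    """
    scores = {player: 0 for player in tied}
    for a, b in combinations(tied, 2):       # N*(N-1)/2 mutual games
        winner = results.get(frozenset((a, b)))
        if winner is None:                   # a pair has not played
            return {player: 0 for player in tied}
        scores[winner] += 1
    return scores


# Two tied players: the single mutual game decides.
print(direct_comparison(["A", "B"], {frozenset(("A", "B")): "A"}))

# Three tied players in a ring A>B, B>C, C>A: everybody scores 1.
ring = {
    frozenset(("A", "B")): "A",
    frozenset(("B", "C")): "B",
    frozenset(("A", "C")): "C",
}
print(direct_comparison(["A", "B", "C"], ring))
```

Each value lies between 0 and N-1, and the loop visits the (N^2-N)/2 mutual games counted by Herman Hiddema above. In the ring example the tie remains unbroken.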

IanDavis: This repeats exactly an earlier advantage. Therefore it was removed from the page.

Discussion of: It can be considered a shortcut for a continued tournament in that only the still uneliminated players are still considered.

Bass: This is a misconception based on an elementary statistics error. Ergo: not true.

Velobici: Can you explain this? Showing the math would do the trick. English is a very poor way to convey mathematical material.

RobertJasiek:

  1. Whether "shortcut for a continued tournament" is an advantage depends on the aims of the tournament. If the aim is to have a tournament system that approaches knockout (but because of time constraints cannot be knockout) or the aim is to produce winners at the top of the final players field, then it is an advantage because it is indeed a shortcut for a continued tournament, because it does indeed still consider only the still uneliminated players, and it does compare them. If the aim of the tournament is to evaluate the results of all participants, then it is a disadvantage because direct comparison does not fulfil that aim.

Bass: Swiss and McMahon systems aim to be the shortcut to a round-robin tournament. Knockout need not be approached by a tie breaker, it can be approached by setting the McMahon bar. If you want to cook up a tournament organiser who is sophisticated enough to choose tie breakers, but not smart enough to choose a proper tournament system or its parameters according to his wants, then this argument might have some merit.

  1. The discussed aspect is whether direct comparison "can be considered" - not whether it always is. As discussed above, it depends on the set aims. Since it is possible to set the aims for the tournament fittingly, the weak form of general advantage ("can be considered") is given.

Bass: More to the point: if it ever is, it is so by pure chance. As discussed above, the use of DC requires a childlike belief in a statistical error, or the incompetence of the organiser using it.

RobertJasiek: Please use reasons instead.

  1. Whether the aspect is an advantage for a particular tournament requires a look on the actually used aims for it.

Bass: This statement seems to refer to the childlike claim that DC can be used as a shortcut to a continued tournament, which I will gladly prove wrong for as many times as it is necessary to stop this kind of stupidities being spread as the official opinions of EGF.

RobertJasiek: Please use reasons instead.

Bass: It is possible that a tournament director firmly opposes the assumption that "A > B and B > C means A > C". If this is the case, then the director would do well to use round-robin with the DC tie breaker: the direction of any loop is significant because it reflects something the tournament organiser thinks to exist and wants to measure.

Bass: If the tournament director does not have that belief (this is always the case in McMahon and Swiss tournaments, because the systems heavily depend on that assumption) then a measured loop is always to be considered a measurement error. Basing a tie breaker on broken statistics applied to a known measurement error may indeed produce meaningful results. However, this will never be because of technical merit, but by pure chance alone.

RobertJasiek: Please explain better: "If the tournament director does not have that belief (this is always the case in McMahon and Swiss tournaments, because the systems heavily depend on that assumption) then a measured loop is always to be considered a measurement error."

IanDavis: I agree with Bass. What you are saying in this point is that DirectComparison can be considered a Rematch. It is quite clearly not a rematch. As a non-arbitrary rule, and a tool for predicting who will win if there are more rounds, it is deeply suspicious. This really is elementary mathematics, and we have already elaborated too much upon this. Any tiebreaker can be considered a shortcut for a continued tournament, that is a tautology. To hold an advantage it must be good at the prediction, and there is no evidence of that.

RobertJasiek: Your last sentence is a good argument, thanks.

Discussion of: numerical precision is also its significance

  • Its numerical precision is also its significance: The smallest possible difference is meaningful.
  • It does not contain any noise of useless information.

IanDavis: DirectComparison is not a number, and has not been defined as such by anyone. 1 result is not a reliable sample, so claiming it is free of noise is not meaningful. I cannot see how these two advantages can be taken seriously.

RobertJasiek:

  1. I have not meant to use the expression "It does not contain any noise of useless information." as another advantage but as saying more informally in other words what the significance means. So we may as well ignore your 1 result comment here. (You may bring this example forward for one of the disadvantages.)
  2. Whether DC is expressed by numbers is a matter of preferred annotation. It can be expressed by numbers. Therefore it is also possible to speak about numerical precision.
  3. The aspect should be a feature of every tiebreaker. That many tiebreakers do not have it (often not even closely) lets DC be (much) better than those with respect to this aspect.
  4. The aspect is an advantage already simply because a tiebreaker is better off having than not having it.
  5. For those not knowing what significance in its mathematical sense is: If you measure the distance of a nearby village as roughly 3 km, then adding the city wall's thickness of 20 cm to that is meaningless because it is far below the significance of the sum of the two numbers, which is about ±1 km (the order of magnitude of a typical error you make in your estimate).

IanDavis: Now you have clarified what you mean, but as advantages they are still total rubbish unfortunately. Robert, you have forgotten to think about the variation possible. DirectComparison comes from one result, one game (normally). You complain that SOS et al is insignificant. How much variation can we expect from DirectComparison? Remember that standard deviation from a sample of 1 is infinite.
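(Illustration of the "sample of 1" remark, as a minimal sketch rather than anyone's stated method: the sample standard deviation divides by n-1, so for a single game it is undefined. The helper name `sample_spread` and the two-game input are assumptions made for this example only.)

```python
import statistics

def sample_spread(results):
    """Sample standard deviation of game results, or None if it is undefined."""
    try:
        return statistics.stdev(results)
    except statistics.StatisticsError:
        # Raised for fewer than two data points: the n-1 divisor would be zero.
        return None

one_game = sample_spread([1])      # a single direct-comparison game
two_games = sample_spread([1, 0])  # hypothetical: two games between the same players
```

With one game there is no spread to report at all; only from two games onward does the function return a number.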

RobertJasiek:

  1. If you do not want to call it an advantage, we may also call it neutral for DC (because no extra harm is done) and a disadvantage for SOS, SOSOS, etc. because there noise and imprecision exist prominently.
  2. If standard deviation is inapplicable, then why use standard deviation? The concept of significance does not have to rely on standard deviation. E.g., one can also define significance via the maximal possible noise error. (In the village example, we have the significance 1 km (maximum) instead of +-1 km (standard deviation)). The maximal noise error for DC is 0.

IanDavis: No, for DirectComparison the maximal error is 1, meaning that, if you talk about significance in the mathematical sense, 1 and 0 cannot be safely distinguished (significance, standard deviation, standard error, call it what you will). So this is trivially wrong as an advantage.

RobertJasiek: Please demonstrate in theory or practice why there can be an error of size 1 and why it is an error at all!

IanDavis: I will give you the benefit of the doubt for a second and imagine that your question is genuine. So you wrote "The aspect should be a feature of every tiebreaker. That many tiebreakers do not have it (often not even closely) lets DC be (much) better than those with respect to this aspect." I presume by that you mean that when a player has a SOS of 53 you wouldn't expect that to actually be 53.4 or 52.985 for instance, much like we wouldn't expect DC to be 0.9 or 1.1. You are, in short, basically referring to Standard Error: the expected result can be off by such and such, and with more data we could get a nice reliable value (etc). So then, you are talking about predictions: how reliable is this in predicting who the winner would be? Now here I repeat myself if I continue further.

RobertJasiek:

  1. In principle, there are a (tiebreaker) value T, its error difference D between what T should be and what it is, and various tools of statistics (among which Standard Error is one). In your example, T=53. If T has an error and should be something else, then that something else can be only a natural number or a natural number plus 1/2. For SOS, other rational numbers cannot be assumed because SOS is calculated from game results, 1 for a win, 1/2 for a jigo, 0 for a loss (plus some shift for the score group, which is a natural number at the tournament start). T with error cannot assume the values 53.4 or 52.985 as you suggest in your example.
  2. Let us assume that for T with error we have T=55 (while without error it should be 53). Then the error is 55 - 53 = 2.
  3. Now let us change to direct comparison. Assume that the player has won the considered game, so for him we have T=1. Although we use a different tiebreaker, this one also relies on 1 for a win, 1/2 for a jigo, 0 for a loss. The values 0.9 or 1.1 cannot be assumed by T. Now let us try to construct an error. The only mistakes that are available, for the sake of the argument, are T=1/2 or T=0. Explain to us why and how these two values could be assumed for the player's mentioned game that he has won. This is not possible since direct comparison is well-defined: Iff a player wins, then for that he gets T=1. Errors do not make it through this definition. You might want to invent a different sort of error like the SOS-like pairing luck in early rounds or the SOS-like mirror at the top or bottom of the playing field. Invent such an error and demonstrate how it finds its way into the T=1 value for direct comparison! E.g., do earth rays let the player win iff he played against player B instead of player C in round 1? Of course not, because it is the player himself that is responsible for his win (and not the earth rays related to round 1).

Herman Hiddema: Sorry, but this argument does not hold. Under (1) you define error as: "difference D between what T should be and what it is". Under (2) you give an example for SOS: "T with error we have T=55 (while without error it should be 53)". Now the question becomes: how do we know what T should be? There are IMO two possibilities:

  1. T should be what it should be according to the mathematical definition of the tie breaker in question. In which case SOS should be the Sum of Opponents' Scores and therefore SOS can never contain error (nor can any other tie breaker).
  2. T should be the value that breaks ties most fairly. In which case for DC (with two players) the value of T can be 0, 0.5 or 1, depending on your definition of "breaks the tie most fairly". (With 3 players, DC can be any of 0, 0.5, 1, 1.5 or 2.)
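(Illustration of the value ranges in point 2, as a minimal sketch. It assumes each tied player has played every other tied player exactly once and that a game scores 1, 1/2 or 0; the function name `possible_dc_values` is made up for this example.)

```python
from itertools import product

def possible_dc_values(group_size):
    """All DC totals one player can reach against the other tied players."""
    per_game = (0.0, 0.5, 1.0)  # loss, jigo, win
    games = group_size - 1      # one game against each other tied player
    return sorted({sum(outcome) for outcome in product(per_game, repeat=games)})

two_player_values = possible_dc_values(2)
three_player_values = possible_dc_values(3)
```

Under these assumptions the two lists reproduce the values named above: three possible values for a two-player tie and five for a three-player tie.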

RobertJasiek: Yes, that is the fundamental approach we are faced with. From "most fairly" we have to derive errors, if there are any in this sense. (This is the difficult task for us.)

IanDavis: So we consider MMS then SOS then DirectComparison. The primary ability measure is MMS, SOS is a secondary ability measure, DirectComparison is another different measure. Imagine a 20 round event. At round 10 Goatlord and Dogbreath are tied with MMS 15, SOS 123 and 129 and DC 1 and 0. DirectComparison says absolutely that Goatlord will go on to win; it is either right or wrong. SOS speculates that Dogbreath has had the harder opponents and will probably go on to win (123:129 ratio end MMS score). The errors are the end results at round 20, aren't they? DirectComparison will either be right or wrong, 1 or 0, maximum error 1, not 0 unless you say DirectComparison will always be right. SOS ultimately also will either be right or wrong, but it has the smaller error. Its maximum error is unlikely to be 129, its own magnitude. It will be the end ratio of MMS. That is why in terms of prediction, which is when we want to use the significance argument, we do not prefer DirectComparison.
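(Illustration of the Goatlord / Dogbreath example, as a minimal sketch: the same two records ranked once with SOS before DC and once with DC before SOS. The dictionary layout and the function name `ranking` are assumptions for this example; the numbers are the ones given above.)

```python
players = [
    {"name": "Goatlord", "mms": 15, "sos": 123, "dc": 1},
    {"name": "Dogbreath", "mms": 15, "sos": 129, "dc": 0},
]

def ranking(players, criteria):
    """Order players by the given criteria, higher values first."""
    key = lambda p: tuple(p[c] for c in criteria)
    return [p["name"] for p in sorted(players, key=key, reverse=True)]

sos_first = ranking(players, ["mms", "sos", "dc"])  # SOS breaks the MMS tie
dc_first = ranking(players, ["mms", "dc", "sos"])   # DC breaks the MMS tie
```

The two orders of criteria put different players on top, which is the disagreement between the two tiebreakers that the example describes.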

RobertJasiek: You have explained what the maximal error can be if there is some error at all and the ratio of a typical error and a typical tiebreaker value. However, you have not shown that some error does exist at all for direct comparison.

IanDavis: Is that supposed to be funny, Robert? If there is an error, it is the maximum error. If there is not an error it is right. DirectComparison only says Dogbreath or Goatlord will win. So DirectComparison will either be completely right or completely wrong with its prediction. SOS doesn't say that in such stark terms, but we as humans reduce it to such stark terms for the purpose of a tiebreaker. That is the difference, try to make your brain understand that please, for the sake of humanity try. If you can't, tough; it's still the case, and everyone else knows it to be. SOS may be slightly wrong or completely right.

RobertJasiek: I have understood you here (and of course I am not being funny.) But have you understood me? If DC has an error, it is terribly big. If DC does not have an error, it is completely right. So much also you say. - But I claim more: DC (applied as the primary and only tiebreaker) never has any error at all. I cannot prove this formally yet because deriving this from our axiom "tiebreakers shall be fair" has not been studied in all possible ways yet. At the same time, nobody has ever demonstrated an example of an existing (numerical) error of DC. I am asking you to show us the first example ever! Otherwise we may as well continue the assumption that by experience DC is without such an error.

IanDavis: Take any tournament with a clear winner who lost a game. Now investigate the DirectComparison of the said game which the tournament winner lost. In this case you will find that DirectComparison was totally wrong about who would win the tournament.

RobertJasiek: Hoehere Gewalt (what is this called in English?) is not within the scope of tournament system study.

IanDavis: Neither is idiocy

Herman Hiddema: OK. Proposed definition of fair tiebreaker:

"If, after X rounds, players are tied, then a fair tiebreaker is the one that is best able to predict the outcome of the tournament if one or more extra rounds were to be played."

This, in my opinion, is a good definition. There is, I think, little doubt that McMahon works better over more rounds. Opinions on this definition?

Sounds entirely fair, the alternative is an arbitrary rule. -IanDavis

RobertJasiek: I want to be more general: Fairness exists before we have chosen a tournament system, before we have chosen tiebreakers, before we have decided whether to use tiebreakers, before we know about specific players, before we have decided whether to evaluate retrospect, prediction, or something else. This makes a definition of fair very difficult, but we can hope that for the purpose of evaluating a specific tiebreaker within a specific tournament system (etc.) we can derive simplifying statements from a yet to be found general fairness definition. It just won't be as simple as you suggest and won't be as, eh, prejudiced (by wanting to make predictions but not a retrospect). At least your suggestion is another motivation for how to think about what to look for as a definition. I have been criticised for asking for too much wrt fairness but ask yourself the general questions: Isn't fairness indeed a very broad and fundamental concept, which is not even limited to only Go, only games, or only events?

Herman Hiddema: I would say that, starting from the real basics, the reasoning goes something like:

  • Problem: We have a group of players, and want to hold a contest to determine the best one.
  • Solution: Play round robin
  • Problem: It's 100 players and we don't have time for 99 rounds!
  • Solution: Play Swiss
  • Problem: We still need 7 rounds like this, and a lot of the early games are utterly predictable ones of the type "6d beats 15k"
  • Solution: Play McMahon
  • Problem: We can now determine a winner in 5 rounds, most of the time, and have done away with a lot of useless matches that are utterly predictable, but sometimes, players end in a tie and we really really need a single winner!
  • Solution: Play more rounds or use a tiebreaker.

This is the line of reasoning that gets me to "tie breakers should strive to have the same result as playing more rounds".

RobertJasiek: Most tiebreakers (like DC or SOS) evaluate what has happened during the tournament. They are not probability variables predicting the future. Do you want to argue that only the latter (like GoR or ML) should be used as tiebreakers?

Herman Hiddema: So you feel that neither SOS nor DC has any predictive value?

RobertJasiek: Not exactly "not any" but rather little.

IanDavis: Entirely sensible. Robert's claim of infallibility for DirectComparison, "But I claim more: DC (applied as the primary and only tiebreaker) never has any error at all.", is not at all sensible. Having just noticed this I now refuse to debate any other point Robert mentions on tiebreakers. It is like playing music to a cow. Thankfully I have made all my points clear on this tiebreaker.

RobertJasiek: Everybody is invited to construct a real error (not just a for fun error of Higher Force).

Herman Hiddema: I think that your argument "I think DC contains no error, I have not proven this, but let's assume it is true" is not very constructive. It is your claim, so it requires you to prove it. Challenging others to disprove it is an unscientific approach. You say "nobody has ever demonstrated an example of an existing (numerical) error of DC". Do you have a definition of what might constitute a (numerical) error in DC? Earlier in the discussion, you agreed that my introduction of the term "fair" is correct, and stated that "From 'most fairly' we have to derive errors, if there are any in this sense." I must assume, therefore, that you have a definition of possible errors in DC that includes this fairness property?

RobertJasiek: I agree that making a claim without proving it is not scientific yet. It is like an unproven proposition. However, empirical evidence is overwhelming: DC does not have any of the significance errors of SOS: unpredictable performance of opponents after having played them, early rounds pairing luck, mirror effect at the ends of the tournament table. That DC does not have these errors applies to all past and future tournaments. Stochastics is a weak science but better than none. - I would like to have a definition but I do not have one. So far I can "only" observe which kinds of numerical errors are known from other tiebreakers and that they do not impact DC.

Herman Hiddema: Well, empirical evidence requires testability. This requires that there is some test that allows you to judge your observations. If you have no such test, then references to overwhelming empirical evidence are meaningless. Also, it is not true that DC is not vulnerable to unpredictable performance of opponents after having played them, as you claim. Simple example: after 4 rounds of play player A has 4 points. He has defeated player B, who has 3 points. Player C also has 3 points, has been defeated by B earlier and has not played A yet. For the final round, A is paired against C, while B is paired against some other player D. Player C defeats A, so they both have 4 points. The outcome of the tournament now hinges on the game B-D. If B wins, there will be a three way tie. If B loses, C will be the winner by DC. So for both A and C, the outcome by DC hinges on unpredictable performance of an opponent they played earlier.
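(Illustration of the A, B, C, D example, as a minimal sketch. It assumes DC is the number of wins scored inside the tied group and ignores jigo; the function name `direct_comparison` and the (winner, loser) encoding are made up for this example.)

```python
def direct_comparison(tied, results):
    """Count each tied player's wins against other members of the tied group."""
    dc = {player: 0 for player in tied}
    for winner, loser in results:
        if winner in dc and loser in dc:
            dc[winner] += 1
    return dc

# Games among A, B and C as described: A beat B, B beat C, C beat A.
games = [("A", "B"), ("B", "C"), ("C", "A")]

# Case 1: B beats D, so A, B and C all finish on 4 points.
three_way = direct_comparison({"A", "B", "C"}, games + [("B", "D")])
# Case 2: B loses to D, so only A and C finish on 4 points.
two_way = direct_comparison({"A", "C"}, games + [("D", "B")])
```

In case 1 every tied player has one win inside the group, so DC breaks nothing; in case 2 only the game C-A counts and C comes out ahead. The game B-D never enters the DC count itself, yet it decides which of the two groups is formed.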

RobertJasiek:

  1. Maybe someone invents tests later.
  2. This is a good example, not because it would show imprecision of DC but because it tells us to define more precisely the moments when we evaluate a tiebreaker.
  3. For the purpose of DC being a tiebreaker in the ordering of the final results list, it matters what the DC values are after the last round. Before that, DC might also be used for other purposes like pairing strategy or the participants' prediction fun of what will be the DC values after the tournament. However, such other purposes have to be studied separately for themselves.
  4. The tiebreaker applies only after the (Swiss or MM) score criterion. The other game B-D decides the score of B and therefore the members of the score group with 4 points. Only once that score group is formed after the last round do we start to apply the DC tiebreaker. A comparison to the score group possibly formed differently does not matter because the score group is not formed differently at the moment of DC application for the purpose of final result ordering.

Herman Hiddema:

  1. So your whole argument then basically comes down to: I think DC has no error. I cannot prove this. I have no evidence for this. Let's assume it is true and challenge others to disprove it. That is scientifically completely unacceptable.
  2. - 4. While it is true that the game B-D does not influence the outcome after we have discarded it for purposes of calculating DC, this is IMO not the issue at hand. The issue at hand is that for the purpose of determining the final ordering of the players, if DC is the primary tiebreaker, then unpredictable performance of opponents after having played them can influence the final ordering. One of the listed advantages of DC is that it depends only on the player's own performance, while for SOS it is listed that others may influence the outcome. If, in the above example, players B and C are friends, B might deliberately throw the game in order to let C take the tournament. As such, DC is not independent of the results of others. This is a very clear weak point of DC. For calculating DC, we discard the game B-D as if its result is irrelevant to the outcome of the tournament.

RobertJasiek:

  1. It is not easy to separate what the assumptions and what the conclusions are. Let me assume for the moment you are right that unpredictable performance of opponents after having played them influences DC. What then is the effect on DC (if unsportsmanlike behaviour is not involved)? Each player gets exactly the DC value he deserves. So there is no numerical error in them for the purpose of evaluating each single player's DC for himself. Instead the impact on DC is the change of nature of the graph of win arrows: Either it is a tree or it is a cycle. I am not sure yet how we should evaluate that kind of difference (if your view should be right).
  2. That DC evaluates only a subset of all a player's games may be considered a disadvantage because less information is considered than available or an advantage because it sometimes allows to break ties in cases that are otherwise not broken (and because the selection of games is meaningful).
  3. Unsportsmanlike behaviour is unfair. Therefore it violates one of the most fundamental aims of a tournament: to be fair to all players. So unsportsmanlike behaviour must be ruled out of further consideration already on the fundamental aims level, i.e., many interpretation levels before reaching the details of tiebreaker application. In this sense of exclusion in processual early time, unsportsmanlike behaviour does not affect the quality of tiebreakers. Of course, in practice every amok run affects tiebreakers in practice but I think we should not worry about higher force of any kind in the tiebreaker discussion.

Herman Hiddema:

  1. "there is no numerical error in them for the purpose of evaluating each single player's DC for himself": the same argument can be made for SOS or any other tiebreaker. This is circular. If you take the definition of a tie breaker as the evaluation criterion for errors, then no tie breaker can contain an error.
  2. This is true, and so as it is unclear whether this is an advantage or a disadvantage, it should not be listed in either list.
  3. Yes, I agree completely that unsportsmanlike behaviour is unfair. Sadly, it is also a reality. Luckily, in my experience, it is very rare. But it has been an issue before. As I understand it, the CUSS tie breaker was introduced because it does not depend on opponent performance at all. So if a tie breaker is not vulnerable to unsportsmanlike behaviour, that is a plus, but it should not be a primary concern in choosing a tie breaker.

RobertJasiek: I have meant to refer to numerical errors related to fairness - not to the plain tiebreaker definition.

Herman Hiddema: Ok. Do you have a good definition of numerical error in relation to fairness then?

RobertJasiek:

  1. Error can be defined as you would expect it but I have not defined "fair" yet. So I cannot show the relation to fair.
  2. My first idea for defining "fairness" is to let it be a number N (as big as the number of players) and distribute it equally to all players. IOW, each player should get the portion 1 of all available fairness. This can then be compared to what he gets in reality. For that, we need to define the aspects to be observed like "at the start of the tournament, equal chances (right) for one's potential score improvement". If there should be x such aspects, then for each of them each player can get 1/x of the fairness if everything is fair for him or less if it is not.

Discussion of: Based solely upon the players' performance

  • Based solely upon the players' performance. The pairings as determined by the tournament director does not impact this tie breaker.
  • It can be considered a shortcut for a continued tournament in that only the still uneliminated players are still considered.
  • It uses only direct information like comparing a game of two players A and B instead of indirect information like comparing longer chains of games between players A, B, C, A. Therefore it concentrates on the more significant information.

IanDavis: We note that these reasons are basically the same thing, repeating the same reason twice doesn't make it twice as reasonable. There might be some irony there. Saying direct results are more significant is completely out of nowhere. Each game in a tournament normally carries the same weighting. I don't believe that claim at all.

RobertJasiek:

  1. In the process of assessing advantages, we should indeed identify double entries, if any.
  2. The three aspects grouped together by you here are not the same: The first one discusses what information is contained in DC. Information about a player's wins is contained whereas information about how well the TD has paired him is not contained. The second one discusses whether DC represents well if it is considered a shortcut for a continued tournament. The third is about the degree of directness of the contained information. So both the first and the third are about the type of information contained in DC. This makes these two aspects similar (but not the same).

IanDavis: Regardless of whether or not you consider them to be distinct, nobody has yet produced a convincing argument to label them as advantages.

Discussion of: uses only direct information

RobertJasiek: For a player, direct information is that one concerning his own game results and only them. He is responsible for his performance there. He is not responsible and cannot influence the game results of third players against each other. Therefore it is an advantage that DC measures only direct information; it measures only such information of a player for that he is responsible himself. This is an advantage over tiebreakers like SOSOS that measure indirect information.

IanDavis: This can be rewritten more concisely as "An advantage of DirectComparison is that it is DirectComparison", which provides no explanation why. A tournament consists of many games. Why is it an advantage to consider only 1 game to determine the winner on tiebreak?

RobertJasiek:

  1. Look at the final result list of a typical tournament with 100 players. Assume player P1 beats P15, P15 beats P40, P40 beats P75, P75 beats P100. This chain goes from a direct relation to an indirect relation of degree 4 (steps in the chain). The game in which P75 beats P100 says nothing about P1 because its relation to P1 is only very indirect. The game in which P1 beats P15 says something about P1 because it is direct information.
  2. (Since you don't know or pretend not to know: With empty statements like "An advantage of DirectComparison is that it is DirectComparison" you do not convince anybody.)
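(Illustration of the "degree" of a relation in the chain above, as a minimal sketch: the number of win steps needed to get from one player to another. The function name `relation_degree` and the breadth-first search are assumptions for this example, not part of any tiebreaker definition.)

```python
from collections import deque

def relation_degree(results, start, target):
    """Fewest win steps leading from start to target, or None if unrelated."""
    beaten = {}
    for winner, loser in results:
        beaten.setdefault(winner, []).append(loser)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        player, steps = queue.popleft()
        if player == target:
            return steps
        for opponent in beaten.get(player, []):
            if opponent not in seen:
                seen.add(opponent)
                queue.append((opponent, steps + 1))
    return None

chain = [("P1", "P15"), ("P15", "P40"), ("P40", "P75"), ("P75", "P100")]
direct = relation_degree(chain, "P1", "P15")     # the game DC would look at
indirect = relation_degree(chain, "P1", "P100")  # end of the chain
```

Direct comparison only ever uses relations of degree 1; the P1 to P100 relation in this chain has degree 4.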

IanDavis: I'm glad you agree with me that as an advantage, saying that DirectComparison is DirectComparison is not convincing. You aren't convincing either. Why say less information is more meaningful? You still fail to produce the argument. Maybe there is none.

Discussion of: does not contain any noise of useless information.

RobertJasiek: Numerically, this is the complement of significance. So as advantage "low noise" is a double entry of "high significance". We may be interested in which types of noise exist (like early rounds impact or pairing effects), but such study is not (so much) about degree of advantage.

IanDavis: 1 point is not a data set; it has infinite noise or it is a noise. This was already pointed out again and again. How many times do we have to explain it to you, Robert? How many times?


Other Remarks

RobertJasiek: Can people please stick to facts? It is not helpful to read emotional remarks here or in the summaries of change. In particular, "Wiki-fy", "childish", or "97 errors" are counter-productive language. Use arguments and reasons!

Bass: If you say that merely linking to the said arguments elsewhere is "not very convincing", you should not act too surprised when you do not get the most polite language from the persons having to copy those arguments for your viewing pleasure. Also notice that your wish was obeyed, and the arguments are, indeed, now on this page. This was only because you are in a position to fix the EGF's recommendations. Any other person adding baseless arguments on a page, and then declaring linking to be an unacceptable means of presenting arguments for their removal, would have received a very different treatment.


Bass: This is getting much too tedious for me, so here's my suggestion: I'll spare my nerves and free time by staying off this discussion here, and Robert will reopen the discussion within the rules commission, whose job this actually is. If he can find even one other member in support of the current recommendation's implications about the Direct Comparison tie breaker, I'll come back here and apologize.

Do we have a deal?

RobertJasiek: Now that everybody's first opinion has been stated, time is ripe for a WME of the discussion subpage. I hope that I will find time to do that. The WME can then be reviewed. Afterwards we can WME the parent page.

Study of tiebreakers is tedious indeed because relatively little research about tournament systems, aims, and tiebreakers is available so far. This cannot be helped. I had to make similar experiences of temporarily leaving discussions because of the time required for them. But over the years improvements are being made. Not long ago, nobody would even have known what an aim was or that DC had (non-)iterative variants...


Moved from SOS/Discussion:

  • In McMahon / Swiss, Direct Comparison can be applied only in some cases, but when distinction of the first player is the aim, then exactly 2 tied players is not scarce.
  • Is Coin better than direct comparison? Yes, if being applicable more often is the only criterion of evaluation. No, if the degree of meaning when both tiebreakers are applicable is the only criterion of evaluation. (One should not want to suggest to search for greater meaning in averages of Coin compared to averages of direct comparison, when also weighing its applicability, because such a weight is chosen and interpreted somewhat arbitrarily necessarily. Counting frequencies of applicability is a weight but does not weigh degree of meaning yet, so finding a more appropriate weight becomes guesswork.)
  • Direct comparison has not been shown to be broken; it has just been shown that this specific tiebreaker also does not cancel loops of winning relations like A>B>C>A. No tiebreaker can cancel such loops since they exist. Only within a pure knockout do we get a clean tree relation of wins between the players. (If determination of exactly 1 player is the goal, then a knockout does not need any tiebreaker.)
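(Illustration of the loop remark, as a minimal sketch: a depth-first search that reports whether a set of results contains a loop of winning relations such as A>B>C>A. The function name `has_win_cycle`, the (winner, loser) encoding and the four-player knockout are assumptions for this example.)

```python
def has_win_cycle(results):
    """Return True if the (winner, loser) pairs contain a loop like A>B>C>A."""
    beaten = {}
    for winner, loser in results:
        beaten.setdefault(winner, set()).add(loser)
        beaten.setdefault(loser, set())

    visiting, done = set(), set()

    def visit(player):
        if player in done:
            return False
        if player in visiting:
            return True  # reached a player already on the current path
        visiting.add(player)
        found = any(visit(opponent) for opponent in beaten[player])
        visiting.discard(player)
        done.add(player)
        return found

    return any(visit(player) for player in beaten)

loop = has_win_cycle([("A", "B"), ("B", "C"), ("C", "A")])
# Hypothetical 4-player knockout: two semifinals and a final.
knockout = has_win_cycle([("A", "B"), ("C", "D"), ("A", "C")])
```

The three-player loop is detected, while the knockout results form a tree and contain no loop.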

Discussion of the page on 2009-12-26

RobertJasiek:

  • The current list of advantages and disadvantages is nothing but a joke. It is better omitted.
  • The "note this [EGF] description is very hard to understand." might as well be the opposite, which is my opinion.
  • If somebody comments on Quality of Direct Comparison that this and related pages on the site contained both incorrect and misleading mathematical statements, then he has to explain a) why they were incorrect and b) why they were misleading. Otherwise the somebody is merely creating disinformation. Disinformation is absolutely improper on a wiki.
  • What is the purpose of a "Jasiek about Direct Comparison/Errata" page? We already have this discussion page. It is not me who has to start explanations but isd because he makes the claims. I cannot defend as long as isd does not even state what he criticises.

tapir: Just an invitation to isd to tell about it. Dropped the reference. Please review the current list of advantages and disadvantages.

isd: As far as I am concerned Robert is not interested in rationally discussing tiebreakers. He merely repeats the same nonsense like a parrot. See above or below.

RobertJasiek: Short review of current (dis)advantages:

  • As discussed previously, there are more (dis)advantages that are missing.
  • Even in case of several tied players, DC is easier calculated than, e.g., SOS. Therefore the stated advantage "If the case of two tied players, it is a simple tiebreaker that is easily understood by players." is an absolute one but the additional relative advantage for a more general case of 2 or more players is missing.
  • If a so called advantage is "(Also a disadvantage)", then it is not an advantage and not a disadvantage but a neutral aspect. Currently the neutral aspect section is missing while there are neutral aspects.
  • It is false that "It is only applicable for small tied groups, where all players have played all others within the group.". It is correct that its application gives non-trivial, tie-breaking values only if all players of a group of tied players have played an equal number of games against each other. The false disadvantage must be corrected or removed. A Wiki is not a place for misinformation.
  • "not as simple" as what? Not as simple as in case of exactly two tied players. Still simpler to calculate than, e.g., SOS. Therefore the disadvantage must be made more precise to get some useful meaning at all.
  • It is incorrect that "It does not consider performance over the whole tournament, instead considering only some of the games that have been played." It is correct that "In application for the final placements after most tournaments, it does not consider performance over the whole tournament, instead considering only some of the games that have been played." The incorrect ought to be replaced by the correct.

tapir:

  • Giving a property of the method twice as advantage and disadvantage is equivalent to calling it a neutral aspect. Just a matter of presentation.
  • I don't agree that you can calculate a meaningful "direct comparison" if people didn't play each other. (I doubt the case of everyone having played an equal number of games but not against everyone else is of much relevance in practice. It is impossible with 2 or 3 players. In case of 4 players with 2 games each it may work, but leaving quite a lot of ties behind. 5+ player ties are not very common afaik.)
  • "It is incorrect that "It does not consider performance over the whole tournament, instead considering only some of the games that have been played." It is correct that "In application for the final placements after most tournaments, it does not consider performance over the whole tournament, instead considering only some of the games that have been played." The incorrect ought to be replaced by the correct." That is, basically you agree, but you have the impression that the wording sounds too negative?

RobertJasiek:

  • tapir, I agree that direct comparison is not meaningful if it creates trivial values only. Not being meaningful does not mean though that the trivial values could not be calculated. They can be calculated - even trivially. (It has a purpose to define the trivial values to be 0 instead of "non-existing": That does not suggest that all players would have meaningless values; rather DC can still be meaningfully applied to those mutually tied players who have played the same number of games against each other.)
  • tapir, how is it impossible for 2 or 3 players? 2 players: a 1-round tournament. (Yes, this is not the norm, but it is possible. In fact, I have played in tournament stages consisting of exactly one round.) 3 players: A beats both B and C, B beats C. (I.e., when it is not a 3-way-tie.)
  • About your last point: I disagree because without the additional "In application for the final placements after most tournaments" the remainder is false. Since you do not understand, let me explain: In some (rare) tournaments, DC considers performance over the entire tournament. Example: a 1-round tournament with exactly 2 players, A beats B. Then the direct comparison covers all games of the tournament. (Sure, I know, it is not your ordinary tournament. Such trivial tournaments do exist though, at least as stages of multi-stage tournaments. Since they do exist, it does not apply to all tournaments that "DC does not consider performance over the whole tournament, instead considering only some of the games that have been played.".)

tapir: Misunderstanding!!!

  • Regarding the 2nd point: with two or three tied players, direct comparison is only meaningful if all players have played each other; in that case direct comparison is fine. But two players with no game between them, or three players of whom only two have played each other, are cases where DC is meaningless. For four players with two games each (instead of three) there may be an exception, but 5+ player ties are more or less irrelevant anyway. That is, using direct comparison only if all tied players have played each other is not such a bad proposal.
  • I got your point, but it is rather academic.

RobertJasiek: Generally DC is useful more often in round-robin than in Swiss/McMahon because in round-robin all tied players have always had an equal number of games against each other. In round-robin, a tie of 5+ players is not as rare as you might guess.

tapir: I agree. The discussion above, though, was about ties in McMahon, especially when not all tied players have played each other, because you objected to the inclusion of this condition as a disadvantage. (You wrote: It is false that "It is only applicable for small tied groups, where all players have played all others within the group.". It is correct that its application gives non-trivial, tie-breaking values only if all players of a group of tied players have played an equal number of games against each other. The false disadvantage must be corrected or removed. A Wiki is not a place for misinformation.)
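
The condition argued about in this thread can be illustrated with a minimal sketch. It is not an official definition of DC: the function name direct_comparison, the (winner, loser) result format, and the reading of "an equal number of games against each other" as "every tied player has the same, non-zero number of games inside the tied group" are all assumptions made for illustration. Under that reading, the value is the number of wins against other tied players, and the trivial value 0 is given to everyone when the condition fails.

```python
from collections import Counter


def direct_comparison(tied, results):
    """Return a dict mapping each tied player to a DC value.

    tied:    list of player names that share the same score (hypothetical input)
    results: list of (winner, loser) pairs for the whole tournament

    Assumption: DC is non-trivial only if every tied player has played the
    same, non-zero number of games against other tied players. Otherwise
    every tied player gets the trivial value 0.
    """
    group = set(tied)
    wins = Counter()   # wins against other tied players
    games = Counter()  # games played against other tied players

    for winner, loser in results:
        # Only games between two tied players are "direct" information.
        if winner in group and loser in group:
            wins[winner] += 1
            games[winner] += 1
            games[loser] += 1

    counts = {games[p] for p in tied}
    if len(counts) != 1 or counts == {0}:
        # Unequal (or zero) games inside the group: trivial values only.
        return {p: 0 for p in tied}
    return {p: wins[p] for p in tied}


# A beats B and C, B beats C; D (outside the first group) beats A.
results = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A")]
print(direct_comparison(["A", "B", "C"], results))  # {'A': 2, 'B': 1, 'C': 0}
print(direct_comparison(["A", "B", "D"], results))  # {'A': 0, 'B': 0, 'D': 0}
```

In the second call, A has two games inside the group while B and D have one each, so the sketch falls back to the trivial values, which is the McMahon situation discussed above.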

Luck, Noise etc

isd: It is said that direct comparison (DC) cannot be subject to any error. In terms of predicting strength this is simply untrue. Can it be true in terms of pairings? The claim is presumably made because DC does not look at the pairings or at any other aspect of the tournament. However, can we safely ignore the tournament? The one pairing that is left is still demonstrably subject to what one may call noise, pairing lottery, or luck. Adam faces 9 tough games in a row and is then drawn against Bob, who has had 8 easy games in a row. Adam is physically and mentally tired; Bob is fresh. If only they had been paired sooner, Adam could have won. Likewise, Adam may have been sick on day 1 but feeling fine by day 2. If only they had been paired on day 2. These are two examples of how the structure of the tournament, allied to luck, may have altered the result. Of course other tiebreakers would also be hit by these effects. Anyway, here we have luck in both a statistical and a structural sense.

tapir: Can you please specify who made this claim? Not me. ("It often breaks ties, sometimes arbitrary")

isd: Of course I can. Earlier in the debate Mr R. Jasiek made this claim. His way of expressing it has varied, for instance: "In other words, Direct Comparison values do not contain any noise."


DirectComparison/Discussion last edited by isd on March 17, 2010 - 15:38