Scoring Estimator Considered Harmful/ Discussion

Rakshasa: This page got too cluttered with discussion. I'd suggest it focus on the SE's effects on your Go-related skills, and that we assume the SE correctly values thickness, life and death, and the like. Anyone who wants to discuss what a particular SE gets wrong should make a new page.


(anonymous): Another important addition to any score estimator would be an assessment of how reliable the estimated score is. If the estimator could also give a reliable confidence estimate, that could in some cases be worth more than improving its overall absolute accuracy.
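A minimal sketch of what such a reliability figure could look like. It assumes the estimator can produce several independent score samples for the same position (for example by finishing the game several times); how those samples are obtained is a hypothetical detail not described on this page, and the interval uses a plain normal approximation.

```python
import math
import statistics


def score_with_confidence(samples, z=1.96):
    """Return (mean score, half-width of an approximate 95% interval).

    `samples` is a hypothetical list of estimated final margins for one
    position. A wide interval means the estimate should not be trusted much.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.mean(samples)
    # Standard error of the mean, assuming independent samples.
    stderr = statistics.stdev(samples) / math.sqrt(n)
    return mean, z * stderr


mean, half_width = score_with_confidence([4.5, 5.5, 4.5, 5.5])
print(f"B+{mean:.1f} (+/- {half_width:.1f})")
```

Reporting the half-width next to the score lets the user see at a glance whether "B+5" means a safe lead or a coin flip.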


DougRidgway: Another question: what would make the score estimator more useful? More interactive, would be my vote. Let me fix the group statuses (useful for when it gets them wrong, or for quick what-if scenarios) and fudge the boundaries of implied influence and territory. This stuff requires go knowledge, and it's what humans do well. Let the computer calculate the area of multiple complex irregular shapes to a fraction of a percent, that's what computers do well. The combination could be quite powerful, I think.
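The division of labour described here can be sketched roughly as follows: the human supplies the group statuses, and the computer only does the counting. This is an illustrative sketch using simple area scoring and a flood fill, not the algorithm of any actual score estimator; the board format and the `dead` parameter are assumptions made for the example.

```python
def area_score(board, dead=frozenset()):
    """Area-score a position.

    board: list of equal-length strings made of 'B', 'W' and '.'.
    dead:  set of (row, col) stones the user has marked as dead.
    Returns (black_points, white_points).
    """
    rows, cols = len(board), len(board[0])
    # Remove the stones the human judged dead before counting.
    grid = [['.' if (r, c) in dead else board[r][c] for c in range(cols)]
            for r in range(rows)]
    score = {'B': 0, 'W': 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] in score:
                score[grid[r][c]] += 1          # a living stone is one point
            elif (r, c) not in seen:
                # Flood-fill one empty region and note which colours border it.
                size, borders, stack = 0, set(), [(r, c)]
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            if grid[ny][nx] == '.':
                                if (ny, nx) not in seen:
                                    seen.add((ny, nx))
                                    stack.append((ny, nx))
                            else:
                                borders.add(grid[ny][nx])
                if len(borders) == 1:           # bordered by one colour only
                    score[borders.pop()] += size
    return score['B'], score['W']


board = [".BW..",
         ".BWB.",
         ".BW.."]
print(area_score(board))                  # (7, 3): the right side counts as neutral
print(area_score(board, dead={(1, 3)}))   # (6, 9): what if the lone black stone is dead?
```

Calling the function twice with different `dead` sets is the quick what-if scenario: the go knowledge stays with the player, and the arithmetic stays with the machine.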

aLegendWai: One thing computers do well is **speed**. A computer can estimate the score in one second, so it can make an estimate for every move. Amazing!


MarkD: This discussion leads us to the question: How to write an accurate scoring algorithm? Maybe that's worth a new page to discuss it.

dnerra: It's about as difficult as writing a good Go program. You need to have a life-and-death solver to do it well. And if you can accurately score positions, you can also pretty accurately value moves.

Evand: The only difference between a good score estimator and a good go playing program is the ability to propose moves worth evaluating; writing a metamachine that uses a scoring or playing program and can act as the other is fairly trivial. I have one that I may release at some point if / when I get it cleaned up and if there is interest.

Anonymous: There's a pretty standard argument that shows that writing a score estimator is as difficult as writing a good computer go player. If you had a good computer go player, it could be used to estimate the value of a position by playing the game to completion from the position we're trying to evaluate. Conversely, if you had a good score evaluator, you could use it to severely limit the branching factor when searching the game tree, which enables a deep search and hence a good computer player.

Chess has a relatively straightforward "score estimator", and usually a bad position is rapidly converted into a material advantage.
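Both directions of the equivalence argument above can be sketched in a few lines. To keep the example runnable it uses a toy game that is NOT Go (players alternately take one number from a pile); every function name here is made up for illustration and does not come from any go program.

```python
def estimate_by_playout(pos, choose_move, apply_move, is_over, final_score):
    """Player -> estimator: let the player finish the game, read off the result."""
    while not is_over(pos):
        pos = apply_move(pos, choose_move(pos))
    return final_score(pos)


def candidate_moves(pos, legal_moves, apply_move, estimate, keep=3):
    """Estimator -> player: keep only the moves the estimator likes best,
    so that a deeper search has far fewer branches to follow."""
    side = pos[2]  # +1 if Black (who wants a high score) is to move, -1 for White
    ranked = sorted(legal_moves(pos),
                    key=lambda m: side * estimate(apply_move(pos, m)),
                    reverse=True)
    return ranked[:keep]


# Toy game: position = (remaining numbers, Black's margin so far, side to move).
def legal_moves(pos):
    return list(range(len(pos[0])))


def apply_move(pos, i):
    items, margin, side = pos
    return (items[:i] + items[i + 1:], margin + side * items[i], -side)


def is_over(pos):
    return not pos[0]


def final_score(pos):
    return pos[1]


def greedy_player(pos):
    # Stand-in for "a good player": always take the biggest number.
    return max(legal_moves(pos), key=lambda i: pos[0][i])


def estimate(pos):
    return estimate_by_playout(pos, greedy_player, apply_move, is_over, final_score)


start = ((5, 3, 2), 0, +1)
print(estimate(start))                                                    # 4
print(candidate_moves(start, legal_moves, apply_move, estimate, keep=1))  # [0]
```

In real Go the playout player and the pruning estimator are each as hard to write as the other, which is the point of the argument.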

Evand: Yep, that's almost exactly what my program does. The hard part is getting, in some intelligent fashion, moves other than the best one suggested by the go playing program; that's what I'm currently working on. The hard part of turning a score estimator into a player is trying only reasonable moves when your score estimator is slow and can't suggest moves. Currently I'm using gnugo to play; its top suggestion frequently is better than its second-best move, so the trick is finding other moves worth trying.


aLegendWai: This is just my humble opinion.

Score Estimator is unreliable for opening moves

I agree with crux that the score estimator is a disaster if people use it carelessly (e.g. to judge which move is good in the opening). A good point made by crux!!

However, I would like to point out that judging the value of a stone in the opening is a very high dan level question. Even a pro cannot tell you the exact value of a move, so how could we expect an algorithm to?

A merit of this score estimator is that it shows you which points it counts and for which side. So we can see its mistakes and use our own judgment to adjust the score estimate (in our heads).

[1]

Even if accurate, the highest score is NOT the same as the best move

Beware that Go is a game that rewards looking far ahead. Even if the score is accurately assessed (PS: the value of thickness/influence is counted in the estimate), it doesn't mean we should or must play the highest-scoring move. Some good moves show up as a loss of points. They are played not for territory or influence but for strategic reasons. Strategic value is not counted!

It is true mainly in the endgame. That is my little idea. :D

Mef: This may be a little pedantic, but wouldn't the best move always be the one that earns the most points, since when you count a move you compare final results? In theory, couldn't someone or something with incredible computational power treat the entire game as endgame and miai-count every play by reading through to the end?

aLegendWai: You know, the score estimator **calculates** the score in the present situation; it is not **predicting** the final result. (Notice the emphasis.) Also, some moves often have no instant effect on the score (in terms of territory, influence, etc.) but are useful in a strategic sense. Surely you will gain ultimately, but the gain isn't visible at that moment. So these moves are losses in the short term but gains in the long term. The score estimator doesn't count these kinds of things, right?

Mef: I'm not totally sure the score estimator really calculates much of anything (hence its affectionate nickname, score randomizer). I was speaking of a theoretically accurate score estimating system, and the idea I was referring to is that the value of a move takes into account its effect on the entire game, not just the next couple of moves.

aLegendWai: I just assume the score estimator can calculate the value of territory and influence perfectly. I don't assume it can do other brain work (e.g. assess your strategic value, the value of your sacrifice plan, the value of a probe, etc.). If this is a super score estimator which can take everything valid into account accurately, then you are absolutely right.

So we agree with each other at heart. But we respond based on different assumptions.
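The two assumptions in this exchange can be put side by side with a tiny made-up game tree: one chooser looks only at the score right after Black's move, the other reads to the end as Mef describes. All move names and numbers below are invented for illustration and do not come from a real position.

```python
def leaf(static):
    """A final position: its static score is also its true result."""
    return {"static": static, "children": {}}


def read_out(node, black_to_play):
    """Value of a position when both sides read to the end (plain minimax).
    Scores are from Black's point of view."""
    children = node["children"]
    if not children:
        return node["static"]
    values = [read_out(child, not black_to_play) for child in children.values()]
    return max(values) if black_to_play else min(values)


# Black to play. "static" is what an estimator would show right after the move.
tree = {
    "grab points":    {"static": 8, "children": {"w1": leaf(1), "w2": leaf(6)}},
    "slow but thick": {"static": 0, "children": {"w3": leaf(5), "w4": leaf(7)}},
}

best_by_static = max(tree, key=lambda m: tree[m]["static"])
best_by_reading = max(tree, key=lambda m: read_out(tree[m], black_to_play=False))

print(best_by_static)    # grab points
print(best_by_reading)   # slow but thick
```

With perfect reading the best move is indeed the one that earns the most points in the end; an estimator that only values the present position can still prefer a different move.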

Score Estimator becomes better towards the end

When the game progresses towards the end of the game, its usefulness increases. The score estimator becomes more reliable (assuming the misjudgment of life-and-death status of a group is solved, either by allowing a manual judgment or a better algorithm).

What score estimator can do

I think the best job it can do is to estimate the score in the endgame (or late middle game). If my mind could work that quickly, it would help me a lot with positional judgement and winning games :P Right now, if I wish to count, I need about 2-3 minutes for a fairly comprehensive estimate each time. It would be an okay partner in judging endgame moves.

As for estimating the usefulness of a move (e.g. in the opening or in the middle game), it is rather pointless to rely on the score estimator. First, it is rarely accurate (e.g. the crude estimation of influence value). (E.g.: 4-4 point is safely ...) Second, even if it is accurate, the highest score doesn't always indicate the best move. Click on [1] for details.

If you wish to use it in the opening, you should understand what the algorithm is telling you. The following may not be a perfect analogy, but it should give you an idea of its real meaning.

The program is an employee. It is making a proposal, telling you its plan. It expects that we can get such-and-such areas of territory. But it is up to you to make the plan come true.

You are the manager. It is you who is responsible to:

  • assess the proposal - You have to read through the plan and judge whether it is feasible. Say you realise some areas are barren: you should abandon that part of the plan and lower your expectations.
  • realise the plan - Remember that the territories are only potential; they haven't been cashed in yet. It is up to you to secure them.

Eg:

[Diagram]

4-4 play

The employee plans to get all the territory in the corner. However, players should know about the possibility of a 3-3 invasion. How do we make the plan real? Play a stone to strengthen the corner.

[Diagram]

4-4 play

Now we really secure territories. The plan is realised.

How to improve the score estimator

Instead of relying on the program entirely, it would be better if the user could give it more manual instructions or have more control over it (in the short term at least). Some suggestions:

  • allow assigning the life-and-death status (and seki too) of a group
  • distinguish between potential and real territory
  • allow drawing the influence boundary
  • weakness markers: you can mark weaknesses in your shape, so that the score is adjusted according to the weaknesses you have
  • more preferences and settings to fit your needs
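As a rough illustration of how some of these suggestions might fit together, the sketch below lets the user label each region as real or potential and attach weakness markers, then adjusts the margin accordingly. The `Region` type, the 0.5 weight for potential territory and the one-point penalty per weakness are all hypothetical choices made up for this example, not features of any existing estimator.

```python
from dataclasses import dataclass


@dataclass
class Region:
    owner: str            # "B" or "W"
    points: int           # raw size the estimator assigned to this region
    secure: bool = True   # user's judgment: real (True) or only potential (False)
    weaknesses: int = 0   # number of weakness markers the user placed


def adjusted_margin(regions, potential_weight=0.5, weakness_penalty=1.0):
    """Black's margin after applying the user's manual judgments.

    potential_weight and weakness_penalty are user settings (assumed values).
    """
    margin = 0.0
    for reg in regions:
        value = reg.points if reg.secure else reg.points * potential_weight
        # A region never counts for less than nothing, however weak it is.
        value = max(0.0, value - weakness_penalty * reg.weaknesses)
        margin += value if reg.owner == "B" else -value
    return margin


regions = [
    Region("B", 10),                              # solid corner
    Region("B", 8, secure=False, weaknesses=1),   # open 4-4 corner with a cutting point
    Region("W", 12),
]
print(adjusted_margin(regions))   # 1.0
```

Exposing the two weights as settings is one way to cover the last suggestion: players who trust influence more can simply raise `potential_weight`.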

Mef: This may be beating a dead horse, since it has been said time and time again that no work will be done on the score estimator. I think whoever wrote the score estimator was trying to get something that gave accurate relative scores, as opposed to accurately determining the status of every point at a given instant. That's why things like the 4-4 are claimed as territory when they are not. If White invades the 4-4, Black will still make points; they will just be on the side. So the score estimator is simply accounting for the fact that, odds are, points will be made off of the 4-4.

Dead Group can gain territories

But does anyone notice the obvious problem in this algorithm? Very often when the computer thinks a group is dead, it counts the points gained for the dead stones, yet strangely states that the areas the group surrounds belong to the dead group.

Mef: I think this happens when the score estimator is unsure of the status of the group, so it returns a more or less average result by giving one side the stones and the other side the territory. In my experience this actually makes its counts closer to the real count.

aLegendWai: The situation is that the score estimator marks the group as dead (so I gain points for all the stones of the group I captured). Strangely, the areas it surrounds are not counted for me. They belong to the dead group.

It sometimes occurs. Try it and you will be surprised. :P

Mef: I know exactly what you are referring to, and my comment still stands. As for surprising me with the score estimator, it won't happen, because I have a couple of sgfs where all that was done was to mess the score estimator up. For instance, it says a solid block of 12 black stones in the corner surrounded by White is alive, and I have seen a move made by Black kill the black shimari across the board...

[Diagram]

SE says Black is alive

[Diagram]

More SE Abuse

Here's the next best example I can find offhand. Everything is OK up to move 29: every black stone is alive. But then 30 (W9 in the diagram) kills off every black stone on the board except the upper right shimari. To make matters worse, when Black plays 31 (B10), that kills his own shimari, making every black stone on the board dead according to the score estimator.


You are dead, but you deserved some territories

[Diagram]

An artificial example

aLegendWai: This is a simple example which roughly shows my situation. All the white stones are marked dead, but they still get the [circled point] as their territory. What an amazing estimation :P
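An inconsistency like this one could in principle be caught by a sanity check on the estimator's own output. The sketch below flags territory credited to a colour that has no living stone next to that territory. The input format (a board, a set of dead stones, and a map from empty points to their claimed owner) is assumed for illustration, and the small board is a made-up stand-in, not the diagram above.

```python
def suspicious_territory(board, dead, territory):
    """Return the territory points whose owner has no living stone bordering them.

    board:     list of equal-length strings of 'B', 'W', '.'
    dead:      set of (row, col) stones the estimator marked dead
    territory: dict mapping (row, col) -> 'B' or 'W' as claimed by the estimator
    """
    rows, cols = len(board), len(board[0])
    flagged, seen = set(), set()
    for start, owner in territory.items():
        if start in seen:
            continue
        # Flood-fill the connected block of points claimed by the same owner.
        region, stack, living_border = set(), [start], False
        seen.add(start)
        while stack:
            r, c = stack.pop()
            region.add((r, c))
            for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if not (0 <= n[0] < rows and 0 <= n[1] < cols):
                    continue
                if territory.get(n) == owner:
                    if n not in seen:
                        seen.add(n)
                        stack.append(n)
                elif board[n[0]][n[1]] == owner and n not in dead:
                    living_border = True
        if not living_border:
            flagged |= region   # claimed only by dead stones: cannot be right
    return flagged


board = ["W.WB",
         "WWWB",
         "BBBB"]
all_white = {(0, 0), (0, 2), (1, 0), (1, 1), (1, 2)}
print(suspicious_territory(board, dead=all_white, territory={(0, 1): "W"}))
```

A flagged point does not say which judgment is wrong, the group status or the territory, only that the two cannot both hold.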


This is a copy of the living page "Scoring Estimator Considered Harmful/ Discussion" at Sensei's Library.
(OC) 2007 the Authors, published under the OpenContent License V1.0.