Fluke
Fluke is a Monte Carlo Go playing program written in Java by Tim Foden. Development started around 16th January 2008.
I've been meaning to start writing a Go program for some years now. Having recently noticed the success of the Monte Carlo approach, I realised that it wouldn't take anywhere near as much work to get a program off the ground, and this prompted me to start.
Thanks to everyone for all the tools and services like GoGui, GTP, and CGOS. These enabled me to get an engine that can actually play others after only about 1 week of development in the evenings.
Cheers, Tim.
Version information
080309a-10k 1677 (257 games)
080309e-10k 1668 (257 games)
080309f-10k 1648 (257 games)
080309c-10k 1639 (257 games)
080309b-10k 1634 (257 games)
080309d-10k 1620 (257 games)
- UCT-RAVE now. Varying the Cr constant (the one in the RAVE part of the formula). Cu is sqrt(1/5) for all tests.
- (a) Cr = 0.25
- (b) Cr = 0.5
- (c) Cr = 0.75
- (d) Cr = 1.0 -- this is the default/control.
- (e) Cr = 1.25
- (f) Cr = 1.5
080308b-10k 1679 elo (188 games)
080308e-10k 1672 elo (188 games)
080308a-10k 1650 elo (186 games)
080308d-10k 1647 elo (185 games)
080308c-10k 1640 elo (184 games)
- UCT-RAVE now. Varying the Cu constant (the one in the UCB1 (non-rave) part of the formula). Cr is 1.0 for all tests.
- (a) Cu = sqrt(1/3)
- (b) Cu = sqrt(1/4)
- (c) Cu = sqrt(1/5) -- this is the default/control.
- (d) Cu = sqrt(1/6)
- (e) Cu = sqrt(1/7)
080306-10k 1570 elo (269 games)
080306b-10k 1627 elo (189 games)
080306c-10k 1621 elo (200 games)
- This is back to a non-RAVE version, in an attempt to determine why I've not replicated the performance of the 080225 version with newer code. Could be a bug. Could be slippage of the ratings in CGOS. Could be that I've missed something that should be the same, but which is different. More investigation required. (update) 1570 elo looks fine to me. :)
- (b) ??
- (c) ??
080304-10k 1380 elo (302 games)
080304b-10k 1560 elo (299 games)
080304c-10k 1620 elo (299 games)
080304d-10k 1530 elo (282 games)
080304e-10k 1430 elo (276 games)
080304f-10k 1710 elo (252 games) (best version so far, hurrah!)
- ... finally, a test with somewhat of an improvement!
- The first 3 versions were better, but still didn't seem right to me.
- 304 - Random playouts.
- 304b - Atari-aware playouts.
- 304c - Mogo based playouts.
- Need to look at the source to remember what this version did.
- 304d - ??
- These two versions were pretty good. :) I still feel I'm missing something though.
- 304e - Random playouts.
- 304f - Mogo based playouts.
080303-10k 1499 elo (152 games)
080303b-10k 999 elo (151 games)
080303c-10k 1400 elo (152 games)
080303d-10k 1074 elo (148 games)
080301e-10k 1422 elo (103 games)
- RAVE again. Still not right.
080302-10k 1489 elo (184 games)
080302b-10k 1407 elo (146 games)
080302c-10k 1154 elo (143 games)
- More RAVE attempts. Going nowhere fast.
080301rv-10k 1445 elo (206 games)
080301r2-10k 1291 elo (161 games)
080301r3-10k 1473 elo (204 games)
080301r4-10k 1481 elo (197 games)
- Here is a first (failed) attempt to add RAVE to the UCT search.
080228-10k 1480 elo
080228b-10k 1475 elo
080228c-10k 1500 elo
080228d-10k 1450 elo
- These were all basically the same as the 080227/25 versions, except that they used a faster implementation of the 3x3 pattern matcher (still with patterns based on the Mogo paper). Uct formula: winRate + sqrt(ln(totalPlays) / (x * plays)). [n.b. This pattern matcher includes a 'fix' so that the 'any' pattern doesn't match points off the edge of the board -- this may not be the correct behavior.] (update) None of these versions were as strong as the 080225 version, which implies to me that I may have a bug in the new 3x3 pattern code.
- 080228 - uct: x = 6, playout order: extend, pattern, capture, random.
- 080228b - uct: x = 2, playout order: extend, capture, pattern, random.
- 080228c - uct: x = 5, playout order: extend, capture, pattern, random.
- 080228d - uct: x = 2, playout order: extend, pattern, capture, random.
080227b-10k 1500 elo (164 games)
080227-10k 1540 elo (164 games)
- These were both exactly the same as the 080225 version, except for a change to the selectivity of the UCT formula: winRate + sqrt(ln(totalPlays) / (x * plays)).
- 080227 - uct: x = 5.
- 080227b - uct: x = 4.
080226-10k 1420 elo (236 games)
- Removed the pattern matcher from the playouts. This allow the speed to increase to about 5,000/sec in the opening. It looks like the patterns do add quite a lot to the strength (+130 in this instance).
080225-10k 1550 elo (295 games)
- Changed PASS move integer value from board area (e.g. PASS = 81 on 9x9) to always being 0. This version uses UCT search (x = 2) with a Mogo rules based playout (See paper entitled "Modifyication of UCT with Patterns in Monte-Carlo Go"). The playouts are now very slow, down from about 8,000/sec to about 800/sec in the opening on my 1.66GHz Core2 Duo laptop. This was the strongest 10k UCT searcher so far.
080223-10k 080221-10k 1410 elo (408 games)
- Looks like I didn't check in the source to 02-21, but I do have 02-23. Changes here are to use normal UCT search with Atari-aware playouts.
080220-10k 1310 elo (157 games)
- Added a naive AMAF implementation into the UCT search. If there are n moves that will be considered valid for AMAF at a node, then we update the rewards by adding reward / n, and the plays by adding 1 / n. 'Standard' random playouts were used for this test.
080218-10k 1200 elo (270 games)
- Now uses UCT again, but still with 'standard' random playouts. The UCT move choice uses the adjusted formula introduced for 080209. I'd expect this version to be rated about 1,250 elo. (update) Rating was only 1,200... perhaps a bug in the new UTC implementation?
080216a2-10k 1440 elo
- This version defines n, the number of moves to use in AMAF from the playout, as all moves up to (but not including) the 1st capture. This version only seems about 70 elo weaker than the version below, but on average uses at quite a lot fewer moves from the playouts (estimate about 3/5). Fewer moves, but guaranteed better info.
080216am-10k 1510 elo
- There was a bug in my AMAF... I didn't check that the move from AMAF was valid in the root position, so this would have affected how UCB1 selected moves, as the total number of moves would have been incremented with no other increment in the stats for the valid moves. For this version I've fixed this bug. I'll have to wait and see how this will affect the elo rating. (update) It appears to have made a +120 elo difference.
080214-10k 1000 elo
080214am-10k 1390 elo
- This version of Fluke has been extensively re-factored to sort out problems with my initial 'design'. :) I may well have introduced various bugs whilst doing this.
- Here I have gone back to the basic UCB1 (non-tree) search. The am-10k version is using the all moves as first (AMAF) heuristic. It choose to use the first n moves from the playout, where n = pow(num-playout-moves, 0.75). This is in an effort to compare with myCtest-10k and myCtest-10-AMAF, which gained about 420 elo from adding AMAF. My version only seems to have gained 390 elo... perhaps I have a bug in my AMAF implementation? (update) I now wonder if the non AMAF version has a bug, as I did have a bug in the AMAF version (see 080216a-10k), and 390 elo is not so different from 420 elo.
080209-10k 1440 elo
080209-50k 1660 elo
- These are exactly the same code as the plain 080209 version, except for the code that decides on the number of nodes to search. All versions up to this point searched (num-legal-moves x 1000) nodes. E.g. in the opening they would search 81,000 nodes. The two versions given here instead always search a fixed number of nodes... 10k == 10,000 nodes, 50k == 50,000 nodes. It is interesting to observe that the fixed 50,000 nodes version looks only slightly stronger than the 1000*n variable nodes version.
080209 1640 elo - A substantial improvement.
- Changed non-solid eye classification to try to detect false eyes. Non-solid eyes that have 50% or more corners occupied by enemy stones are regarded as false, and become candidates for moves.
- Changed UCB1 move selection confidence bound from sqrt(2 log(m) / n) to sqrt(log(m) / (2 n)). This makes UCT quite a lot more selective.
- Changed something else I can't remember right now.
080208 1430 elo - Going backwards here!
- Changes here include changes for 080207, as I lost the source for 080207 (overwrote it by mistake, doh!).
- Upgraded the atari aware playout to give priority to any moves which extend or capture a group with 1 liberty. The extension will occur whether or not it will increase the number of liberties of the group.
080207 1460 elo
- Source for 080207 is lost, so specific changes for here cannot be devined.
080129 1440 elo
- Initial stab at a UCT driver. It uses the UCB1 driver exactly as is, but in a tree formation. A 120 elo improvement over the plain UCB1 driver version. This isn't as much of an improvement as I expected... perhaps I have a bug here, or have misunderstood something in the implementation of the agorithm?
080127 1320 elo
- Added a UCB1 driver for 1st move selection (for playouts). Chooses moves based on the standard UCB1 selection criteria. e.g. Chooses move with maximum v given by: double v = m_rewards[move] / m_plays[move] + Math.sqrt(2 * Math.log(m_totalPlays) / m_plays[move]).
- Added an "atari aware" playout mode. Currently this only detects moves which cause self-atari, and if so, makes the next move capture the stone with a 50% chance.
080126 1280 elo
- Changed playout move driver to select moves with probability proportional to the cube of the average win rate.
--- below this line from memory only... I no longer have the source code ---
080125 1270 elo
- Changed eye rule for non-solid eyes. Now only fills non-solid eyes if one of the eye-surrounding groups is in atari.
- Changed playout move driver to select moves with probability proportional to the square of the average win rate. e.g. W = win-rate ^ 2;
080124 1030 elo
- Changed eye rule for non-solid eyes. Now only fills non-solid eyes if none of the eye-surrounding groups has less than 3 liberties.
- Added probabalistic move driver for selecting the 1st move of the next playout. Weight for each move W = 1 + 99 * win-rate.
080123 940 elo
Initial version that can actually play go at all (via GTP). :) 1000 Playouts / valid move on the board (thus on an empty 9x9 board it does 81,000 playouts). On 9x9 does about 9,000 playouts / second on my Core2 Duo 1.66 GHz based laptop (raises to about 18,000 /s near the end of the game).
Playouts are random except for:
- Don't play suicide moves.
- Don't fill eyes. Eyes are defined as solid (all surrounding groups have same group-id) or non-solid. Non-solid eyes are only filled if the total number of liberties L is greater than 2. L = sum(group[n].libs - 1) + 1.
- Don't play in position of simple ko.
Fluke 9x9 data from CGOS.
Name ELO Games Date ---------------------------------------------------------- Fluke-080309a-10k 1677 257 2008-03-11 10:52:23 Fluke-080309e-10k 1668 257 2008-03-11 10:52:23 Fluke-080309f-10k 1648 257 2008-03-11 10:52:23 Fluke-080309c-10k 1639 257 2008-03-11 10:52:23 Fluke-080309b-10k 1634 257 2008-03-11 10:52:23 Fluke-080309d-10k 1620 257 2008-03-11 10:52:23
Fluke-080308b-10k 1679 188 2008-03-09 18:18:39 Fluke-080308e-10k 1672 188 2008-03-09 18:18:39 Fluke-080308a-10k 1650 186 2008-03-09 18:18:39 Fluke-080308d-10k 1647 185 2008-03-09 18:18:39 Fluke-080308c-10k 1640 184 2008-03-09 18:18:39
Fluke-080306b-10k 1627 189 2008-03-08 13:07:05 Fluke-080306c-10k 1621 200 2008-03-09 18:18:39 Fluke-080306-10k 1570 269 2008-03-08 13:07:05
Fluke-080304-10k 1383 302 2008-03-06 19:05:37 Fluke-080304b-10k 1556 299 2008-03-06 19:05:37 Fluke-080304c-10k 1620 299 2008-03-06 19:05:37 Fluke-080304d-10k 1527 282 2008-03-06 18:46:53 Fluke-080304e-10k 1426 276 2008-03-06 19:05:37 Fluke-080304f-10k 1709 252 2008-03-06 22:28:05
Fluke-080303-10k 1499 152 2008-03-04 19:22:18 Fluke-080303b-10k 999 151 2008-03-04 19:22:18 Fluke-080303c-10k 1400 152 2008-03-04 19:22:18 Fluke-080303d-10k 1074 148 2008-03-04 19:22:18 Fluke-080301e-10k 1422 103 2008-03-04 23:59:49
Fluke-080302-10k 1489 184 2008-03-03 19:49:19 Fluke-080302b-10k 1407 146 2008-03-03 19:49:19 Fluke-080302c-10k 1154 143 2008-03-03 19:49:19
Fluke-080301rv-10k 1445 206 2008-03-02 22:03:55 Fluke-080301r2-10k 1291 161 2008-03-02 16:02:17 Fluke-080301r3-10k 1473 204 2008-03-02 22:03:55 Fluke-080301r4-10k 1481 197 2008-03-02 22:03:55
Fluke-080228-10k 1481 261 2008-03-01 12:50:27 Fluke-080228b-10k 1475 243 2008-03-01 12:50:27 Fluke-080228c-10k 1501 242 2008-03-01 12:50:27 Fluke-080228d-10k 1451 176 2008-03-01 10:25:37
Fluke-080227b-10k 1503 164 2008-02-28 23:22:29 Fluke-080227-10k 1543 164 2008-02-28 21:01:29 Fluke-080226-10k 1424 236 2008-02-27 19:26:01 Fluke-080225-10k 1554 295 2008-02-27 19:07:21
Fluke-080221-10k 1409 408 2008-02-24 07:59:40 Fluke-080220-10k 1313 157 2008-02-21 20:42:58 Fluke-080218-10k 1196 270 2008-02-20 22:25:22
Fluke-080216a2-10k 1442 197 2008-02-17 08:26:26 Fluke-080216am-10k* 1473 381 2008-03-05 07:19:33 Fluke-080216am-10k 1514 236 2008-02-17 08:26:26 Fluke-080214am-10k 1393 277 2008-02-16 16:02:46 Fluke-080214-10k 998 416 2008-02-17 08:26:26 Fluke-080209-50k 1658 161 2008-02-12 17:39:54 Fluke-080209-10k 1437 165 2008-02-12 21:02:14 Fluke-080209 1637 455 2008-02-11 18:16:58 Fluke-080208 1429 152 2008-02-08 23:44:48 Fluke-080207 1456 125 2008-02-08 00:33:23 Fluke-080129 1438 160 2008-01-30 20:27:03 Fluke-080127 1318 223 2008-01-28 21:00:37 Fluke-080126 1278 147 2008-01-27 08:26:20 Fluke-080125 1270 118 2008-01-26 10:34:04 Fluke-080124 1032 120 2008-01-25 18:14:21