Forum for File Format

Are there any binary Go file formats? [#2759]

Are there any binary Go file formats? (2012-02-15 07:25) [#9265]

Daniel: I am currently writing a Go engine and teaching tool. One of the teaching techniques I plan to use in the engine is referencing professional games and tsumego problems (as well as Go problems in general), so I have written a basic description of a binary file format that supports branching and commenting, though with what may be considered an odd structure: empty space is left for expansion of the file format.

Sample of the GOB (GO Binary) format header, V0.5:

  Signature (the ASCII characters for "GO"): 2 bytes
  File format version number: 2 bytes
  Compression flag: 1 byte (0 indicates no compression; other numbers indicate other compression schemes)
  Symbol table offset: 2 bytes
  Symbol table size: 2 bytes
  Offset to starting position and board offsets, focus size, etc.: 2 bytes

The symbol table indicates, among other things, board size and various break flags, with a redirect to a comments table, which holds the redirects for move variations.

But is there a pre-existing binary Go format?
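The header fields listed above can be sketched as a fixed 11-byte structure. This is only an illustration, not Daniel's implementation: the post does not state byte order or how the version is encoded, so little-endian unsigned integers and a version value of 5 for V0.5 are assumptions, and the names `GOB_HEADER` and `parse_gob_header` are hypothetical.

```python
import struct

# Assumed layout: little-endian, no padding, fields in the order given in the post.
# 2s = signature, H = version, B = compression flag,
# H = symbol table offset, H = symbol table size, H = start position offset.
GOB_HEADER = struct.Struct("<2sHBHHH")


def parse_gob_header(data):
    """Decode the fixed-size header from the first bytes of a GOB file."""
    if len(data) < GOB_HEADER.size:
        raise ValueError("file too short for a GOB header")
    sig, version, compression, sym_off, sym_size, start_off = GOB_HEADER.unpack_from(data)
    if sig != b"GO":
        raise ValueError("missing GO signature")
    return {
        "version": version,
        "compression": compression,  # 0 means no compression
        "symbol_table_offset": sym_off,
        "symbol_table_size": sym_size,
        "start_position_offset": start_off,
    }


# Build a sample header in memory and read it back.
sample = GOB_HEADER.pack(b"GO", 5, 0, 11, 32, 43)
print(parse_gob_header(sample))
```

Because every field sits at a known byte position, reading the header needs no tokenizing, which is the property the post is after. Note that 2-byte offsets limit the addressable range to 65,535 bytes.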

HermanHiddema: Re: are there any binary Go file formats (2012-02-15 11:31) [#9266]

Is there a specific advantage to this that you don't get from simply compressing SGF?

((no subject)) (2012-02-16 07:33) [#9268]

Daniel: Well, among other things, binary files offer a few key advantages: random access (i.e. I can easily jump to any location in the file without having the entire file read into memory and parsed); I can completely ignore separators between values; they can be read faster, since binary files almost always use fewer CPU cycles; and it is possible to read and write to the file at the same time.

Essentially, it boils down to this: I have a large amount of flexibility, and I can get it to work with the program running on, say, a netbook without much extra overhead. Unfortunately, because of some of the features I am working on, the programming language is already enough of a memory hog that I really do not want to add the extra overhead of a parser to the file routine. Heck, the flag values are already inconsistent enough to cause some trouble with random access to the file. On the note about size: uncompressed, a single move encoded in a binary format is about 20% the size of the same move in uncompressed SGF, without the decompression delay involved with a compressed SGF file. So it saves not only size but time as well.
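The random-access point can be illustrated with fixed-width records. The sketch below assumes a hypothetical layout that is not taken from the GOB description: each move is one 16-bit little-endian word holding a color bit and a point index on a 19x19 board. The names `encode_move` and `read_move` are made up for the example.

```python
import io
import struct

MOVE = struct.Struct("<H")  # hypothetical: one 16-bit word per move
BOARD = 19                  # assumed board size


def encode_move(color, x, y):
    """Pack color (0 = black, 1 = white) and a 0-based point into 16 bits."""
    return MOVE.pack((color << 9) | (y * BOARD + x))


def read_move(f, moves_offset, n):
    """Seek straight to move n (0-based) without reading the moves before it."""
    f.seek(moves_offset + n * MOVE.size)
    (word,) = MOVE.unpack(f.read(MOVE.size))
    color, point = word >> 9, word & 0x1FF
    return color, point % BOARD, point // BOARD


# A stand-in file: 6 bytes of placeholder header followed by two moves.
f = io.BytesIO(b"HEADER" + encode_move(0, 15, 3) + encode_move(1, 3, 15))
print(read_move(f, 6, 1))  # prints (1, 3, 15)
```

This only works because every record has the same width. Variable-length records, such as moves carrying optional flags or comments, would need an index table to keep direct seeks possible.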

Re: ((no subject)) (2012-02-16 23:02) [#9269]

Bass: I do not know of a widely used binary format for go files; everybody seems to have rolled their own. Also, I am afraid you might be creating actual problems while trying to solve a non-problem: an HDD is essentially a sequential access (as opposed to random access) device, so it makes sense to read an entire file into memory in one go, especially if the files are small, as go game records typically are, so the trade-off in possibly wasted memory is small as well. As for the other benefits you list, I don't think they will actually be beneficial for you, except for the bit about saving some CPU cycles and a disk access or two.

As for SGF being a pain to parse, you are quite correct. However, most commented professional game records are going to be in that very format, so there is no way you can avoid it. If you want to preprocess them to a format more suitable for your application, that is of course fine, but unless you include the preprocessor (and thus, the SGF parser) in your application, your user will not be able to easily create his own content for your app.

On the more general points, the processing power and memory available on a netbook (or even a smartphone) are highly unlikely to be a bottleneck for your program. The memory of any reasonably modern device is likely to be big enough to hold every professional go game ever played at once; the entire GoGoD database up to the year 2009 (in SGF format, with Kombilo index files) is no bigger than 300M. Instead of trying to optimize memory and CPU use, you might want to concentrate your effort on the user interface, which is much more likely to cause problems than raw performance is.

The previous is of course only my view, please ignore any parts that do not happen to coincide with reality :-)

Also, please use punctuation and capital letters to make your text easier to read.

Daniel: Re: ((no subject)) (2012-02-17 01:58) [#9270]

Daniel: Sorry about the punctuation and capitalization problems; I typed that out quickly between classes.

I have considered the UI problems and have dealt with most of them, but that in itself has caused some other problems. Since this is also supposed to be a teaching tool, I am trying to write code that will analyze the past few dozen games played by the user to determine where the player is weak, so that it can give the user more problems in those areas during training sessions. This will include code that focuses on the opening, middle game, and endgame. It will attempt to determine the player's relative strength in these areas: life and death, reading ladders, territory estimation, endgame, joseki, memory, detecting horrible moves, etc., so that it can teach in a way that does not churn out players who are great only at life and death.

I have also implemented a few graphical aids for the UI, such as a smoothed influence map (I tried mono-color blocks, but the people I asked preferred a smoother look for the influence field representation). There is also an optional graphical representation of estimated territory. I am working on other GUI representations of concepts, including connection.

To be completely honest, I cannot think of any problems being caused unless the user decides to activate all of the visualizations. If that is done, the engine does not keep the smoothed representations of the influence function, connection function, or others, so it takes some time to calculate these values. With the current version of the engine running (I have it translating UGF files right now), in contact with a server, loading the values from the 20 files (to analyze weaknesses and improvements), and creating the bare-bones GUI and all of the optional features, a netbook slows down enough that the server cuts the connection, and generating the graphics and optional graphics takes a minute for just the main game. I tested it, and with a primitive version of the binary file I was able to increase the number of files being analyzed simultaneously to 35 before the server connection dropped.

Essentially, my philosophy with this engine is "hope for the best, prepare for the worst", and the binary format is one of the better ways I can prepare for the worst without sacrificing the quality of the engine, in both the graphics and the AI. (On a side note, the AI takes precedence over the graphics, but the graphics are part of the teaching tool system.)

Sensei's Library