Moved from Opening systematic classification.
Proposal: for a set of indexing systems to cover all 'opening positions' (really, positions on certain sub-boards with few stones).
Objective: an index that would make it easier to track down patterns, wherever they occur on SL. A search engine to match patterns may also become available in time, but that would be a complementary mechanism. For example, an index can make one aware of possibilities with which one wasn't familiar, whereas a search engine starts from a pattern one already recognises as significant.
Form of suggestion: that joseki and other patterns should be given codes such as cAZAp, interpreted in this sort of way:
c denotes a corner pattern: there would also be side patterns s on a 19x10 sub-board;
cA indexes corner patterns based on the 4-4 point;
Z is reserved for tenuki, so that we assume alternating play as the default and mark tenuki explicitly;
cAZA would be the 4463 enclosure, cAZB the 4464 enclosure and so on;
letters p, q, r ... in lower case would represent middle game continuations (probes, invasions and so on), meaning that cAZAp can stand for the 3-3 invasion of the enclosure, cAZAq the 3-4 contact play and so on.
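To make the structure of such codes concrete, here is a minimal sketch of how a code like cAZAp might be split into its parts. It assumes, beyond what the proposal states, that a code is one region letter (c or s), then one or more upper-case letters, then at most one trailing lower-case continuation letter. The names PatternCode and parse_code are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Region prefixes named in the proposal: c = corner, s = side.
REGIONS = {"c": "corner", "s": "side"}
TENUKI = "Z"  # reserved letter for tenuki in the proposal


@dataclass
class PatternCode:
    region: str                  # "corner" or "side"
    moves: List[str]             # upper-case letters; "Z" marks a tenuki
    continuation: Optional[str]  # trailing lower-case letter, if any


def parse_code(code: str) -> PatternCode:
    """Split a code such as 'cAZAp' into region, moves and continuation.

    Assumption: at most one lower-case continuation letter, at the end.
    """
    if not code or code[0] not in REGIONS:
        raise ValueError(f"unknown region prefix in {code!r}")
    body = code[1:]
    continuation = None
    if body and body[-1].islower():
        continuation = body[-1]
        body = body[:-1]
    if not body or not (body.isalpha() and body.isupper()):
        raise ValueError(f"expected upper-case move letters in {code!r}")
    return PatternCode(REGIONS[code[0]], list(body), continuation)


parsed = parse_code("cAZAp")
tenuki_count = parsed.moves.count(TENUKI)
```

What each letter means still depends on the letters before it (A after cAZ is the 4463 enclosure); the sketch only tokenises the code and does not look anything up.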
Comments here ...
My concern is that development and systematic application of this type of indexing system would tend to make SL incomprehensible to people familiar with the Go literature in general but not familiar with SL. Therefore I believe that we would end up with something more "average user" friendly if we were to concentrate on increasing the linking between existing pages and building more comprehensive index pages using the more familiar terminology used in other books. This is not nearly as compact as Charles' suggestion, but I personally think that it is more likely to make SL appealing to the majority of readers. --DaveSigaty
TDerz: Has any of the contributors above consulted an experienced documentalist, i.e. a librarian? (I am not one.)
Every proposal has pros and cons.
I foresee a particular problem with the Z classifier in Charles' proposal. Whether a move counts as tenuki is purely arbitrary for a beginner, who does not know that many amateur high-dan players would classify it as such, and it might equally be so for a professional Korean player.
Wouldn't any sequence with two consecutive moves by the same colour, i.e. B-B or W-W, already imply a White or a Black tenuki respectively?
If it is not only the pattern which is searched and classified, but also the sequence which leads to it, then I do not understand the need for the move classifier Z.
The information "tenuki Z" actually comes from knowledge of another joseki; it is not present in the position diagram, which anyone should be able to translate into the query.
Hence joseki of this type, which include Z, must be classified twice in order to be retrievable at all.
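The point about consecutive moves can be sketched in a few lines. Assuming the move order is recorded as a list of colours ("B" or "W"), a tenuki marker can be derived mechanically wherever the same colour plays twice in a row. The function name mark_tenuki is hypothetical, and the sketch says nothing about a bare position diagram, where the move order is not available.

```python
def mark_tenuki(colours):
    """Insert 'Z' wherever the same colour plays twice in a row.

    colours: an iterable of 'B' / 'W' in the order the moves were played.
    Returns a new list with a 'Z' before every repeated colour.
    """
    marked = []
    previous = None
    for colour in colours:
        if colour not in ("B", "W"):
            raise ValueError(f"unexpected colour {colour!r}")
        if colour == previous:
            # The opponent played elsewhere between these two moves.
            marked.append("Z")
        marked.append(colour)
        previous = colour
    return marked


alternating = mark_tenuki(["B", "W", "B"])
with_tenuki = mark_tenuki(["B", "B", "W"])
```

This supports the question above: when the sequence is stored, Z is redundant; it only carries extra information when all that is stored is the resulting position.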
Any classification system [2] whose purpose is to help you find (information on) joseki will have to deal with several desiderata, which are to some extent mutually exclusive [1]:
Information visualization tools for exploration:
Hence, what's my conclusion? The quintessence is to distinguish between two different goals:
Stick to good old diagrams as entries, when the user is a human. Give some shortcuts (4-4, 3-4, 5-3, 3-3 with follow-up letters).
For machine searchability I have no best answer.
It always has a relationship with redundancy, relevance and significance. Usually, one cannot optimize them all at once.
(This is where my 15 years' experience in chemical database searching comes in, which of course has quite different problems (recall overflow vs. significance) from searches in Go (there won't be so many entries on joseki).)
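For readers unfamiliar with the trade-off, precision and recall can be computed as below. This is a generic sketch of the standard set-based definitions, not part of any proposal on this page; the pattern codes in the example are made-up test data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of retrieved entries that are relevant
    recall    = share of relevant entries that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: three entries are actually relevant to the searcher.
relevant = {"cAZA", "cAZAp", "cAZAq"}
broad = precision_recall({"cAZA", "cAZB", "cAZAp", "cAZAq", "cAB"}, relevant)
narrow = precision_recall({"cAZAp"}, relevant)
```

The broad query finds everything relevant but also returns noise; the narrow query returns no noise but misses two of the three relevant entries.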
Readability for humans at this stage is most probably not important, as everything can be translated into visual diagrams for humans. Could this hint that the inverse route is also the most suitable, i.e. entering the search query graphically?
Perhaps I am missing the whole point at stake here; however, users (and I am one) do not care much about what happens below the surface of a machine. They want fast access (efficiency) to results.
It might also be that the different approaches above are each best in different, mutually exclusive situations (searches for joseki). However, if someone prefers one approach (search language, e.g. Charles', Doug's, AGA's, Tamsin's) over another, a simple translation tool could enable any user to work in his/her preferred language. Less specific queries (e.g. ones covering Charles' tenuki Z) would have to use wildcards (*).
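As a rough illustration of wildcard queries over such codes, the sketch below uses shell-style patterns from Python's standard fnmatch module. The index contents and the function name search are hypothetical; in particular "cAAp" is an invented code, included only to show what a wildcard in the tenuki position can also match.

```python
from fnmatch import fnmatchcase

# Hypothetical index of pattern codes (illustrative only).
INDEX = ["cAZA", "cAZB", "cAZAp", "cAZAq", "cAAp"]


def search(pattern, index=INDEX):
    """Return the codes matching a shell-style wildcard pattern.

    fnmatchcase is case-sensitive, which matters here because
    upper and lower case carry different meanings in the codes.
    '*' matches any run of characters, including none.
    """
    return [code for code in index if fnmatchcase(code, pattern)]


enclosure_and_followups = search("cAZA*")
any_route_to_Ap = search("cA*Ap")
```

Here "cAZA*" returns the enclosure together with its continuations, while "cA*Ap" ignores whether a tenuki was marked on the way.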
[1] as in "You cannot have the cheapest and best car."
[2] BTW, a very good overview of search and indexing can be found at Exhaustivity Specificity Precision Recall, which explains much better what I wanted to express above.