Here is a suggestion:
If someone were to create a zip file (or similar) consisting only of the SGF files of the diagrams on this site, with descriptive filenames (for instance, the second diagram on kyu exercise 15/solution might be called something like "kyuexercise15::solution::002.sgf"), then it would be possible to download the SGF files and use Kombilo or a similar tool to determine, say, whether a particular problem has already been posted, or where a fuseki pattern is mentioned...
It wouldn't be quite as nice as having an interactive pattern search built into the web site, but it might be much easier to implement.
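To make the idea concrete, here is a minimal sketch of how such an archive might be packed. It assumes the diagrams have already been converted to SGF text; the function names (archive_name, build_sgf_archive) and the sample SGF string are made up for illustration and are not part of the site's code.

```python
import io
import zipfile


def archive_name(page: str, subpage: str, index: int) -> str:
    """Build a descriptive member name, e.g. 'kyuexercise15::solution::002.sgf'."""
    return f"{page}::{subpage}::{index:03d}.sgf"


def build_sgf_archive(diagrams) -> bytes:
    """Pack (page, subpage, index, sgf_text) tuples into an in-memory zip file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for page, subpage, index, sgf_text in diagrams:
            archive.writestr(archive_name(page, subpage, index), sgf_text)
    return buffer.getvalue()


# Hypothetical sample: one diagram, already converted to SGF.
sample = [
    ("kyuexercise15", "solution", 2, "(;GM[1]SZ[19]AB[dd][pd]AW[dp])"),
]
data = build_sgf_archive(sample)
names = zipfile.ZipFile(io.BytesIO(data)).namelist()
```

One practical caveat: ":" is not allowed in file names on Windows, so a different separator might be needed if people are to unpack the archive there.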
What do people think?
Could be a good idea, depending on whether people would really use it. I agree that a built-in search would be much nicer, but until I find (or am pointed to) a good search algorithm, I am unwilling to implement it on the server side. Kombilo's algorithm does not cut it. It is too slow: I cannot have server processes run for 15 seconds for a single query. For an earlier discussion on the topic see search algorithms and pages linked from it.
Yes, I don't really expect to see a built-in search any time soon, because it would be a big load on the server, and maybe a lot of work for whoever has to implement it. That's why I'm looking for alternatives.
I for one would use it. I often find myself thinking "yes, I saw that shape discussed on SL somewhere, but now I don't know how to find it..." But of course you need to know that more than one person is interested before you start working on something.
It seems from discussion elsewhere that quite a few people would like a pattern search feature. Is it possible to implement my suggestion above as a makeshift until someone comes up with a better way?
I think it is worthwhile in a limited way. For instance, I have go books with an index to certain selected positions, often with no move numbers. Such indices could be constructed as regular pages on SL, and maybe that is a first step. You don't find all diagrams, and you don't enter a pattern to be matched, but it still is quite serviceable.
Also, constructing such pages will get us started on the idea of how to encode such diagrams, because we have to organize them, not just put them together randomly. So our experience could lead eventually to a good, searchable database.
Bob McGuigan: I think the organization problem might be non-trivial. For example, in joseki dictionaries which use diagram indices, the organization is usually by sequence of moves, not some global shape definition. But some sort of separate diagram index with links to pages seems useful. I've gotten used to scanning diagram indices in various "dictionary-style" books so it doesn't seem too inefficient.
Bob McGuigan: Would it speed up the searches to have a separate file of diagrams maintained on the server so that searches wouldn't have to distinguish diagrams from general text? If so, it would take some time to build the file of currently existing diagrams but in the future diagrams could be entered automatically as they are created in edits.
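As a rough sketch of what maintaining such a diagram file could look like: the code below pulls diagram blocks out of a page's raw text and stores them per page, so that a search never has to look at ordinary prose. It assumes that every diagram line in the page source starts with "$$"; that marker, the function names and the in-memory dictionary are assumptions for illustration, not a description of how the server actually works.

```python
def extract_diagrams(page_text: str) -> list[list[str]]:
    """Return each run of consecutive '$$' lines as one diagram (marker stripped)."""
    diagrams = []
    current = []
    for line in page_text.splitlines():
        if line.startswith("$$"):  # assumed diagram marker
            current.append(line[2:].strip())
        elif current:
            # A non-diagram line ends the diagram collected so far.
            diagrams.append(current)
            current = []
    if current:
        diagrams.append(current)
    return diagrams


def update_diagram_file(diagram_file: dict, page_name: str, page_text: str) -> None:
    """Hypothetical hook to call on every edit: replace the page's stored diagrams."""
    diagram_file[page_name] = extract_diagrams(page_text)


diagram_file = {}
sample_page = "Some text\n$$ . X .\n$$ O . .\nMore text\n$$ X X\n"
update_diagram_file(diagram_file, "SamplePage", sample_page)
```

Run once over all existing pages to build the file, then call the hook whenever a page is saved.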
Not being a very active database user, I don't know how the pattern search function works on GoBase, but it seems fairly fast. Could something like that be adapted to work on a diagram file?
About how many diagrams are there in total on Sensei's?
It seems to me the problem with search is that we want it to find sub-regions of boards.
As another potential work-around, an "exact match" search ought to be fairly fast. E.g., if I'm searching for the diagram below, I can remember it exactly, and finding something is better than finding nothing.
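One way an exact-match search could be made fast is to reduce every diagram to a single key and look queries up in a table. The sketch below assumes each diagram is a rectangular grid of characters (for example "X", "O" and "."), treats rotated and mirrored copies as the same diagram, and uses made-up function names. It is only an illustration of the idea, and it does not address the harder sub-region problem.

```python
import hashlib


def rotate(rows):
    """Rotate a rectangular grid 90 degrees clockwise."""
    return ["".join(col) for col in zip(*rows[::-1])]


def mirror(rows):
    """Flip a grid left to right."""
    return [row[::-1] for row in rows]


def variants(rows):
    """Return all 8 rotations and reflections of a grid."""
    out = []
    current = list(rows)
    for _ in range(4):
        out.append(current)
        out.append(mirror(current))
        current = rotate(current)
    return out


def canonical_key(rows):
    """Hash the lexicographically smallest variant, so all 8 share one key."""
    smallest = min("\n".join(v) for v in variants(rows))
    return hashlib.sha1(smallest.encode("ascii")).hexdigest()


def build_index(diagrams):
    """Map canonical key -> list of diagram names."""
    index = {}
    for name, rows in diagrams.items():
        index.setdefault(canonical_key(rows), []).append(name)
    return index


def find_exact(index, rows):
    """Look up a query diagram; cost is one hash plus one dictionary lookup."""
    return index.get(canonical_key(rows), [])


index = build_index({"page1": ["X.", "O."]})
```

With this index, a query such as find_exact(index, [".X", ".O"]) (the mirror image of the stored diagram) finds "page1", because both grids reduce to the same key.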
Pattern search has been on my agenda for some time now. Since March I have read through just about every 2D search algorithm I could lay my hands on. The results are not encouraging.
Therefore, I plan to do the following:
My plan is to do (1) before October. I'd like to do it earlier, but I cannot promise anything. (3) would be nice to have before the year is over.
Yay!
I'd be interested in reading up on the 2D search algorithms if you feel like posting links and still have them lying around...
I don't have all the links, but start at CiteSeer. From there follow cited papers, active bibliography, similar documents, ...
I read through some 50 papers during the last 6 months.
Arno, is it feasible to make thumbnails for index pages such as Gokyo Shumyo Tsumego Series? It's comparatively low tech, but I think it would be helpful.
First half of first step done, see FindPositionPage.