This post originated from an RSS feed registered with Ruby Buzz
by Gabriel Horner.
Original Post: Tags, Trees and Facets, Oh My!
Feed Title: Tagaholic
Feed URL: http://feeds2.feedburner.com/tagaholic
Feed Description: My ruby/rails/knowledge management thoughts
Inspired by wordnet, I initially tried to organize my tags into lexical trees/outlines. This means I tried to organize the tags/words not by what they mean to me but what they mean to everybody. I gave up on this as my tags began to differ to much from what the word actually meant. (Yes, I'll have to revisit this someday when I want to effectively share my tag relationships with others.)
So I began organizing my tags semantically ie what they mean to me. Trying to learn from past pitfalls, I resisted the urge to make mega tag trees and instead made multiple, lean inheritance trees. For an example of an inheritance tree, take this one of my learning_subject tag:
257: learning_subject 258: math 259: science 226: chem 225: bio 224: astro 265: physics 260: philosophy 290: art
Reading from a leaf node to the root: chem (chemistry) is a science is a learning_subject. Of what use is this inheritance tree? Smarter querying and less tag pollution. If I tag an item with physics, I can alternatively get it back using my science or learning_subject tag. Less tag pollution because I don't have to tag every physics item with science and learning_subject as well. (Nothing new for you semantictaggersout there).
The more I organized my tags, the more I noticed that most of my trees were just lists. Why so many lists? I had tag lists of colors, programming languages, music genres, applications ... Were these nascent inheritance trees or facets? Were they serving as attributes of objects? The more I looked around at my tags, the more I saw that I had been forming different objects with tags. I had application objects that had attributes of a programming language, app type, operating system, etc. I had blog post objects that had attributes of learning type, programming language, author, etc. In not treating tags as attribute values of a tag object, I had been cluttering my tags and losing semantic goodness. I had no idea if my tags were referring to an application mentioned in the webpage or the webpage itself. (Food for thought: at what point do facets become complex enough to be treated as an inheritance tree? What's the significance of that?)
So it looks like it's time for OO-tagging. With OO-tagging will come defining relationships between tag objects. I've also been itching to add modifiers between tags and their tagged subjects (either with a preposition or verb). Perhaps I can kill object relationships and modifiers with one coding stone. Then again, maybe W is respected abroad.