Month: December, 2005

Tag along on the Web 2.0 train

It is nearing the end of another year and I need to write one last post. I received a considerable amount of email in response to my last post on tagging, so in this one I shall dwell on the concept in a little more detail.
For the millions who use the internet and are unfamiliar with the concept, tags are words assigned to a webpage or other object of interest. They are supposed to be short, relevant and correctly spelled, and ideally a single word for usability’s sake! The idea is that you add tags to content that interests you, so that you can find it again in future and discover content of a similar nature tagged by fellow taggers. That was easy, wasn’t it? If you are feeling all gung-ho, then let me give you the bad news: it is not as easy as it sounds, for the concept is still in beta, though you won’t find that mentioned anywhere!

He couldn’t be more wrong. Serendipity is a product of expanding tag communities. In days gone by, the way into the internet was by typing keywords into the search boxes of companies with colourful logos. Tagging has changed all that. If one stumbles on an interesting site, all one has to do is click the tags accompanying the post to come across a veritable sea of links with the same tag. Whether they are all really relevant to what you expected or wanted to see is a thought for another day! But you can wade through the internet and keep finding many interesting links while searching for whatever started the activity in the first place.

This was a big year for tags. Let’s take a look back at the major events in the world of tagged metadata.

Technorati introduces tags in January. Technorati’s tags were among the early implementations of tagging for blogs. They are picked up when the blogs associated with them are crawled. This is radically different from how del.icio.us does its tagging, wherein the tags are owned by the site.
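To give a feel for what "picked up when the blogs are crawled" might look like, here is a minimal sketch. It assumes the crawler looks for links marked rel="tag" and takes the last part of the link address as the tag name, which is my understanding of the convention and not a description of Technorati's actual code. The class name RelTagParser and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote


class RelTagParser(HTMLParser):
    """Collect tag names from <a rel="tag" href="..."> links (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, or be missing entirely.
        rel_values = (attr.get("rel") or "").split()
        href = attr.get("href")
        if "tag" in rel_values and href:
            # Assumption: the tag name is the last segment of the link path.
            path = urlparse(href).path.rstrip("/")
            last_segment = path.rsplit("/", 1)[-1]
            if last_segment:
                self.tags.append(unquote(last_segment))


# Made-up blog post fragment: two tag links and one ordinary link.
POST_HTML = """
<p>Filed under
<a href="http://technorati.com/tag/tagging" rel="tag">tagging</a> and
<a href="http://technorati.com/tag/web2.0" rel="tag">web 2.0</a>.
<a href="http://example.com/about">About</a></p>
"""

parser = RelTagParser()
parser.feed(POST_HTML)
print(parser.tags)
```

The point of the sketch is the ownership difference: here the tags live in the blogger's own page and the crawler merely reads them, whereas on del.icio.us they are stored by the site itself.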

Yahoo buys Flickr and del.icio.us, kicking off what is now called “Web 2.0”. This has given rise to a veritable stew of posts on whether Yahoo’s search systems are going to be better than the databases maintained by the arachnids of Google. A definite case of mass arachnophobia. I personally believe that even though Google is having trouble with splogs, link farms and the lot, its concept of organizing the world’s information is likely to yield better search results than Yahoo’s My Web 2.0, launched in June. And no, I don’t have a pet spider! Google has taken up the mantle with tagging too: it now allows tagging of pages, though the tags are private, and Google Base allows tagging as well. Amazon launched tags for books in November.

It seems like every man and his dog has a tag now, doesn’t it? One would be a fool to think that just because the major players have taken up tagging, the whole process will be simple now. There are more flavours of tagging than ever before. My previous argument, that non-standardization of the tagosphere wreaks havoc on the concept, still holds good. 37Signals has an excellent write-up on the matter. The fact that there are multiple interfaces is a bit confusing for the end user, because it introduces an unnecessary learning curve for a supposedly simple task that is waiting for mass market appeal. For the sites that make sense of the input, the format in which the tags were originally entered probably doesn’t matter. Once the system has processed them it makes absolutely no difference whether you entered them with a space, comma or colon. It’s not about incompatible formats, but simply different ways of entering information into systems. From the end users’ perspective it’s irrelevant once the system accepts the data and breaks the string down into individual tags.
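As a rough illustration of that last point, here is a small sketch of how a site might break an entered string down into individual tags regardless of the separator used. The function name split_tags and the rule of lower-casing and dropping duplicates are my own assumptions, not how any particular site does it.

```python
import re


def split_tags(raw: str) -> list[str]:
    """Break a user-entered tag string into lower-cased, de-duplicated tags."""
    # Treat commas, colons and any run of whitespace as separators.
    pieces = re.split(r"[,:\s]+", raw.strip())
    seen = set()
    tags = []
    for piece in pieces:
        tag = piece.lower()
        # Skip empty pieces (e.g. from a trailing comma) and repeats.
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags


# The same three tags entered in three different styles.
for raw in ("web2.0 tagging rss", "web2.0, tagging, rss", "web2.0:tagging:rss"):
    print(split_tags(raw))
```

All three inputs come out as the same list. Note the catch, though: a rule this simple would split a multi-word tag such as "web 2.0" in two, which is exactly where the different interfaces start to disagree.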

Relevance of the tags to the tagged content is a problem and will continue to be so as long as people have different tastes. But the problem waiting to happen is the “tag bomb”, which could be defined as spammers showering everything in sight with irrelevant tags that would show up in search results, hoping that somebody would click on them.

There is still the problem of searching all this data. On one side are the likes of Google, with dedicated search engines crawling the net and indexing content, and on the other is an army of taggers tagging everything in sight. At last count del.icio.us had about 100K users. Can random users tagging data yield better results than dedicated bots? I am having visions of “The Matrix” now.

Hybernaut.com has an excellent write up on this. I quote:

“Is the reliance on structured taxonomy an achilles heel of the user-fed Directory model? Perhaps the most likely outcome of all this will be a joint solution. If someone had the power to merge the tags collected by Technorati (or one of their peers) with the user-tagged content of Delicious, then they would be able to produce some powerful search results. And since search and syndication appear to be merging all over the place (Technorati ‘watchlists’, PubSub), someone with access to both crawled and user-fed tag databases would be able to produce superior syndication of serial microcontent like news and blog posts as well.”

In the meantime 43 people have tagged this site with the following tags,

I remain.

Happy New Year!


Happy new year to all.

Feeding frenzy

RSS is all the rage now and we have the unified icon to boot. It is available in a variety of flavours, as choices are good: RSS and Atom are the ones available now.

Colin D Devroe called for a unified definition of RSS feeds while lambasting Wikipedia for providing a misleading description to novices. Having thought about it, RSS may be defined as:

“A method of informing the reader of changes to a webpage without actually having to visit the site. Weblogs and news websites are common sources for web feeds, but feeds are also used to deliver structured information ranging from weather data to ‘top ten’ lists of hit tunes.”

If you would like to see Wikipedia’s definition then here it is:

“A web feed is a document (often XML-based) which contains content items, often summaries of stories or weblog posts with web links to longer versions. Weblogs and news websites are common sources for web feeds, but feeds are also used to deliver structured information ranging from weather data to ‘top ten’ lists of hit tunes. While RSS feed is by far the most common term, the generic ‘web feed’ terminology is sometimes used by writers hoping to make the concept clear to novice users, and by advocates of other feed formats.” (Wikipedia.org)
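For anyone who has never looked inside a feed, here is a minimal sketch of what "an XML-based document which contains content items" amounts to. The feed text, its titles and the example.com addresses are all made up for illustration, and read_items is a hypothetical helper. The element names (channel, item, title, link, description) are the standard RSS 2.0 ones.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed with two items, each a title, a link and a summary.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example weblog</title>
    <link>http://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>Tag along on the Web 2.0 train</title>
      <link>http://example.com/tagging</link>
      <description>A short teaser for the post.</description>
    </item>
    <item>
      <title>Feeding frenzy</title>
      <link>http://example.com/feeds</link>
      <description>Another teaser.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml: str) -> list[dict]:
    """Return the title, link and summary of every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items


for entry in read_items(FEED):
    print(entry["title"], "->", entry["link"])
```

A feed reader does little more than fetch such a document every so often and show whichever items it has not seen before. Whether the description holds a teaser or the whole post is entirely up to the publisher.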

What I don’t like about Wikipedia’s definition is this bit: “A web feed is a document (often XML-based) which contains content items, often summaries of stories or weblog posts with web links to longer versions”. It looks like I am not the only one, and rightly so, as it is a matter of great consternation to me that a good few bloggers use this technology to their own selfish ends. If Really Simple Syndication is all about making it easy for faithful readers to access content without wasting time visiting the website, then why would you want to introduce a further step into the process by providing just a teaser? It’s non(ad)sense.

It’s probably the argument that the blogger values the readers’ time so much that he doesn’t want them to waste it reading content that would not interest them, so they need not follow the teaser to the site. Well, they would not have subscribed to the feed if it was only the odd link that interested them, duh! At least I wouldn’t, anyway.

If you want a faithful following then full-text feeds are a must, period. We need more people publishing them so that the teaser habit stops. Another thing I like about RSS is that it goads bloggers to come up with relevant and good material, and if that’s not the case then all it requires is one click on the unsubscribe button and poof! Democracy at the touch of a button via RSS.

Gollum Browser

Just randomly browsing the Wikipedia entries, I came across this rather interesting iteration of a browser called Gollum!
[Screenshot: Gollum beta]

It is being developed by Harald Hanek, initially for his daughter and now under the GPL for us all. In his own words he describes Gollum as such: “Gollum is a Wikipedia Browser for fast and eyefriendly browsing through the free encyclopedia ‘Wikipedia’.
Gollum gives you access to nearly all Wikipedias in all languages. Further more Gollum gives you some special features which allow you to easily customize your work with Wikipedia.

In my opinion the interface of Wikipedia is too overloaded and confusing. So let’s get an easy to use interface. Gollum, the intuitive way to the powerful knowledge of Wikipedia.”

[Screenshot: Gollum navigation]

Gollum is based on PHP and JavaScript and uses XMLHttpRequest for communication, a technique better known as Ajax. There is no need for a database, and the code is ready for PHP5. The client therefore only needs a browser such as Firefox, MS Internet Explorer, Netscape or Safari with JavaScript enabled. According to the website Safari has yet to be tested, but it works perfectly fine for me.

As you can see the navigation is nice and easy and the content is displayed in a very readable format. It loads pretty fast too, and has good localizations.

It is soon to be available as a beta download.

