Transparent Wiki-integrated Search Engine


Greg at Bloodhound brought this to my attention (via Techmeme, which linked to the original Times of London article): "Founder of Wikipedia plans search engine to rival Google." The new search engine project, code-named WikiAsari, will allow wiki editors to filter search results by identifying spam and junk sites.

(Side note: Asari is a Japanese word for "rummaging around", but the word has some slovenly connotations and is used in contexts such as homeless people rummaging around or less than ethical people "searching" for prey... I wonder if this will hinder a launch in Japan, a country that prizes its hierarchical vocabulary?)


Greg posted fine insights into the implications and problems of WikiAsari. Briefly: 1) noteworthy but unknown sites may miss being ranked, 2) query results are subject to gaming by vested editorial factions, and 3) the engine could work well if mashed up with Google results and refined by wiki editors. I agree on all points.

I dug a little deeper at the Wikia website (WikiAsari's developer) and learned that WikiAsari is built on Nutch and Lucene, which together serve as an open source search engine platform. The main premise behind WikiAsari's credibility is the transparency of the open source code: attempts to game the search engine would, by assumption, be monitored by the wiki community. Google and its cousins keep their search algorithms in a black box and lack the "social search" component, and this will be WikiAsari's initial differentiating advantage.

From the Times of London article:

“Essentially, if you consider one of the basic tasks of a search engine, it is to make a decision: ‘this page is good, this page sucks’,” [Wikipedia founder] Mr Wales said. “Computers are notoriously bad at making such judgments, so algorithmic search has to go about it in a roundabout way.” “But we have a really great method for doing that ourselves,” he added. “We just look at the page...”

In conclusion, WikiAsari only needs to prove two things: 1) that the Nutch search engine platform can deliver results competitive in relevance with Google, Yahoo, and Live, which can then be filtered by wiki editors, and 2) that the resulting WikiAsari product delivers far less spam and junk.
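
To make the two-step idea concrete, here is a minimal sketch of how editor filtering might sit on top of an engine's ranked results. Nothing here comes from Wikia or Nutch: the `Result` type, the `filter_results` function, and the set of editor-flagged domains are all hypothetical names I made up for illustration, and a real system would surely be more involved.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    score: float  # relevance score from the underlying engine (hypothetical)


def filter_results(results, flagged_domains):
    """Drop results whose host editors have flagged, then rank by score.

    `flagged_domains` is an assumed, editor-maintained set of lowercase
    host names; it is not part of any real WikiAsari or Nutch API.
    """
    kept = []
    for result in results:
        host = (urlparse(result.url).hostname or "").lower()
        if host not in flagged_domains:
            kept.append(result)
    # Highest-scoring results first, as the engine would present them.
    return sorted(kept, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    engine_results = [
        Result("http://example.com/listing", 0.90),
        Result("http://spam.example/leads", 0.95),
        Result("http://example.org/guide", 0.50),
    ]
    for r in filter_results(engine_results, {"spam.example"}):
        print(r.url, r.score)
```

The point of the sketch is only that the engine's relevance ranking (point 1) and the editors' spam judgments (point 2) are separate steps, so each can be evaluated on its own.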

The implications of a WikiAsari product for the real estate vertical would be profound because, in theory, the lead generation sites that many Realtors consider intrusive could be filtered out of results.


I discussed the potential evolution of wikis on Zillow Blog last Tuesday as part of the Yankee Blog Swap. WikiAsari fits right in with why I think wikis will continue to "mashup" into new products, and recast themselves as more than just encyclopedic sources like Wikipedia.
