
The Anatomy of a Large-Scale Social Search Engine

February 3, 2010, 2:06 am

The folks at Aardvark have posted an ambitious paper over on the 'vark blog. Titled after Brin and Page's original “Anatomy of a Large-Scale Hypertextual Web Search Engine”, the paper presents the Aardvark engine and, in its authors' words: "describes the fundamental differences between the traditional “Library” paradigm of web search — in which answers are found in existing online content — and the new “Village” paradigm of social search — in which answers arise in conversation with the people in your network."

I have read most of the paper, which has been accepted at WWW 2010 (it reminded me of all the search papers I read in preparation for writing The Search), and found a lot worthy of interest.

First, the paper's authors, both of whom have worked at Google, clearly have a sense of potential history here, in that they not only crib the title of Google's original paper, they also mirror its first line (substituting "Aardvark" for "Google", of course). Now that's some b*lls. Of course, when Larry and Sergey first presented Google, they couldn't even get their paper accepted (it took three tries, if I recall correctly; someone should write a book about that...).

Second, it's unusual for a Valley startup to lay out its architecture and technological specs as willingly as Aardvark has. There's a lot of math in here that I couldn't parse even if I had the will to try.

Third, we learn some cool things about how Aardvark works. Check this quote out: "...unlike quality scores like PageRank [13], Aardvark’s quality score aims to measure intimacy rather than authority. And unlike the relevance scores in corpus-based search engines, Aardvark’s relevance score aims to measure a user’s potential to answer a query, rather than a document’s existing capability to answer a query."
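To make that distinction concrete, here is a minimal sketch of ranking people rather than documents. Everything in it is an assumption for illustration: the `Candidate` type, the 0-to-1 `intimacy` and `relevance` values, and the choice to combine them by simple multiplication are mine, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    intimacy: float   # hypothetical 0..1 closeness between asker and candidate
    relevance: float  # hypothetical 0..1 estimate that the candidate can answer


def rank_candidates(candidates):
    """Order candidates by intimacy * relevance, highest first.

    Multiplying the two scores is an assumed combination rule, used here
    only to show how a close contact can outrank a distant expert.
    """
    return sorted(candidates, key=lambda c: c.intimacy * c.relevance, reverse=True)


candidates = [
    Candidate("close_friend", intimacy=0.9, relevance=0.4),
    Candidate("distant_expert", intimacy=0.2, relevance=0.95),
    Candidate("colleague", intimacy=0.6, relevance=0.7),
]

ranked = rank_candidates(candidates)
print([c.name for c in ranked])
# ['colleague', 'close_friend', 'distant_expert']
```

With these made-up numbers, the distant expert comes last despite having the highest relevance, which is the flavor of "intimacy rather than authority" the quote describes.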

Also interesting: "...this involves modeling a user as a content-generator, with probabilities indicating the likelihood she will likely respond to questions about given topics. Each topic in a user profile has an associated score, depending upon the confidence appropriate to the source of the topic. In addition, Aardvark learns over time which topics not to send a user questions about..."
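A rough sketch of what such a user profile might look like follows. The source names, the confidence values, the `UserProfile` class, and the idea of "muting" a topic are all hypothetical stand-ins I made up to illustrate the quote; the paper's actual model is more involved.

```python
from dataclasses import dataclass, field

# Hypothetical confidence per topic source; the values are invented for illustration.
SOURCE_CONFIDENCE = {"self_declared": 0.9, "friend_suggested": 0.6, "inferred": 0.3}


@dataclass
class UserProfile:
    name: str
    topic_scores: dict = field(default_factory=dict)
    muted_topics: set = field(default_factory=set)

    def add_topic(self, topic, source):
        """Score a topic by the confidence of its source, keeping the best score seen."""
        score = SOURCE_CONFIDENCE[source]
        self.topic_scores[topic] = max(score, self.topic_scores.get(topic, 0.0))

    def mute(self, topic):
        """Record a topic the user should no longer be sent questions about."""
        self.muted_topics.add(topic)

    def answer_likelihood(self, question_topics):
        """Weighted sum of topic scores; question_topics maps topic -> weight."""
        return sum(
            weight * self.topic_scores.get(topic, 0.0)
            for topic, weight in question_topics.items()
            if topic not in self.muted_topics
        )


alice = UserProfile("alice")
alice.add_topic("hiking", "self_declared")
alice.add_topic("tax law", "inferred")

question = {"hiking": 0.7, "tax law": 0.3}
before = alice.answer_likelihood(question)
alice.mute("tax law")  # e.g. after she repeatedly declines tax questions
after = alice.answer_likelihood(question)
print(round(before, 2), round(after, 2))
# 0.72 0.63
```

The point of the sketch is only that a topic's score depends on where the topic came from, and that the profile changes as the system learns what a user does not want to be asked.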

There's a lot more like this in the paper, and it's worth reading. The authors even tested Aardvark's results against Google's, with the outcome being something of a push (see the last page for details). Not bad for an upstart service.

Lastly, we learn a lot about the service, thanks to a number of charts, including something about Aardvark's growth, which I had not really anticipated. It's up and to the right, as you can see from the chart.



