nutch-dev mailing list archives

From Dennis Kubes <ku...@apache.org>
Subject Re: Ranking & Scoring Algorithm Pseudocode
Date Mon, 01 Jun 2009 01:27:45 GMT
There isn't any pseudocode for this.  The code for the main algorithm is 
in the LinkRank class.  It is similar in nature to PageRank except it 
has the ability to filter reciprocal links.  If the Link Loops program 
is run it also has the ability to filter out link cycles, but that 
program is O(n) running time so not very efficient.
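
As a rough illustration of the idea described above (a PageRank-style iteration that ignores reciprocal links), here is a minimal Python sketch. It is not the Nutch LinkRank code. The function name `link_rank`, its parameters, the 0.85 damping factor, the even redistribution of rank from pages with no outlinks, and the choice to drop both directions of a reciprocal pair are all assumptions made for this example.

```python
def link_rank(links, damping=0.85, iterations=20, drop_reciprocal=True):
    """PageRank-style scores for a directed graph given as (source, target) pairs.

    Hypothetical helper for illustration only; not taken from Nutch.
    """
    links = list(links)
    nodes = sorted({page for link in links for page in link})
    # Ignore self-links and duplicate links.
    edges = {(src, dst) for src, dst in links if src != dst}
    if drop_reciprocal:
        # Assumption: when two pages link to each other, drop both directions.
        edges = {(src, dst) for (src, dst) in edges if (dst, src) not in edges}

    outlinks = {page: [] for page in nodes}
    for src, dst in edges:
        outlinks[src].append(dst)

    n = len(nodes)
    scores = {page: 1.0 / n for page in nodes}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread evenly over all pages.
        dangling = sum(scores[page] for page in nodes if not outlinks[page])
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {page: base for page in nodes}
        for src, targets in outlinks.items():
            if targets:
                share = damping * scores[src] / len(targets)
                for dst in targets:
                    new_scores[dst] += share
        scores = new_scores
    return scores


if __name__ == "__main__":
    example = [("a", "c"), ("b", "c"), ("c", "d"), ("d", "c")]
    for page, score in sorted(link_rank(example).items()):
        print(page, round(score, 4))
```

In the example graph, pages c and d link to each other. With `drop_reciprocal=True` that pair contributes nothing, so d ends up with the same score as a page that has no inlinks at all; with `drop_reciprocal=False` d inherits rank from c.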

The LinkRank class is just a single score factor, though. The new 
indexing system allows multiple factors to be combined, and LinkRank may 
be only one of them.

If you want to understand how the algorithm works, I suggest reading the 
early PageRank papers.  Here are some links which you may find useful:

http://en.wikipedia.org/wiki/PageRank
http://www.ianrogers.net/google-page-rank/


Dennis

atencorps wrote:
> Hi,
> 
> I came across the Ranking & Score system in Nutch 1.0 ( which includes the
> webgraph, linkrank etc).
> 
> My question is , where can I find the pseudocode for the Ranking & Scoring
> Algorithm/System in place in Nutch 1.0 ?
> 
> Thanks 
> 
