nutch-dev mailing list archives

From "Ned Rockson" <nrock...@stanford.edu>
Subject InvertLinks logical problem?
Date Tue, 09 Oct 2007 07:02:39 GMT
In the InvertLinks mapper (LinkDb.map()), I'm not sure whether there is a
mistake or whether I'm misinterpreting the code.  It's in the for loop, at
lines 129 and 134 in trunk.  The mapper outputs an Inlinks object, which is
a collection of Inlink objects.  However, on each pass through the loop it
clears the collection of inlinks, repopulates it, and then outputs the
Inlinks.  Thus each emitted Inlinks holds only zero or one Inlink, rather
than all of them being collected in the same object.  The reducer
(LinkDbMerger.reduce()) then uses an iterator to collect the data from
every Inlinks object it receives.  So the reducer looks robust enough to
deal with multiple Inlink objects in each Inlinks container, even though
the mapper never seems to emit more than one.
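
To make the pattern concrete, here is a minimal sketch of what I am describing, written with plain Java collections rather than the actual Nutch or Hadoop classes.  The names InvertLinksSketch, Emit, map and reduce are hypothetical stand-ins, not the real LinkDb or LinkDbMerger code, and I am assuming the mapper keys its output by the target URL.  Under that assumption each outlink goes out under a different key, so a one-element container per iteration would be expected, and the merging is left to the reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the map/reduce pair; not Nutch code.
class InvertLinksSketch {

    // One mapper output record: key = target URL, value = source URLs.
    static final class Emit {
        final String toUrl;
        final List<String> inlinks;

        Emit(String toUrl, List<String> inlinks) {
            this.toUrl = toUrl;
            this.inlinks = inlinks;
        }
    }

    // Mapper side: for each outlink of a page, emit a container that
    // holds only the single inlink (fromUrl), keyed by the target URL.
    static List<Emit> map(String fromUrl, List<String> outlinks) {
        List<Emit> out = new ArrayList<>();
        for (String toUrl : outlinks) {
            List<String> inlinks = new ArrayList<>(); // starts empty on every iteration
            inlinks.add(fromUrl);
            out.add(new Emit(toUrl, inlinks));
        }
        return out;
    }

    // Reducer side: merge every container that shares a target URL.
    // addAll copes with containers of any size, not just one element.
    static Map<String, List<String>> reduce(List<Emit> emitted) {
        Map<String, List<String>> merged = new TreeMap<>();
        for (Emit e : emitted) {
            merged.computeIfAbsent(e.toUrl, k -> new ArrayList<>()).addAll(e.inlinks);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Emit> emitted = new ArrayList<>();
        emitted.addAll(map("http://a.example/",
                Arrays.asList("http://c.example/", "http://d.example/")));
        emitted.addAll(map("http://b.example/",
                Arrays.asList("http://c.example/")));
        // Prints each target URL with the merged list of pages linking to it.
        System.out.println(reduce(emitted));
    }
}
```

If the real mapper does key by target URL like this, then the per-iteration clear would be deliberate rather than a bug, and the reducer's ability to handle larger containers would matter mainly when already-merged data is fed back in.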
