nutch-dev mailing list archives

From Doug Cutting <cutt...@nutch.org>
Subject Re: [Nutch-dev] incremental crawling
Date Thu, 01 Dec 2005 20:53:14 GMT
Matt Kangas wrote:
> #2 should be a pluggable/hookable parameter. "high-scoring" sounds like
> a reasonable default basis for choosing recrawl intervals, but I'm sure
> that nearly everyone will think of a way to improve upon that for their
> particular system.
> 
> e.g. "high-scoring" ain't gonna cut it for my needs. (0.5 wink ;)

In NUTCH-61, Andrzej has a pluggable FetchSchedule.  That looks like a 
good idea.

http://issues.apache.org/jira/browse/NUTCH-61
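
As a rough illustration of what "pluggable" could mean here, the sketch below separates the recrawl-interval decision from the scheduler that uses it. Every name in it (RecrawlPolicy, ScoreBasedPolicy, FixedIntervalPolicy, RecrawlScheduler) is hypothetical and made up for this example; it is not the FetchSchedule API from NUTCH-61, and it assumes page scores are normalized to the range 0.0 to 1.0.

```java
// Hypothetical sketch only: none of these names come from Nutch or NUTCH-61.

/** Hook that decides how long to wait before fetching a page again. */
interface RecrawlPolicy {
    long nextIntervalSeconds(double score, long previousIntervalSeconds);
}

/** Assumed default: higher-scoring pages are recrawled sooner. */
final class ScoreBasedPolicy implements RecrawlPolicy {
    private final long minSeconds;
    private final long maxSeconds;

    ScoreBasedPolicy(long minSeconds, long maxSeconds) {
        if (minSeconds <= 0 || maxSeconds < minSeconds) {
            throw new IllegalArgumentException("need 0 < minSeconds <= maxSeconds");
        }
        this.minSeconds = minSeconds;
        this.maxSeconds = maxSeconds;
    }

    @Override
    public long nextIntervalSeconds(double score, long previousIntervalSeconds) {
        // Treat a missing (NaN) score as the lowest score, then clamp to [0, 1].
        double s = Double.isNaN(score) ? 0.0 : Math.max(0.0, Math.min(1.0, score));
        // Linear interpolation: score 1.0 gives minSeconds, score 0.0 gives maxSeconds.
        return Math.round(maxSeconds - s * (maxSeconds - minSeconds));
    }
}

/** Alternative plug-in for sites where score is the wrong signal. */
final class FixedIntervalPolicy implements RecrawlPolicy {
    private final long seconds;

    FixedIntervalPolicy(long seconds) {
        this.seconds = seconds;
    }

    @Override
    public long nextIntervalSeconds(double score, long previousIntervalSeconds) {
        return seconds; // ignores both inputs on purpose
    }
}

/** The scheduler only sees the interface, so policies can be swapped freely. */
final class RecrawlScheduler {
    private final RecrawlPolicy policy;

    RecrawlScheduler(RecrawlPolicy policy) {
        this.policy = policy;
    }

    long nextFetchTime(long nowSeconds, double score, long previousIntervalSeconds) {
        return nowSeconds + policy.nextIntervalSeconds(score, previousIntervalSeconds);
    }
}
```

With this shape, "high-scoring" stays the default, and anyone whose needs differ would pass a different RecrawlPolicy to the scheduler without touching the scheduler itself.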

Doug
