nutch-dev mailing list archives

From Daniele Menozzi <me...@ngi.it>
Subject Re: Problems on Crawling
Date Sat, 17 Sep 2005 10:08:34 GMT
On 11:44:00 17/Sep, Piotr Kosiorowski wrote:
> Yes - depth is in fact the number of iterations of the
> generate/fetch/update cycle.

ok, now it's clear :)
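
To check my understanding, here is a toy sketch of depth as the number of generate/fetch/update rounds. It is plain Python, not Nutch code: LINKS, crawl and the in-memory "db" are hypothetical names made up for illustration, and the real tools work on the web db and on segments on disk.

```python
# Toy model only: LINKS and crawl() are hypothetical, not Nutch APIs.
# LINKS maps each page to the pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": [],
    "d": ["e"],
    "e": [],
}


def crawl(seeds, depth):
    """Run `depth` generate/fetch/update rounds and return the fetched pages."""
    db = {url: False for url in seeds}  # url -> already fetched?
    for _ in range(depth):
        # generate: pick every known page that has not been fetched yet
        segment = sorted(url for url, fetched in db.items() if not fetched)
        if not segment:
            break  # nothing left to fetch
        # fetch: "download" each page and collect its outlinks
        outlinks = {url: LINKS.get(url, []) for url in segment}
        # update: mark pages as fetched and add newly discovered links
        for url in segment:
            db[url] = True
            for link in outlinks[url]:
                db.setdefault(link, False)
    return sorted(url for url, fetched in db.items() if fetched)


print(crawl(["a"], 1))  # only the seed page
print(crawl(["a"], 3))  # pages reachable within three rounds
```

So with one seed, each extra unit of depth reaches one link further away from it.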

> nutch generate will include already fetched pages in a new segment for
> fetching after some time (I think the default is 30 days and you can change
> it in the config file). And if you deduplicate segments, the old page would
> be removed from the index.

ok, thank you for the explanation!
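
Again as a toy sketch of how I read this (plain Python, not Nutch code): a page becomes eligible for a new segment once its last fetch is older than the re-fetch interval. The 30 days is only the guessed default quoted above, and REFETCH_INTERVAL, due_for_fetch and generate are hypothetical names for illustration.

```python
from datetime import datetime, timedelta

# Assumption: 30 days, taken from the "I think default is 30 days" guess above.
REFETCH_INTERVAL = timedelta(days=30)


def due_for_fetch(last_fetched, now, interval=REFETCH_INTERVAL):
    """A page is due if it was never fetched or its last fetch is too old."""
    return last_fetched is None or now - last_fetched >= interval


def generate(db, now):
    """Return the pages that would go into the next segment (toy version)."""
    return sorted(url for url, last in db.items() if due_for_fetch(last, now))


now = datetime(2005, 9, 17)
db = {
    "old-page": datetime(2005, 8, 1),     # fetched 47 days ago, due again
    "recent-page": datetime(2005, 9, 10), # fetched 7 days ago, skipped
    "new-page": None,                     # never fetched
}
print(generate(db, now))
```

The re-fetched copy of "old-page" would then exist in two segments, which is where deduplication comes in to drop the older one from the index.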

> regards
> Piotr

regards
	Menoz

-- 
		      Free Software Enthusiast
		 Debian Powered Linux User #332564 
		     http://menoz.homelinux.org
