nutch-dev mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Performance issues with queue-based fetching
Date Wed, 20 May 2009 00:27:25 GMT
Hi all,

I just posted some performance figures from a test crawl I did using 
an alternative queue-based fetcher (Bixo) at:

http://ken-blog.krugler.org/2009/05/19/performance-problems-with-verticalfocused-web-crawling/

From this data, and from my previous experience using Nutch for 
vertical crawls, I keep wondering whether some of the performance 
difference between the original fetcher and Fetcher2 is due to bugs 
in the old fetcher (basically impolite fetching).

Has anybody done any testing with the old fetcher to verify that it's 
acting politely, especially near the end of a crawl?

-- Ken
-- 
Ken Krugler
+1 530-210-6378
