nutch-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (NUTCH-69) fetcher.threads.per.host ignored
Date Fri, 08 Jul 2005 14:39:10 GMT
     [ http://issues.apache.org/jira/browse/NUTCH-69?page=all ]
     
Andrzej Bialecki  resolved NUTCH-69:
------------------------------------

    Resolution: Invalid

This behaviour is caused by improper configuration. When crawling fewer hosts than (fetcher
threads / threads per host), some threads will always be blocked. Solution: change the configuration
to use fewer threads or more threads per host, or increase max.http.delay so that blocked
threads wait longer.
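
The arithmetic behind this can be sketched as follows. This is only an illustration, not Nutch code: the function name blocked_threads and the numbers used are hypothetical, and it assumes that at most (hosts * threads per host) threads can be fetching at any one moment.

```python
def blocked_threads(fetcher_threads: int, threads_per_host: int, hosts: int) -> int:
    """Return how many fetcher threads must wait at any given moment.

    Assumption (illustrative model, not taken from the Nutch source):
    each host accepts at most `threads_per_host` concurrent fetches, so
    no more than hosts * threads_per_host threads can be active at once.
    """
    if fetcher_threads < 0 or threads_per_host < 0 or hosts < 0:
        raise ValueError("all arguments must be non-negative")
    max_active = hosts * threads_per_host
    # Threads beyond the active limit have no free host slot and block.
    return max(0, fetcher_threads - max_active)


if __name__ == "__main__":
    # Hypothetical setup: 100 fetcher threads, 1 thread per host, 10 hosts.
    # 10 < 100 / 1, so most threads are blocked.
    print(blocked_threads(100, 1, 10))
    # Same thread count with 100 hosts: every thread can find a slot.
    print(blocked_threads(100, 1, 100))
```

Under this model, using fewer threads or allowing more threads per host both shrink the number of blocked threads, which matches the suggested configuration changes.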

> fetcher.threads.per.host ignored
> --------------------------------
>
>          Key: NUTCH-69
>          URL: http://issues.apache.org/jira/browse/NUTCH-69
>      Project: Nutch
>         Type: Bug
>   Components: fetcher
>     Reporter: Matthias Jaekle

>
> Fetcher ignores 'maximum threads per host'.
> If you fetch only a few domains with multiple threads, some webservers feel attacked or
> can no longer serve you.
> So you lose lots of existing pages in your segments.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

