nutch-dev mailing list archives

From AJ Chen <cano...@gmail.com>
Subject Re: severe error in fetch
Date Fri, 30 Dec 2005 22:21:15 GMT
This problem is recurring. It happens when fetching
https://www.kodak.com:0/something.  I suspect port number 0 is the cause,
because fetching https://www.kodak.com/anything works without any problem.
See the log entries:

051230 105257 fetching
https://www.kodak.com:0/eknec/PageQuerier.jhtml?pq-path=2/782/2608/2610/4074/7058&pq-locale=en_US&_loopback=1
051230 105305 SEVERE Host connection pool not found,
hostConfig=HostConfiguration[host=https://www.kodak.com]
java.lang.RuntimeException: SEVERE error logged.  Exiting fetcher.

Can certain port numbers cause connection pool problems in httpclient?
If so, I can filter out URLs containing these troublesome ports until
httpclient is fixed.
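
The workaround could be sketched roughly as below. This is only an
illustration, not the actual Nutch code: the class name PortZeroFilter is
hypothetical, and it merely follows the URLFilter convention of returning
the URL to keep it or null to drop it. It assumes port 0 is the only
troublesome port, which is my guess from the log and not confirmed.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Nutch: rejects URLs with an explicit port 0.
final class PortZeroFilter {

    // Returns the URL unchanged if it should be fetched, or null to drop it.
    static String filter(String url) {
        try {
            // getPort() returns -1 when no port is given, so only an
            // explicit ":0" in the authority is rejected here.
            int port = new URI(url).getPort();
            return port == 0 ? null : url;
        } catch (URISyntaxException e) {
            // Assumption: unparsable URLs are left for other filters to judge.
            return url;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://www.kodak.com:0/something",
            "https://www.kodak.com/anything"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + (filter(s) == null ? "dropped" : "kept"));
        }
    }
}
```

Plugging this into the fetcher would mean calling it alongside the existing
URL filters before a URL is handed to httpclient.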

Thanks,
AJ

On 12/26/05, Andrzej Bialecki <ab@getopt.org> wrote:
>
> AJ Chen wrote:
>
> >Stefan,
> >Here is the trace in my log.  My SSFetcher (for site-specific fetch) is
> >the same as the nutch Fetcher, except that the URLFilters it uses include
> >an additional filter based on domain names. Line 363 is
> >        throw new RuntimeException("SEVERE error logged.  Exiting fetcher.");
> >
> >
> >051224 075950 SEVERE Host connection pool not found,
> >hostConfig=HostConfiguration[host=https://www.kodak.com]
> >
> >
>
> This error comes from the httpclient library (you won't get a better
> stacktrace; you need to redefine the java.util.logging properties to get
> more info). I'm in the process of upgrading to the latest release, but
> the upgrade is trivial, so you can try it yourself. Hopefully this will
> solve the issue.
>
> --
> Best regards,
> Andrzej Bialecki     <><
> ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>
>
