nutch-dev mailing list archives

From "Howie Wang" <howie_w...@hotmail.com>
Subject RE: fetcher error
Date Mon, 20 Jun 2005 14:51:04 GMT
If you're crawling only a single site or a few sites at a time,
the fetcher threads may be blocking each other. By default the
fetcher starts 10 threads, and if they all hit the same server,
one thread fetches while the other nine wait, up to a timeout,
for that first thread to finish.

To see if this is your problem, try setting the number of
threads to 1 and see if it helps.
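
As a rough sketch of that test, the overrides below could go inside
the root element of your site override file (conf/nutch-site.xml is
assumed here; the thread quotes nutch-default.xml). The property name
fetcher.threads.fetch is the usual thread-count setting, while
http.timeout and http.max.delays are the names that appear in this
thread. All values are illustrative assumptions, not recommended
settings, and the timeout is assumed to be in milliseconds.

```xml
<!-- Illustrative overrides only; all values are assumptions to adjust. -->

<!-- Use a single fetcher thread so threads cannot block each other
     when every URL is on the same host. -->
<property>
  <name>fetcher.threads.fetch</name>
  <value>1</value>
</property>

<!-- Give slow servers more time to respond (assumed milliseconds). -->
<property>
  <name>http.timeout</name>
  <value>30000</value>
</property>

<!-- Let a thread wait on a busy host more times before it fails with
     "Exceeded http.max.delays: retry later". -->
<property>
  <name>http.max.delays</name>
  <value>10</value>
</property>
```

If a single thread makes the errors go away, thread contention on one
host was the likely cause, and the thread count can then be raised
again gradually.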

>Thanks for help. Fetcher get stuck on some pages when
>i am doing intranet crawl and i tested on many
>websites.
>
>I tried the setting you suggested before but most of
>the time fetchers dies and i am unable to fetch
>websites for my intranet crawl.It fetches few pages
>from website then throw the error.It is not only
>problem with one website but it is happening for many
>sites i tested.
>
>Thanks.
>
>Kashif
>
>--- Howie Wang <howie_wang@hotmail.com> wrote:
>
> > That just means the site is not responding. You can try to
> > give it more time by setting http.timeout to something
> > larger in your nutch-default.xml.  You can also try
> > increasing the number of retries in the same file.
> >
> > >I am doing intranet crawl but keep getting this error
> > >and after few of same errors my fetcher dies and fetch
> > >no more
> > >
> > >Error is :
> > >
> > >org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:135)
> > >050619 144703 fetch of http://espn.go.com failed with:
> > >java.lang.Exception:
> > >org.apache.nutch.protocol.RetryLater: Exceeded
> > >http.max.delays: retry later.
> > >
> > >The main issue i think is "Exceeded http.max.delays:
> > >retry later"
> > >
> > >Thanks
> > >
> > >Kashif.


