nutch-dev mailing list archives

From "Jay Pound" <webmas...@poundwebhosting.com>
Subject Re: dns lookup cache?
Date Wed, 03 Aug 2005 14:51:06 GMT
I've got a fast internal DNS cache, so Nutch won't need one of its own, and it
did stop a lot of Nutch's host-not-found and timeout errors. Most ISPs' DNS
servers are already bogged down by client requests; if you dump 10,000 clients'
worth of DNS traffic on them, they can break or fail to return results. So I
set up my own internal caching DNS server. The machine, a quad Xeon with 4 GB
of RAM, uses over 500 MB of RAM just for caching the domains in memory!
-Jay
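
For anyone who would rather cache inside the crawler process than run a
separate caching DNS server, here is a minimal sketch of the idea in Java. It
is not Nutch code: the class name DnsCache, the Resolver interface, and the
single TTL applied to both successful and failed lookups are assumptions made
for illustration only.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-process DNS cache with a fixed TTL; not part of Nutch. */
final class DnsCache {

    /** Lookup strategy, so the real resolver can be swapped for a stub. */
    public interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /** One cached answer; a null address marks a cached failure. */
    private static final class Entry {
        final InetAddress address;
        final long expiresAtMillis;

        Entry(InetAddress address, long expiresAtMillis) {
            this.address = address;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final Resolver resolver;
    private final long ttlMillis;

    DnsCache(Resolver resolver, long ttlMillis) {
        this.resolver = resolver;
        this.ttlMillis = ttlMillis;
    }

    /** Uses the JVM's own resolver, InetAddress.getByName. */
    DnsCache(long ttlMillis) {
        this(InetAddress::getByName, ttlMillis);
    }

    InetAddress lookup(String host) throws UnknownHostException {
        // Host names are case-insensitive, so normalise the cache key.
        String key = host.toLowerCase(Locale.ROOT);
        long now = System.currentTimeMillis();
        Entry entry = cache.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            InetAddress address;
            try {
                address = resolver.resolve(key);
            } catch (UnknownHostException e) {
                address = null; // remember the failure until the TTL expires
            }
            entry = new Entry(address, now + ttlMillis);
            cache.put(key, entry);
        }
        if (entry.address == null) {
            throw new UnknownHostException(host);
        }
        return entry.address;
    }

    /** Number of host names currently held, including cached failures. */
    int size() {
        return cache.size();
    }
}
```

Something like new DnsCache(60000L).lookup(host) would then replace direct
InetAddress.getByName calls. Two threads that miss on the same host at the same
moment may both query the resolver; the sketch accepts that for simplicity. The
JVM also keeps its own lookup cache, controlled by the networkaddress.cache.ttl
and networkaddress.cache.negative.ttl security properties, so tuning those may
be enough before adding anything like this.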

----- Original Message ----- 
From: "Stefan Groschupf" <sg@media-style.com>
To: <nutch-dev@lucene.apache.org>
Sent: Wednesday, August 03, 2005 4:19 AM
Subject: dns lookup cache?


> Hi there,
> Does Nutch cache DNS lookups in any way?
> I found this paper and section 3.7 gives some very interesting
> information.
> We notice that our crawlers often crash after a set of unknown host
> exceptions.
> We already have one dual-CPU box with a 1 Gbit network connection
> running BIND.
>
> So I have 2 questions:
> Do people think Java's domain lookup may be a bottleneck that
> crashes the crawlers?
> Other crawlers have some kind of DNS cache; would it make sense to
> introduce one to Nutch as well?
>
> Thanks for any comments.
> Stefan
>
>


