nutch-dev mailing list archives

From Stefan Groschupf ...@media-style.com>
Subject Re: dns lookup cache?
Date Wed, 03 Aug 2005 15:05:34 GMT
How do you do 'internal' domain caching?
Thanks.
Stefan
On 03.08.2005 at 16:51, Jay Pound wrote:

> I've got a fast internal DNS cache, so Nutch won't need one, and it did
> stop a lot of the Nutch "host not found" and timeout errors. Most ISPs'
> DNS servers are already bogged down by client requests; if you dump
> 10000 clients' worth of DNS traffic on them, they can break or fail to
> return results. So I set up my own internal DNS caching server. The
> machine, a quad Xeon with 4 GB of RAM, uses over 500 MB of RAM just for
> caching domains in memory!
> -Jay
>
> ----- Original Message -----
> From: "Stefan Groschupf" <sg@media-style.com>
> To: <nutch-dev@lucene.apache.org>
> Sent: Wednesday, August 03, 2005 4:19 AM
> Subject: dns lookup cache?
>
>
>
>> Hi there,
>> does Nutch cache DNS lookups in any way?
>> I found this paper and section 3.7 gives some very interesting
>> information.
>> We have noticed that our crawlers often crash after a series of
>> unknown host exceptions.
>> We already have one dual-CPU box with a 1 Gbit network connection
>> running BIND.
>>
>> So I have 2 questions:
>> Do people think Java's domain lookup may be a bottleneck that
>> crashes the crawlers?
>> Other crawlers have some kind of DNS cache. Would it make sense to
>> introduce one to Nutch as well?
>>
>> Thanks for any comments.
>> Stefan
>>
>>
>>
>
>
>
>
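
For illustration only, here is a minimal sketch of the kind of in-process
DNS cache the original question asks about. It is not Nutch code and not
what Jay describes (his cache is a separate DNS server); the class name
DnsCache, the Resolver hook, and both TTL parameters are hypothetical.
It caches successful lookups and, for a shorter or equal period, failed
ones, so that a burst of requests for the same dead host does not hit the
resolver repeatedly.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-process DNS cache with positive and negative TTLs. */
public class DnsCache {

    /** Hypothetical hook so the real lookup can be replaced, e.g. in tests. */
    public interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    private static final class Entry {
        final InetAddress address;   // null marks a cached lookup failure
        final long expiresAtMillis;  // wall-clock time when the entry goes stale

        Entry(InetAddress address, long expiresAtMillis) {
            this.address = address;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final Resolver resolver;
    private final long ttlMillis;
    private final long negativeTtlMillis;

    public DnsCache(Resolver resolver, long ttlMillis, long negativeTtlMillis) {
        this.resolver = resolver;
        this.ttlMillis = ttlMillis;
        this.negativeTtlMillis = negativeTtlMillis;
    }

    /** Uses the JVM's standard resolver, InetAddress.getByName. */
    public DnsCache(long ttlMillis, long negativeTtlMillis) {
        this(InetAddress::getByName, ttlMillis, negativeTtlMillis);
    }

    /** Returns a cached address, resolving only when the entry is missing or stale. */
    public InetAddress lookup(String host) throws UnknownHostException {
        String key = host.toLowerCase(Locale.ROOT); // host names are case-insensitive
        long now = System.currentTimeMillis();
        Entry entry = cache.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            entry = resolveAndStore(key, now);
        }
        if (entry.address == null) {
            throw new UnknownHostException(host);
        }
        return entry.address;
    }

    // Two threads may resolve the same host concurrently; the last write wins,
    // which is acceptable for a sketch.
    private Entry resolveAndStore(String key, long now) {
        Entry fresh;
        try {
            fresh = new Entry(resolver.resolve(key), now + ttlMillis);
        } catch (UnknownHostException e) {
            fresh = new Entry(null, now + negativeTtlMillis);
        }
        cache.put(key, fresh);
        return fresh;
    }

    public int size() {
        return cache.size();
    }
}
```

A fetcher could share one instance, for example
new DnsCache(10 * 60 * 1000L, 60 * 1000L), and call lookup(host) before
opening a connection. Note that the JVM also keeps its own InetAddress
cache, governed by the networkaddress.cache.ttl security property, so a
sketch like this mainly adds explicit control over failed lookups.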

