nutch-dev mailing list archives

From Christophe Noel <christophe.n...@cetic.be>
Subject Re: Starting a non-profit organisation running Nutch with a thousand or more sponsored servers
Date Thu, 17 Mar 2005 09:05:25 GMT
Stefan Groschupf,

Thanks for this clear summary of the bandwidth requirements. It would be
great if you could also estimate the server requirements for this kind of
crawl.

To make full use of a 100 MBit link, how much RAM and how many threads
should we have? What kind of server would be most efficient?

Christophe.

> Stefan Groschupf wrote:
>
>> Let's do some calculations:
>> 2 billion pages: (google has 8 billion)
>> 100 kilobytes * 2 000 000 000 = 186.264515 terabytes per Month
>> 1 * 100MBit per Month =   33.1776 TB
>> 186 / 33 = 5.6
>> The cheapest offer for 100 MBit I found was 1000 USD per month.
>> So you pay 6000 USD per month just crawling without any user query.
>> If you _only_ have 1 million queries per day you have another 3 TB 
>> traffic.
>> Math.round(idea) = 20,000 USD per month, assuming all servers are in
>> the same location.
>
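
The crawl-cost arithmetic quoted above can be reproduced with a short
sketch. This is only an illustration under stated assumptions: the function
and constant names are hypothetical, the 33.1776 TB monthly capacity of a
100 MBit link is taken as given from the quoted message rather than
re-derived, and the link count is rounded up to whole links, which is how
5.6 becomes 6000 USD.

```python
import math

# Figures taken from the quoted message; all names here are illustrative.
PAGES = 2_000_000_000          # pages fetched per month
PAGE_SIZE_KB = 100             # assumed average page size in kilobytes
KB_PER_TB = 1024 ** 3          # binary units, matching the 186.26 TB figure
LINK_TB_PER_MONTH = 33.1776    # monthly capacity of one 100 MBit link, as quoted
LINK_COST_USD = 1000           # cheapest quoted price per link per month


def crawl_volume_tb(pages=PAGES, page_kb=PAGE_SIZE_KB):
    """Monthly crawl traffic in terabytes."""
    return pages * page_kb / KB_PER_TB


def links_needed(volume_tb, link_tb=LINK_TB_PER_MONTH):
    """Whole 100 MBit links required to carry the given monthly volume."""
    return math.ceil(volume_tb / link_tb)


def monthly_crawl_cost_usd(volume_tb, link_cost=LINK_COST_USD):
    """Bandwidth cost of crawling alone, excluding query traffic."""
    return links_needed(volume_tb) * link_cost


if __name__ == "__main__":
    volume = crawl_volume_tb()
    print(f"crawl volume: {volume:.2f} TB per month")
    print(f"links needed: {links_needed(volume)}")
    print(f"crawl cost:   {monthly_crawl_cost_usd(volume)} USD per month")
```

Changing PAGE_SIZE_KB or PAGES shows how sensitive the estimate is to the
assumed average page size and crawl size.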
