nutch-dev mailing list archives

From Dennis Kubes <ku...@apache.org>
Subject Re: Server suggestion
Date Fri, 24 Jul 2009 13:46:12 GMT


fredericoagent wrote:
> If I want to setup nutch with lets say 400 million urls in the database.
> 
> Is it better to have a 4-5 super fast and loaded servers or have 12-15
> smaller , cheaper servers.

Go with more, smaller servers.  Make sure they are energy efficient, though, 
and have a decent amount of RAM.  With more machines, you aren't affected 
as much if a server goes down.
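
A rough sketch of that point, assuming all servers are identical and the load is spread evenly across them (the function name is made up for illustration). It only shows what share of the cluster disappears when one machine fails, for the server counts mentioned in the question:

```python
def capacity_lost_fraction(num_servers: int, failed: int = 1) -> float:
    """Share of total capacity lost when `failed` of `num_servers` equal nodes go down."""
    if num_servers <= 0 or not 0 <= failed <= num_servers:
        raise ValueError("need num_servers > 0 and 0 <= failed <= num_servers")
    return failed / num_servers


# Server counts taken from the question: 4-5 large machines vs. 12-15 small ones.
for n in (4, 5, 12, 15):
    print(f"{n:2d} servers: one failure removes {capacity_lost_fraction(n):.1%} of capacity")
```

With 4 servers a single failure takes out a quarter of the cluster; with 15 it is under 7%.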

> 
> By superfast I mean cpu is latest quad core or latest six core processor
> with 6 Gigs Ram and 1. or 1.5 TB HD.
> 
> By cheap I mean something like a Xeon quad core 2.26 cpu with 3 Gig Ram and
> 500 Sata HD.
> 
> 
> or if anyone can suggest a better spec ideal 

Our first servers were 1 GHz (yes, really) running Hadoop 0.04 way back 
when.  Our first production clusters were Core 2, 4G ECC, one 750G hard 
drive.  These days we have been building i7 8-core, 12G ECC, 4T RAID-5 
machines with up to 8 disks, 2U, for around 2200.00 each.  If you are 
looking for a good server builder, check out swt.com.  They are Supermicro 
resellers and build solid machines.

Suggestions.  Don't skimp on the hard drive; get at least 750G. 
The price difference is negligible.  Get at least 2G of RAM; 4G is better, 
and 8G is better than that.  You can get up to 12G on regular motherboards 
these days.  After that it gets much more expensive.  Go with more recent 
processors, such as Core 2 or i7.  They are more power efficient per 
processing unit.  If you want a really fast machine, use multiple disks 
in a RAID-5 array.
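
For sizing a RAID-5 array, a small sketch of the usual capacity arithmetic: one disk's worth of space goes to parity, so usable space is (number of disks - 1) times the disk size, and the array survives a single disk failure. The function name and the disk sizes in the example are illustrative assumptions, not a description of any particular machine:

```python
def raid5_usable_gb(num_disks: int, disk_gb: float) -> float:
    """Usable capacity of a RAID-5 array of equal-sized disks, in GB."""
    if num_disks < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    if disk_gb <= 0:
        raise ValueError("disk size must be positive")
    # One disk's worth of space holds parity, spread across all members.
    return (num_disks - 1) * disk_gb


# Hypothetical example: four 750 GB disks.
print(raid5_usable_gb(4, 750))  # 2250
```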

Dennis
