spark-user mailing list archives

From Matei Zaharia <>
Subject Re: Optimal Server Design for Spark
Date Wed, 02 Apr 2014 22:58:25 GMT
Hey Steve,

This configuration sounds pretty good. The one thing I would consider is having more disks,
for two reasons: Spark uses the disks for large shuffles and out-of-core operations, and
it’s often better to run HDFS or your storage system on the same nodes. Whether the extra
disks are worthwhile depends on whether your deployment will actually do those things, so
determine that first and go from there.
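
As a rough illustration of the shuffle point, here is a minimal sketch of pointing Spark's
scratch space at the data drives in conf/spark-env.sh. The mount points /data1 through /data4
are assumptions made up for this example (one per data drive in the proposed layout), not
paths from this thread. In Spark 1.0 and later, SPARK_LOCAL_DIRS overrides the spark.local.dir
property in standalone mode; on older releases the property itself would be set instead.

```shell
# conf/spark-env.sh (sketch; the mount points below are hypothetical)
# Comma-separated list of directories Spark uses for shuffle files and
# data spilled to disk. One directory per physical drive spreads the I/O.
export SPARK_LOCAL_DIRS="/data1/spark,/data2/spark,/data3/spark,/data4/spark"
```

With more data drives, the list simply grows by one entry per drive.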

The number of cores and the amount of RAM are both good. With a lot more of either you would
probably want to run multiple Spark workers per node, which is more work to configure. Your
numbers are in line with other deployments.
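
If a node ever did grow large enough to warrant several workers, standalone mode exposes that
through conf/spark-env.sh. The sketch below is only meant to show the shape of the settings:
the numbers are illustrative assumptions (the 12 cores of the proposed box split in two, with
some memory left for the OS and any colocated storage daemons), not a recommendation for this
hardware.

```shell
# conf/spark-env.sh (sketch; the values are illustrative assumptions)
export SPARK_WORKER_INSTANCES=2   # worker processes to start on each node
export SPARK_WORKER_CORES=6       # cores each worker may give to executors
export SPARK_WORKER_MEMORY=56g    # memory each worker may give to executors
```

When running more than one worker per node, the per-worker core count should be set
explicitly, since otherwise each worker tries to use all of the machine's cores.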

There’s a provisioning overview with more details at
but what you have sounds fine.


On Apr 2, 2014, at 2:58 PM, Stephen Watt <> wrote:

> Hi Folks
> I'm looking to buy some gear to run Spark. I'm quite well versed in Hadoop server design
> but there does not seem to be much Spark-related collateral around infrastructure guidelines
> (or at least I haven't been able to find them). My current thinking for server design is something
> along these lines.
> - 2 x 10GbE NICs
> - 128 GB RAM
> - 6 x 1 TB Small Form Factor Disks (2 x RAID 1 Mirror for O/S and Runtimes, 4 x 1 TB for
> Data Drives)
> - 1 Disk Controller
> - 2 x 2.6 GHz 6 core processors
> If I stick with 1U servers then I lose disk capacity per rack but I get a lot more memory
> and CPU capacity per rack. This increases my total cluster memory footprint and it doesn't
> seem to make sense to have super dense storage servers because I can't fit all that data on
> disk in memory anyway. So at present, my thinking is to go with 1U servers instead of 2U
> servers. Is 128 GB RAM per server normal? Do you guys use more or less than that?
> Any feedback would be appreciated
> Regards
> Steve Watt
