spark-user mailing list archives

From: Alex Kozlov
Subject: Re: Spark on RAID
Date: Tue, 08 Mar 2016 16:45:27 GMT

Parallel disk IO?  But the effect should be less noticeable than with
Hadoop, which reads and writes a lot.  Much depends on how often Spark
persists to disk.  It also depends on the specifics of the RAID controller.

If you write to HDFS as opposed to the local file system, this may be a big
factor as well.
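
As a rough illustration of the non-RAID layout: Spark's scratch space is
set with spark.local.dir, which accepts a comma-separated list of
directories on different disks. The sketch below only builds that value;
the mount points and the "spark" subdirectory are hypothetical names chosen
for the example, not anything from this thread.

```python
# Sketch: build a spark.local.dir value from separately mounted disks.
# The mount points and the "spark" subdirectory are hypothetical.

def spark_local_dirs(mount_points, subdir="spark"):
    """Return a comma-separated list of scratch directories, one per disk."""
    dirs = []
    for mount in mount_points:
        # Drop any trailing slash so the paths join cleanly.
        dirs.append(mount.rstrip("/") + "/" + subdir)
    return ",".join(dirs)


mounts = ["/mnt/disk1", "/mnt/disk2/", "/mnt/disk3", "/mnt/disk4"]
local_dirs = spark_local_dirs(mounts)
print("spark.local.dir=" + local_dirs)
```

The resulting value can be placed in spark-defaults.conf or passed with
--conf. Note that the cluster manager may override it, for example through
SPARK_LOCAL_DIRS in standalone mode or LOCAL_DIRS on YARN.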

On Tue, Mar 8, 2016 at 8:34 AM, Eddie Esquivel wrote:

> Hello All,
> In the Spark documentation under "Hardware Requirements" it very clearly
> states:
> We recommend having *4-8 disks* per node, configured *without* RAID (just
> as separate mount points)
> My question is why not RAID? What is the argument/reason for not using
> RAID?
> Thanks!
> -Eddie

Alex Kozlov
