spark-user mailing list archives

From Craig Vanderborgh <craigvanderbo...@gmail.com>
Subject Re: Suggested Filesystem Layout for Spark Cluster Node
Date Tue, 15 Oct 2013 20:10:45 GMT
Finally:  how big do the "multiple disks configured as separate
filesystems" that are used for temporary Spark storage need to be?

Thanks,
Craig


On Tue, Oct 15, 2013 at 1:12 PM, Craig Vanderborgh <
craigvanderborgh@gmail.com> wrote:

> In particular: if I set the SPARK_WORKER_INSTANCES environment variable
> in spark-env.sh, will this propagate through Mesos and result in (say) two
> workers per cluster node?
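>
> To illustrate what I mean, here is a hypothetical spark-env.sh snippet
> (the instance and core counts are just examples):
>
>     # conf/spark-env.sh on each node
>     export SPARK_WORKER_INSTANCES=2   # two standalone workers per node
>     export SPARK_WORKER_CORES=4       # cores available to each worker
>
> Or are these standalone-mode settings ignored entirely under Mesos?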
>
> Thanks,
> Craig
>
>
> On Tue, Oct 15, 2013 at 1:07 PM, Craig Vanderborgh <
> craigvanderborgh@gmail.com> wrote:
>
>> Hi Matei,
>>
>> This is helpful, but it would be even more so if the documentation also
>> described how to make these settings correctly in a Spark-on-Mesos
>> environment.  Can you describe the differences for Mesos?
>>
>> Thanks again,
>> Craig
>>
>>
>> On Mon, Oct 14, 2013 at 6:15 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
>>
>>> Hi Craig,
>>>
>>> The best configuration is to have multiple disks configured as separate
>>> filesystems (so no RAID), and set the spark.local.dir property, which
>>> configures Spark's scratch space directories, to be a comma-separated list
>>> of directories, one per disk. In 0.8 we've written a bit on how to
>>> configure machines for Spark here:
>>> http://spark.incubator.apache.org/docs/latest/hardware-provisioning.html.
>>> For the filesystem I'd suggest ext3 with noatime set.
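>>>
>>> For example, assuming two data disks mounted at /mnt/disk1 and /mnt/disk2
>>> (placeholder paths), and noting that in 0.8 spark.local.dir is a Java
>>> system property, so one way to set it is through SPARK_JAVA_OPTS in
>>> conf/spark-env.sh:
>>>
>>>     # /etc/fstab: each disk as its own ext3 filesystem, mounted noatime
>>>     /dev/sdb1  /mnt/disk1  ext3  defaults,noatime  0  2
>>>     /dev/sdc1  /mnt/disk2  ext3  defaults,noatime  0  2
>>>
>>>     # conf/spark-env.sh: one scratch directory per disk, comma-separated
>>>     export SPARK_JAVA_OPTS="-Dspark.local.dir=/mnt/disk1/spark,/mnt/disk2/spark"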
>>>
>>> Matei
>>>
>>> On Oct 14, 2013, at 11:28 AM, Craig Vanderborgh <
>>> craigvanderborgh@gmail.com> wrote:
>>>
>>> > Hi All,
>>> >
>>> > We're setting up a new Spark-on-Mesos cluster.  I'd like anyone who has
>>> > already done this to suggest a disk partitioning/filesystem layout that
>>> > has worked well for them in their cluster deployment.
>>> >
>>> > We are running MapR M3 on the cluster, but only for maprfs.  Our jobs
>>> > will be programmed for and run on Spark.
>>> >
>>> > Thanks in advance,
>>> > Craig Vanderborgh
>>>
>>>
>>
>
