spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andy Davidson <A...@SantaCruzIntegration.com>
Subject Re: A question about Spark Cluster vs Local Mode
Date Thu, 28 Jul 2016 02:17:53 GMT
Hi Ascot

When you run in cluster mode it means your cluster manager will cause your
driver to execute on one of the works in your cluster.

The advantage of this is you can log on to a machine in your cluster and
submit your application and then log out. The application will continue to
run.

Here is part of shell script I use to start a streaming app in cluster mode.
This app has been running for several months now

numCores=2 # must be at least 2 else steaming app will not get any data.
over all we are using 3 cores



# --executor-memory=1G # default is supposed to be 1G. If we did not set we
are seeing 6G

executorMem=1G



$SPARK_ROOT/bin/spark-submit \

--class "com.pws.sparkStreaming.collector.StreamingKafkaCollector" \

--master $MASTER_URL \

--deploy-mode cluster \

--total-executor-cores $numCores \

--executor-memory $executorMem \

$jarPath --clusterMode $*



From:  Ascot Moss <ascot.moss@gmail.com>
Date:  Wednesday, July 27, 2016 at 6:48 PM
To:  "user @spark" <user@spark.apache.org>
Subject:  A question about Spark Cluster vs Local Mode

> Hi,
> 
> If I submit the same job to spark in cluster mode, does it mean in cluster
> mode it will be run in cluster memory pool and it will fail if it runs out of
> cluster's memory?
> 
> 
> --driver-memory 64g \
> 
> --executor-memory 16g \
> 
> Regards



Mime
View raw message