spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: spark.executor.cores
Date Fri, 15 Jul 2016 18:13:31 GMT
Interesting

For some stuff I create an uber jar file and run it with spark-submit.
I have not attempted to start the cluster from within the application.


I tend to use a shell program (actually ksh) to compile it via Maven or
sbt and then run it accordingly. In general you can parameterise
practically every runtime parameter, say --driver-memory
${DRIVER_MEMORY}. I find that more flexible because I can submit the jar
file and the class in any environment and adjust those runtime
parameters accordingly. Some settings in fact have to go through
spark-submit: for example, since the driver-memory setting applies to
the driver JVM itself, any non-default value must be given to
spark-submit so that it takes effect before that JVM starts; setting it
from inside the application is too late.
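
To illustrate, here is a minimal sketch of such a wrapper in ksh. The
jar path, class name and default values are placeholders of mine, not
from any real project:

#!/bin/ksh
# Build the uber jar, then submit it with runtime parameters taken from
# the environment. All names below are illustrative placeholders.

DRIVER_MEMORY=${DRIVER_MEMORY:-2G}    # must be fixed before the driver JVM starts
EXECUTOR_CORES=${EXECUTOR_CORES:-2}
EXECUTOR_MEMORY=${EXECUTOR_MEMORY:-2G}
MASTER_URL=${MASTER_URL:-spark://50.140.197.217:7077}
APP_JAR=${APP_JAR:-target/myapp-assembly.jar}    # placeholder path
MAIN_CLASS=${MAIN_CLASS:-com.example.MyApp}      # placeholder class

mvn -DskipTests package || exit 1     # or: sbt assembly

${SPARK_HOME}/bin/spark-submit \
        --master ${MASTER_URL} \
        --driver-memory ${DRIVER_MEMORY} \
        --executor-cores ${EXECUTOR_CORES} \
        --executor-memory ${EXECUTOR_MEMORY} \
        --class ${MAIN_CLASS} \
        ${APP_JAR}

The same script can then be pointed at a different cluster or core
count simply by exporting different values before calling it.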

I would be keen to hear the pros and cons of the above approach. I am
sure you programmers (Scala/Java) know much more than me :)

Cheers



Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 15 July 2016 at 16:42, Jean Georges Perrin <jgp@jgp.net> wrote:

> lol - young padawan I am and path to knowledge seeking I am...
>
> And on this path I also tried (without luck)...
>
> if (restId == 0) {
>     conf = conf.setExecutorEnv("spark.executor.cores", "22");
> } else {
>     conf = conf.setExecutorEnv("spark.executor.cores", "2");
> }
>
> and
>
> if (restId == 0) {
>     conf.setExecutorEnv("spark.executor.cores", "22");
> } else {
>     conf.setExecutorEnv("spark.executor.cores", "2");
> }
>
> // note: setExecutorEnv sets environment variables on the executors,
> // not Spark configuration properties, so neither variant would change
> // spark.executor.cores
>
> the only annoying thing I see is that we designed some of the work to
> be handled by the driver/client app, so we will have to rethink the
> app's design a bit for that...
>
>
> On Jul 15, 2016, at 11:34 AM, Daniel Darabos <daniel.darabos@lynxanalytics.com> wrote:
>
> Mich's invocation is for starting a Spark application against an already
> running Spark standalone cluster. It will not start the cluster for you.
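>
> (For reference, a minimal sketch of bringing up a standalone cluster
> with the stock sbin scripts, with a placeholder hostname:
>
> ${SPARK_HOME}/sbin/start-master.sh
> ${SPARK_HOME}/sbin/start-slave.sh spark://master-host:7077
>
> spark-submit, or a remotely created SparkContext, then only attaches
> an application to that already running master.)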
>
> We used to not use "spark-submit", but we started using it when it solved
> some problem for us. Perhaps that day has also come for you? :)
>
> On Fri, Jul 15, 2016 at 5:14 PM, Jean Georges Perrin <jgp@jgp.net> wrote:
>
>> I don't use submit: I start my standalone cluster and connect to it
>> remotely. Is that a bad practice?
>>
>> I'd like to be able to do it dynamically, as the system knows whether
>> it needs more or fewer resources based on its own context
>>
>> On Jul 15, 2016, at 10:55 AM, Mich Talebzadeh <mich.talebzadeh@gmail.com>
>> wrote:
>>
>> Hi,
>>
>> You can also do all this via the environment or at submit time with
>> spark-submit, which I believe makes it more flexible than hard-coding
>> it in the application.
>>
>> Example
>>
>> ${SPARK_HOME}/bin/spark-submit \
>>                 --packages com.databricks:spark-csv_2.11:1.3.0 \
>>                 --driver-memory 2G \
>>                 --num-executors 2 \
>>                 --executor-cores 3 \
>>                 --executor-memory 2G \
>>                 --master spark://50.140.197.217:7077 \
>>                 --conf "spark.scheduler.mode=FAIR" \
>>                 --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
>>                 --jars /home/hduser/jars/spark-streaming-kafka-assembly_2.10-1.6.1.jar \
>>                 --class "${FILE_NAME}" \
>>                 --conf "spark.ui.port=${SP}" \
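>> (the uber jar containing the application goes at the end, as the
>> final argument to spark-submit)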
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>>
>> On 15 July 2016 at 13:48, Jean Georges Perrin <jgp@jgp.net> wrote:
>>
>>> Thanks Nihed, this is one of the tests I did :( still not working
>>>
>>>
>>>
>>> On Jul 15, 2016, at 8:41 AM, nihed mbarek <nihedmm@gmail.com> wrote:
>>>
>>> can you try with:
>>>
>>> SparkConf conf = new SparkConf().setAppName("NC Eatery app")
>>>     .set("spark.executor.memory", "4g")
>>>     .setMaster("spark://10.0.100.120:7077");
>>> if (restId == 0) {
>>>     conf = conf.set("spark.executor.cores", "22");
>>> } else {
>>>     conf = conf.set("spark.executor.cores", "2");
>>> }
>>> JavaSparkContext javaSparkContext = new JavaSparkContext(conf);
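>>>
>>> As a side note, and I have not tested this here: in standalone mode,
>>> spark.cores.max (e.g. conf.set("spark.cores.max", "22") before
>>> creating the context) caps the total number of cores the application
>>> requests across the cluster, which may be another way to leave room
>>> for a second job.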
>>>
>>> On Fri, Jul 15, 2016 at 2:31 PM, Jean Georges Perrin <jgp@jgp.net>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Configuration: standalone cluster, Java, Spark 1.6.2, 24 cores
>>>>
>>>> My process uses all the cores of my server (good), but I am trying to
>>>> limit it so I can actually submit a second job.
>>>>
>>>> I tried
>>>>
>>>> SparkConf conf = new SparkConf().setAppName("NC Eatery app")
>>>>     .set("spark.executor.memory", "4g")
>>>>     .setMaster("spark://10.0.100.120:7077");
>>>> if (restId == 0) {
>>>>     conf = conf.set("spark.executor.cores", "22");
>>>> } else {
>>>>     conf = conf.set("spark.executor.cores", "2");
>>>> }
>>>> JavaSparkContext javaSparkContext = new JavaSparkContext(conf);
>>>>
>>>> and
>>>>
>>>> SparkConf conf = new SparkConf().setAppName("NC Eatery app")
>>>>     .set("spark.executor.memory", "4g")
>>>>     .setMaster("spark://10.0.100.120:7077");
>>>> if (restId == 0) {
>>>>     conf.set("spark.executor.cores", "22");
>>>> } else {
>>>>     conf.set("spark.executor.cores", "2");
>>>> }
>>>> JavaSparkContext javaSparkContext = new JavaSparkContext(conf);
>>>>
>>>> but it does not seem to take it. Any hint?
>>>>
>>>> jg
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> M'BAREK Med Nihed,
>>> Fedora Ambassador, TUNISIA, Northern Africa
>>> http://www.nihed.com
>>>
>>> http://tn.linkedin.com/in/nihed
>>>
>>>
>>>
>>
>>
>
>
