spark-user mailing list archives

From Tathagata Das <tathagata.das1...@gmail.com>
Subject Re: spark.scheduler.pool seems not working in spark streaming
Date Fri, 01 Aug 2014 00:37:29 GMT
Whoa! That worked! I was half afraid it wouldn't, since I hadn't tried it myself.

TD

On Wed, Jul 30, 2014 at 8:32 PM, liuwei <stupidlw@126.com> wrote:
> Hi, Tathagata Das:
>
>       I followed your advice and solved this problem, thank you very much!
>
>
> On July 31, 2014, at 3:07 AM, Tathagata Das <tathagata.das1565@gmail.com> wrote:
>
>> This is because setLocalProperty assigns only the Spark jobs submitted
>> from the current thread to the given pool. In Spark Streaming, however,
>> all jobs are launched in the background from a different thread, so
>> this setting has no effect. There is a workaround, though. If you are
>> doing any kind of output operation on DStreams, such as
>> DStream.foreachRDD(), you can set the property inside it:
>>
>> dstream.foreachRDD(rdd =>
>>   rdd.sparkContext.setLocalProperty(...)
>> )
>>
>>
>>
>> On Wed, Jul 30, 2014 at 1:43 AM, liuwei <stupidlw@126.com> wrote:
>>> In my Spark Streaming program, I set the scheduler pool as follows:
>>>
>>> val myFairSchedulerFile = "xxx.xml"
>>> val myStreamingPool = "xxx"
>>>
>>> System.setProperty("spark.scheduler.allocation.file", myFairSchedulerFile)
>>> val conf = new SparkConf()
>>> val ssc = new StreamingContext(conf, batchInterval)
>>> ssc.sparkContext.setLocalProperty("spark.scheduler.pool", myStreamingPool)
>>> ...
>>> ssc.start()
>>> ssc.awaitTermination()
>>>
>>> I submitted my Spark Streaming job to my Spark cluster and found that the
>>> stage's pool name is "default". It seems that
>>> ssc.sparkContext.setLocalProperty("spark.scheduler.pool", myStreamingPool)
>>> does not work.
>
>
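
For readers of the archive, the workaround quoted above can be sketched as a
complete program. This is only an illustration under stated assumptions, not
the code either poster ran: the object name, application name, socket source,
host, port, and 10-second batch interval are all hypothetical, and "xxx.xml"
and "xxx" are the placeholders from the original question. It also assumes
the fair scheduler is enabled through spark.scheduler.mode and that the
master URL is supplied by spark-submit.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical object name, used only for this sketch.
object StreamingPoolSketch {
  def main(args: Array[String]): Unit = {
    val myFairSchedulerFile = "xxx.xml" // placeholder from the thread
    val myStreamingPool = "xxx"         // placeholder from the thread

    val conf = new SparkConf()
      .setAppName("streaming-pool-sketch") // hypothetical name
      .set("spark.scheduler.mode", "FAIR") // assumed: pools need the fair scheduler
      .set("spark.scheduler.allocation.file", myFairSchedulerFile)

    // Batch interval of 10 seconds is an assumption.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Hypothetical input source; any DStream would do.
    val lines = ssc.socketTextStream("localhost", 9999)

    lines.foreachRDD { rdd =>
      // Set the pool inside the output operation, so it is set on the
      // thread that actually submits the jobs for this batch.
      rdd.sparkContext.setLocalProperty("spark.scheduler.pool", myStreamingPool)
      // Any action placed after the call above is submitted from that thread.
      println(s"records in batch: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The property is set again on every batch, which is redundant but harmless.
Whether the stages land in the intended pool can be checked the same way the
original poster did, by looking at the pool name shown for each stage.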
