spark-user mailing list archives

From Jeff Sadowski <jeff.sadow...@gmail.com>
Subject Fwd: multiple pyspark instances simultaneously (same time)
Date Thu, 22 Oct 2015 16:59:28 GMT
On Thu, Oct 22, 2015 at 5:40 AM, Akhil Das <akhil@sigmoidanalytics.com>
wrote:

> Did you read
> https://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application
>

I did.

I had set the option

spark.scheduler.mode FAIR

in conf/spark-defaults.conf and created fairscheduler.xml with the two
pools, production and test.
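For reference, a fairscheduler.xml with those two pools can be sketched roughly as below. The scheduling modes, weights, and minShare values are assumptions in the style of the template that ships with Spark, not the exact contents of the file described here, and build_allocations is a made-up helper name.

```python
import xml.etree.ElementTree as ET

def build_allocations(pools):
    """Build the <allocations> XML that Spark reads as fairscheduler.xml.

    `pools` maps a pool name to (schedulingMode, weight, minShare).
    """
    root = ET.Element("allocations")
    for name, (mode, weight, min_share) in pools.items():
        pool = ET.SubElement(root, "pool", name=name)
        ET.SubElement(pool, "schedulingMode").text = mode
        ET.SubElement(pool, "weight").text = str(weight)
        ET.SubElement(pool, "minShare").text = str(min_share)
    return ET.tostring(root, encoding="unicode")

# Assumed values, in the style of conf/fairscheduler.xml.template.
xml_text = build_allocations({
    "production": ("FAIR", 1, 2),
    "test": ("FIFO", 2, 3),
})
print(xml_text)
```

The printed text could be saved as conf/fairscheduler.xml; the pool names are the only part taken from this message.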

When I start pyspark, I noticed that running

sc.setLocalProperty("spark.scheduler.pool", null)

does not work; it gives me

NameError: name 'null' is not defined
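The NameError is plain Python: the example in the docs is written in Scala, and Python has no `null`; the equivalent value is `None`. A minimal sketch, assuming PySpark passes `None` through to the JVM as null (the usual Py4J conversion). clear_pool is a hypothetical helper name, and `sc` is the SparkContext that the pyspark shell provides.

```python
def clear_pool(sc):
    """Detach the current thread from any named scheduler pool."""
    # `null` is not a Python name; None is the equivalent value.
    sc.setLocalProperty("spark.scheduler.pool", None)
```

In the pyspark shell this would be called as clear_pool(sc).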

I tried joining the production pool so that I would have FAIR scheduling:

sc.setLocalProperty("spark.scheduler.pool", "production")

and

sc.getLocalProperty("spark.scheduler.pool")

shows

u'production'

I also noticed that I could join pools that were never created, and it
shows me as being in that uncreated pool, as if

sc.setLocalProperty("spark.scheduler.pool", "production")

wasn't really doing anything.
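Since setLocalProperty appears to accept any string, one way to catch a mistyped or undeclared pool name is to compare it against the pools declared in fairscheduler.xml before setting it. A rough sketch; declared_pools is a hypothetical helper, "nosuchpool" is an invented name, and the inline XML is a stand-in for the real file.

```python
import xml.etree.ElementTree as ET

def declared_pools(xml_text):
    """Return the set of pool names declared in a fairscheduler.xml document."""
    root = ET.fromstring(xml_text.strip())
    return {pool.get("name") for pool in root.findall("pool")}

# Stand-in for the contents of conf/fairscheduler.xml (assumed layout).
SAMPLE = """
<allocations>
  <pool name="production"><schedulingMode>FAIR</schedulingMode></pool>
  <pool name="test"><schedulingMode>FIFO</schedulingMode></pool>
</allocations>
"""

pools = declared_pools(SAMPLE)
for name in ("production", "nosuchpool"):
    status = "declared" if name in pools else "NOT declared"
    print(name, status)
```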

As I was saying, it still behaves as if it isn't doing FAIR scheduling
when I am in the production pool.

When I start pyspark as a second user and run

sc.setLocalProperty("spark.scheduler.pool", "production")

the master's status page still says waiting, and if I try to do
something as that second user I still get

Initial job has not accepted any resources



> Thanks
> Best Regards
>
> On Thu, Oct 15, 2015 at 11:31 PM, jeff.sadowski@gmail.com <
> jeff.sadowski@gmail.com> wrote:
>
>> I am having issues trying to set up Spark to run jobs simultaneously.
>>
>> I thought I wanted FAIR scheduling?
>>
>> I used the templated fairscheduler.xml as is. When I start pyspark I see
>> the 3 expected pools:
>> production, test, and default
>>
>> When I log in as a second user and run pyspark,
>> I see the expected pools as that user as well.
>>
>> When I open a web browser to http://master:8080
>>
>> I see my first user's state is running and my second user's state is
>> waiting.
>>
>> So I tried putting them both in the production pool, which uses FAIR
>> scheduling.
>>
>> When I refresh http://master:8080
>>
>> the second user's status is still waiting.
>>
>> If I try to run something as the second user I get
>>
>> "Initial job has not accepted any resources"
>>
>> Maybe fair queuing is not what I want?
>>
>> I'm starting pyspark as follows
>>
>> pyspark --master spark://master:7077
>>
>> I started spark as follows
>>
>> start-all.sh
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/multiple-pyspark-instances-simultaneously-same-time-tp25079.html
>
