spark-user mailing list archives

From Mark Hamstra <m...@clearstorydata.com>
Subject Re: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Date Fri, 03 Mar 2017 22:51:37 GMT
Aseem's screenshots clearly show that he has 1 worker with 4 cores, and
that there is an application running that has claimed those 4 cores. It is
hardly surprising, then, that another application will not receive any
resource offers when it tries to start up.
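One way to let a second application get offers on such a cluster is to cap the first application's total core usage. A hedged sketch (the master URL and script name are placeholders, not from the screenshots):

```shell
# Cap this application at 2 of the worker's 4 cores so a second
# application can still receive resource offers.
# spark://master:7077 is a placeholder standalone master URL.
./spark-submit \
  --master spark://master:7077 \
  --total-executor-cores 2 \
  your_app.py
```

The same cap can be set per application via spark.cores.max in spark-defaults.conf.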

On Fri, Mar 3, 2017 at 2:30 PM, Marco Mistroni <mmistroni@gmail.com> wrote:

> I forgot to attach the jpgs; they are half a GB in total.
> The first shows 4 cores (2 per node), none in use.
> The second shows 1 core per node in use.
> The third shows 2 cores in use and 2 available, but the second job never
> makes it to the cluster. Indeed, it only makes it to the cluster if I kill
> the other job.
>
>
>
>
> On Fri, Mar 3, 2017 at 10:27 PM, Marco Mistroni <mmistroni@gmail.com>
> wrote:
>
>> Hello
>>    I'd like to disagree with that.
>> Here's my use case (similar to Aseem's):
>>
>> 1 - Set up a Spark Standalone cluster with 2 nodes (2 cores each)
>> 2 - Check resources on the cluster (see Spark Cluster.jpg)
>>
>> 3- Run a script from node1 with the following command
>>
>>  ./spark-submit --driver-cores 1 --executor-cores 1 /root/pyscripts/dataprocessing_Sample.py file:///root/pyscripts/tree_addhealth.csv
>>
>> 4 - Check the status of the cluster when submitting 1 job (see SparkCluster 1job)
>>
>> 5 - Run exactly the same script from node2 with the following command
>>
>>      ./spark-submit --driver-cores 1 --executor-cores 1 /root/pyscripts/dataprocessing_Sample.py file:///root/pyscripts/tree_addhealth.csv
>> 6 - This job ends up getting "Initial job has not accepted any resources"
>> (but you can see from SparkCluster 1job that only 2 of the cores have been
>> used)
>>
>> 7 - Check the status of the cluster when 2 jobs are running (see Spark
>> Cluster 2job)
>>
>> The script below is the simple script I am running. It reads the CSV file
>> provided as input 6 times, sleeping for a random interval between reads; it
>> does not do any magic or tricks.
>>
>>
>>
>> Perhaps my spark-submit settings are wrong?
>> Perhaps I need to override how I instantiate the Spark context?
>>
>> I am curious to see, if you have a standalone cluster, whether you can
>> reproduce the same problem.
>> When I run it on EMR on YARN, everything works fine.
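>> One thing worth checking on standalone: --executor-cores alone does not cap
>> an application's total cores, and each application also claims executor
>> memory on every worker, so the second submit can starve on memory even when
>> cores look free. A hedged sketch of a submit that caps both (paths as above;
>> the memory value is an assumption, not from the original commands):

```shell
# Cap total cores across the cluster and shrink the per-executor memory
# footprint so two applications can coexist on the same workers.
./spark-submit \
  --total-executor-cores 1 \
  --executor-memory 512m \
  /root/pyscripts/dataprocessing_Sample.py \
  file:///root/pyscripts/tree_addhealth.csv
```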
>>
>> kr
>>  marco
>>
>>
>> from pyspark.sql import SQLContext
>> from random import randint
>> from time import sleep
>> from pyspark.sql.session import SparkSession
>> import logging
>> logger = logging.getLogger(__name__)
>> logger.setLevel(logging.INFO)
>> ch = logging.StreamHandler()
>> logger.addHandler(ch)
>>
>>
>> import sys
>> def dataprocessing(filePath, count, sqlContext):
>>     logger.info( 'Iter count is:%s' , count)
>>     if count == 0:
>>         print('exiting')
>>     else:
>>         df_traffic_tmp = sqlContext.read.format("csv").option("header", 'true').load(filePath)
>>         logger.info('#############################DataSet has:%s', df_traffic_tmp.count())
>>         df_traffic_tmp = df_traffic_tmp.repartition(5)  # repartition returns a new DataFrame; it does not modify in place
>>         sleepInterval = randint(10,100)
>>         logger.info('#############################Sleeping for %s', sleepInterval)
>>         sleep(sleepInterval)
>>         dataprocessing(filePath, count-1, sqlContext)
>>
>> if __name__ == '__main__':
>>
>>     if len(sys.argv) < 2:
>>         print('Usage: dataProcessingSample <filename>')
>>         sys.exit(0)
>>
>>     filename = sys.argv[-1]
>>     iterations = 6
>>     logger.info('----------------------')
>>     logger.info('Filename:%s', filename)
>>     logger.info('Iterations:%s', iterations )
>>     logger.info('----------------------')
>>
>>     logger.info('........Starting spark..........Loading from %s for %s iterations', filename, iterations)
>>     logger.info(  'Starting up....')
>>     spark = SparkSession.builder.appName("DataProcessSimple").getOrCreate()
>>     logger.info('Initializing sqlContext')
>>     sqlContext = SQLContext(spark.sparkContext)  # SQLContext expects a SparkContext, not a SparkSession
>>     dataprocessing(filename, iterations, sqlContext)
>>
>>
>>
>>
>> On Fri, Mar 3, 2017 at 4:03 PM, Mark Hamstra <mark@clearstorydata.com>
>> wrote:
>>
>>> Removing dev. This is a basic user question; please don't add noise to
>>> the development list.
>>>
>>> If your jobs are not accepting any resources, then it is almost
>>> certainly because no resource offers are being received. Check the status
>>> of your workers and their reachability from the driver.
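>>> The standalone master exposes its state as JSON, which makes this quick to
>>> check from the driver machine (a sketch; "master" is a placeholder host,
>>> and 8080 and 7077 are the default web UI and RPC ports):

```shell
# List registered workers, their cores and memory, and running applications.
curl -s http://master:8080/json
# Confirm the master RPC port is reachable from the driver machine.
nc -zv master 7077
```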
>>>
>>> On Fri, Mar 3, 2017 at 1:14 AM, Aseem Bansal <asmbansal2@gmail.com>
>>> wrote:
>>>
>>>> When the initial job has not accepted any resources, what can be
>>>> wrong? Going through StackOverflow and various blogs does not help. Maybe
>>>> we need better logging for this? Adding dev.
>>>>
>>>> On Thu, Mar 2, 2017 at 5:03 PM, Marco Mistroni <mmistroni@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi
>>>>>  I have found exactly the same issue... I even have a script which
>>>>> simulates a random file read.
>>>>> 2 nodes, 4 cores. I am submitting code from each node passing max cores
>>>>> 1, but one of the programs occupies 2/4 cores and the other is in waiting
>>>>> state.
>>>>> I am creating a standalone cluster for Spark 2.0. Can send sample code if
>>>>> someone can help.
>>>>> Kr
>>>>>
>>>>> On 2 Mar 2017 11:04 am, "Aseem Bansal" <asmbansal2@gmail.com> wrote:
>>>>>
>>>>> I have been trying to get a basic Spark cluster up on a single machine. I
>>>>> know it should be distributed, but I want to get something running before I
>>>>> do distributed in a higher environment.
>>>>>
>>>>> So I used sbin/start-master.sh and sbin/start-slave.sh
>>>>>
>>>>> I keep on getting *WARN TaskSchedulerImpl: Initial job has not
>>>>> accepted any resources; check your cluster UI to ensure that workers are
>>>>> registered and have sufficient resources*
>>>>>
>>>>> I read up and changed /opt/spark-2.1.0-bin-hadoop2.7/conf/spark-defaults.conf
>>>>> to contain this
>>>>>
>>>>> spark.executor.cores               2
>>>>> spark.cores.max                    8
>>>>>
>>>>> I changed /opt/spark-2.1.0-bin-hadoop2.7/conf/spark-env.sh to contain
>>>>>
>>>>> SPARK_WORKER_CORES=4
>>>>>
>>>>> My understanding is that after this Spark will use 8 cores in total,
>>>>> with the worker using 4 cores and hence being able to support 2 executors
>>>>> on that worker.
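>>>>> One thing to note: spark.cores.max is a per-application cap, not a
>>>>> cluster total, so with it set to 8 a single application can still claim
>>>>> every core the one worker offers. A hedged spark-defaults.conf sketch
>>>>> that would leave room for a second application on a 4-core worker (the
>>>>> memory value is an assumption, not from the original mail):

```
spark.executor.cores    2
spark.cores.max         2
spark.executor.memory   1g
```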
>>>>>
>>>>> But I still keep on getting the same error
>>>>>
>>>>> For my master I have
>>>>> [image: Inline image 1]
>>>>>
>>>>> For my slave I have
>>>>> [image: Inline image 2]
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
