spark-user mailing list archives

From Marco Mistroni <mmistr...@gmail.com>
Subject Re: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Date Fri, 03 Mar 2017 22:27:11 GMT
Hello,
I'd like to disagree with that.
Here's my use case (similar to Aseem's):

1 - Set up a Spark standalone cluster with 2 nodes (2 cores each)
2 - Check resources on the cluster (see Spark Cluster.jpg)

3 - Run a script from node1 with the following command (see also the spark-submit sketch after this list):

    ./spark-submit --driver-cores 1 --executor-cores 1 \
        /root/pyscripts/dataprocessing_Sample.py \
        file:///root/pyscripts/tree_addhealth.csv

4 - Check the status of the cluster when 1 job is running (see SparkCluster 1job)

5 - Run exactly the same script from node2 with the following command:

    ./spark-submit --driver-cores 1 --executor-cores 1 \
        /root/pyscripts/dataprocessing_Sample.py \
        file:///root/pyscripts/tree_addhealth.csv
6 - This job ends up getting "Initial job has not accepted any resources"
(but you can see from SparkCluster 1 job that only 2 of the cores have been
used).

7 - Check the status of the cluster when 2 jobs are running (see Spark Cluster 2 job)
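
As mentioned in step 3, would something like the spark-submit sketch below be
the right fix? My understanding is that in standalone mode an application
grabs every available core unless spark.cores.max (or --total-executor-cores)
caps it, so each submission would need an explicit limit. The master URL
spark://node1:7077 is only a placeholder for my real master:

    ./spark-submit \
        --master spark://node1:7077 \
        --driver-cores 1 \
        --executor-cores 1 \
        --total-executor-cores 1 \
        /root/pyscripts/dataprocessing_Sample.py \
        file:///root/pyscripts/tree_addhealth.csv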

The script below is the simple script I am running. It reads the csv file
provided as input 6 times, sleeping for a random interval between reads, and
it does not do any magic or tricks.



Perhaps my spark-submit settings are wrong?
Or perhaps I need to override how I instantiate the Spark context?
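
For example, would capping the cores when building the session work just as
well as doing it on the command line? A minimal sketch of what I mean (the
value 1 is only for illustration):

    from pyspark.sql.session import SparkSession

    # Limit the total cores this application may take from the standalone
    # cluster, so a second submission still finds free cores.
    spark = SparkSession.builder \
        .appName("DataProcessSimple") \
        .config("spark.cores.max", "1") \
        .getOrCreate()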

I am curious to see, if you have a standalone cluster, whether you can
reproduce the same problem.
When I run it on EMR on YARN, everything works fine.

kr
 marco


from pyspark.sql import SQLContext
from pyspark.sql.session import SparkSession
from random import randint
from time import sleep
import logging
import sys

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
ch = logging.StreamHandler()
logger.addHandler(ch)


def dataprocessing(filePath, count, sqlContext):
    """Read the csv file, log its row count, sleep for a random interval,
    then recurse until count reaches 0."""
    logger.info('Iter count is:%s', count)
    if count == 0:
        print('exiting')
    else:
        df_traffic_tmp = sqlContext.read.format("csv") \
            .option("header", "true").load(filePath)
        logger.info('#############################DataSet has:%s',
                    df_traffic_tmp.count())
        # repartition returns a new DataFrame, so reassign it
        df_traffic_tmp = df_traffic_tmp.repartition(5)
        sleepInterval = randint(10, 100)
        logger.info('#############################Sleeping for %s',
                    sleepInterval)
        sleep(sleepInterval)
        dataprocessing(filePath, count - 1, sqlContext)


if __name__ == '__main__':

    if len(sys.argv) < 2:
        print('Usage dataProcessingSample <filename>')
        sys.exit(0)

    filename = sys.argv[-1]
    iterations = 6
    logger.info('----------------------')
    logger.info('Filename:%s', filename)
    logger.info('Iterations:%s', iterations)
    logger.info('----------------------')

    logger.info('........Starting spark..........Loading from %s for %s iterations',
                filename, iterations)
    logger.info('Starting up....')
    spark = SparkSession.builder.appName("DataProcessSimple").getOrCreate()
    logger.info('Initializing sqlContext')
    sqlContext = SQLContext(spark.sparkContext)
    dataprocessing(filename, iterations, sqlContext)
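
(Side note: in Spark 2.x I suppose I could drop the SQLContext entirely and
pass the SparkSession straight into the function, as in the sketch below,
though I doubt that is related to the resource problem.)

    def dataprocessing(filePath, count, spark):
        # Same csv read as above, but via the SparkSession's own reader
        df_traffic_tmp = spark.read.format("csv") \
            .option("header", "true").load(filePath)
        logger.info('DataSet has:%s', df_traffic_tmp.count())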




On Fri, Mar 3, 2017 at 4:03 PM, Mark Hamstra <mark@clearstorydata.com>
wrote:

> Removing dev. This is a basic user question; please don't add noise to the
> development list.
>
> If your jobs are not accepting any resources, then it is almost certainly
> because no resource offers are being received. Check the status of your
> workers and their reachability from the driver.
>
> On Fri, Mar 3, 2017 at 1:14 AM, Aseem Bansal <asmbansal2@gmail.com> wrote:
>
>> When the initial job has not accepted any resources, what can be wrong?
>> Going through Stack Overflow and various blogs does not help. Maybe we need
>> better logging for this? Adding dev
>>
>> On Thu, Mar 2, 2017 at 5:03 PM, Marco Mistroni <mmistroni@gmail.com>
>> wrote:
>>
>>> Hi
>>>  I have found exactly the same issue....I even have a script which
>>> simulates a random file read.
>>> 2 nodes, 4 cores. I am submitting code from each node passing max cores 1,
>>> but one of the programmes occupies 2 of the 4 cores and the other is in a
>>> waiting state.
>>> I am running a standalone cluster on Spark 2.0. I can send sample code if
>>> someone can help.
>>> Kr
>>>
>>> On 2 Mar 2017 11:04 am, "Aseem Bansal" <asmbansal2@gmail.com> wrote:
>>>
>>> I have been trying to get a basic spark cluster up on a single machine. I
>>> know it should be distributed, but I want to get something running before I
>>> go distributed in a higher environment.
>>>
>>> So I used sbin/start-master.sh and sbin/start-slave.sh
>>>
>>> I keep on getting *WARN TaskSchedulerImpl: Initial job has not accepted
>>> any resources; check your cluster UI to ensure that workers are registered
>>> and have sufficient resources*
>>>
>>> I read up and changed /opt/spark-2.1.0-bin-hadoop2.7/conf/spark-defaults.conf
>>> to contain this
>>>
>>> spark.executor.cores               2
>>> spark.cores.max                    8
>>>
>>> I changed /opt/spark-2.1.0-bin-hadoop2.7/conf/spark-env.sh to contain
>>>
>>> SPARK_WORKER_CORES=4
>>>
>>> My understanding is that after this, Spark will use 8 cores in total, with
>>> the worker using 4 cores and hence being able to support 2 executors on
>>> that worker.
>>>
>>> But I still keep on getting the same error
>>>
>>> For my master I have
>>> [image: Inline image 1]
>>>
>>> For my slave I have
>>> [image: Inline image 2]
>>>
>>>
>>>
>>
>
