spark-user mailing list archives

From Marco Mistroni <>
Subject Re: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Date Fri, 03 Mar 2017 22:27:11 GMT
   I'd like to disagree with that.
Here's my use case (similar to Aseem's):

1 - Set up a Spark standalone cluster with 2 nodes (2 cores each)
2 - Check resources on the cluster (see Spark Cluster.jpg)

3 - Run a script from node1 with the following command

     ./spark-submit   --driver-cores 1 --executor-cores 1

4 - Check status of the cluster when submitting 1 job (see SparkCluster 1job)

5 - Run exactly the same script from node2 with the following command

     ./spark-submit   --driver-cores 1 --executor-cores 1

6 - This job ends up getting "Initial job has not accepted any resources"
(but you can see from SparkCluster 1job that only 2 of the cores have been
used)

7 - Check status of the cluster when 2 jobs are running (see Spark Cluster 2job)

The script below is the simple script I am running. It reads a CSV file,
provided as input, 6 times, sleeping for a random interval between reads; it
does not do any magic or tricks.

Perhaps my spark-submit settings are wrong?
Perhaps I need to override how I instantiate the Spark context?

I am curious to see, if you have a standalone cluster, whether you can
reproduce the same problem.
When I run it on EMR on YARN, everything works fine.
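One thing that may be worth trying (an assumption about the cause, not a
confirmed fix): on a standalone master, an application with no core cap can
claim every available core, which would starve the second submission. A
hedged sketch of the submit command with an explicit per-application cap;
the master URL, script name, and input file are placeholders, not values
from this thread:

```shell
# Sketch, not a verified fix: --total-executor-cores caps how many cores
# this one application may take on a standalone cluster, leaving the rest
# free for a second spark-submit. <master-host>, and input.csv
# are placeholders.
./spark-submit \
  --master spark://<master-host>:7077 \
  --driver-cores 1 \
  --executor-cores 1 \
  --total-executor-cores 1 \ input.csv
```

The same cap can be set cluster-wide via spark.cores.max in
spark-defaults.conf instead of per invocation.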


from pyspark.sql import SQLContext
from pyspark.sql.session import SparkSession
from random import randint
from time import sleep
import logging
import sys

logger = logging.getLogger(__name__)
ch = logging.StreamHandler()
logger.addHandler(ch)
logger.setLevel(logging.INFO)


def dataprocessing(filePath, count, sqlContext):'Iter count is:%s', count)
    if count == 0:
        print('exiting')
        df_traffic_tmp ="csv").option("header", 'true').load(filePath)'#############################DataSet has:%s rows', df_traffic_tmp.count())
        sleepInterval = randint(10, 100)'#############################Sleeping for %s seconds', sleepInterval)
        dataprocessing(filePath, count - 1, sqlContext)


if __name__ == '__main__':

    if len(sys.argv) < 2:
        print('Usage: dataProcessingSample <filename>')

    filename = sys.argv[-1]
    iterations = 6'----------------------')'Filename:%s', filename)'Iterations:%s', iterations)'----------------------')'........Starting spark..........Loading from %s for %s iterations', filename, iterations)'Starting up....')
    sc = SparkSession.builder.appName("DataProcessSimple").getOrCreate()'Initializing sqlContext')
    sqlContext = SQLContext(sc)
    dataprocessing(filename, iterations, sqlContext)
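One way to check, before each submit, whether the cluster actually has cores
left is the standalone master's web UI, which also serves a JSON summary
(typically at http://<master>:8080/json). A small sketch, assuming that
endpoint and its `workers`/`cores`/`coresused` field names; verify both
against your own master UI before relying on them:

```python
import json
from urllib.request import urlopen


def free_cores(state):
    """Total unallocated cores across all registered workers.

    `state` is the dict served by the standalone master's web UI at
    http://<master>:8080/json (endpoint and field names are assumptions).
    """
    return sum(w["cores"] - w["coresused"] for w in state["workers"])


# Live usage would be (requires a running master):
#   state = json.load(urlopen("http://localhost:8080/json"))
#   print(free_cores(state))

# Offline illustration with a made-up two-worker payload:
sample = {"workers": [{"cores": 2, "coresused": 2},
                      {"cores": 2, "coresused": 0}]}
print(free_cores(sample))  # -> 2
```

If this reports 0 right after the first submission, it would confirm that
the first application grabbed every core.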

On Fri, Mar 3, 2017 at 4:03 PM, Mark Hamstra <> wrote:

> Removing dev. This is a basic user question; please don't add noise to the
> development list.
> If your jobs are not accepting any resources, then it is almost certainly
> because no resource offers are being received. Check the status of your
> workers and their reachability from the driver.
> On Fri, Mar 3, 2017 at 1:14 AM, Aseem Bansal <> wrote:
>> When initial jobs have not accepted any resources, then what can be
>> wrong? Going through Stack Overflow and various blogs does not help.
>> Maybe we need better logging for this? Adding dev.
>> On Thu, Mar 2, 2017 at 5:03 PM, Marco Mistroni <>
>> wrote:
>>> Hi
>>>  I have found exactly same issue....I even have a script which simulates
>>> a random file read.
>>> 2 nodes, 4 core. I am submitting code from each node passing max core 1
>>> but one of the programme occupy 2/4 nodes and the other is In waiting state
>>> I am creating standalone cluster for SPK 2.0. Can send sample code if
>>> someone can help
>>> Kr
>>> On 2 Mar 2017 11:04 am, "Aseem Bansal" <> wrote:
>>> I have been trying to get a basic Spark cluster up on a single machine.
>>> I know it should be distributed, but I want to get something running
>>> before I do distributed in a higher environment.
>>> So I used sbin/ and sbin/
>>> I keep on getting *WARN TaskSchedulerImpl: Initial job has not accepted
>>> any resources; check your cluster UI to ensure that workers are registered
>>> and have sufficient resources*
>>> I read up and changed /opt/spark-2.1.0-bin-hadoop2.7/conf/spark-defaults.conf
>>> to contain this
>>> spark.executor.cores               2
>>> spark.cores.max                    8
>>> I changed /opt/spark-2.1.0-bin-hadoop2.7/conf/ to contain
>>> My understanding is that after this Spark will use 8 cores in total, with
>>> the worker using 4 cores and hence being able to support 2 executors on
>>> that worker.
>>> But I still keep on getting the same error.
>>> For my master I have
>>> [image: Inline image 1]
>>> For my slave I have
>>> [image: Inline image 2]
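The core arithmetic in Aseem's quoted configuration can be sanity-checked in
a few lines. The helper name here is ours, not a Spark API, and the
worker-core count of 4 comes from his description, not from Spark itself:

```python
# Sanity check of the quoted configuration: with spark.executor.cores = 2
# and a worker offering 4 cores, each worker can host 4 // 2 = 2 executors,
# matching the expectation stated above. The function is illustrative only.
def executors_per_worker(worker_cores, executor_cores):
    return worker_cores // executor_cores


print(executors_per_worker(4, 2))  # -> 2
print(executors_per_worker(4, 1))  # with 1 core per executor -> 4
```

The arithmetic holds, which suggests the warning comes from scheduling (no
free cores being offered), not from a miscount of executors per worker.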
