spark-user mailing list archives

From Kushal Chokhani <kushal.chokh...@enlightedinc.com>
Subject spark job not accepting resources from worker
Date Thu, 06 Aug 2015 06:10:31 GMT
Hi,

I have a Spark/Cassandra setup where I am using the Spark Cassandra Java
connector to query a table. So far I have one Spark master node (2
cores) and one worker node (4 cores). Both of them have the following
spark-env.sh under conf/:

    #!/usr/bin/env bash
    export SPARK_LOCAL_IP=127.0.0.1
    export SPARK_MASTER_IP="192.168.4.134"
    export SPARK_WORKER_MEMORY=1G
    export SPARK_EXECUTOR_MEMORY=2G
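For context, my reading of what these standalone-mode variables do (the comments below are assumptions from the docs, not verified against this cluster):

    #!/usr/bin/env bash
    # Address this node's daemons bind to; 127.0.0.1 binds to loopback only,
    # which may prevent remote executors/drivers from reaching the node.
    export SPARK_LOCAL_IP=127.0.0.1
    # Address the master binds to and that workers connect to.
    export SPARK_MASTER_IP="192.168.4.134"
    # Total memory a worker may hand out to executors on its machine.
    export SPARK_WORKER_MEMORY=1G
    # Default per-executor memory; note this (2G) exceeds SPARK_WORKER_MEMORY
    # (1G), though spark.executor.memory set in the app (200m below) should
    # take precedence over this default.
    export SPARK_EXECUTOR_MEMORY=2G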

I am using Spark 1.4.1 along with Cassandra 2.2.0. I have started my
Cassandra/Spark setup, created a keyspace and table in Cassandra, and
added some rows to the table. Now I try to run the following Spark job
using the Spark Cassandra Java connector:

    SparkConf conf = new SparkConf();
    conf.setAppName("Testing");
    conf.setMaster("spark://192.168.4.134:7077");
    conf.set("spark.cassandra.connection.host", "192.168.4.129");
    conf.set("spark.logConf", "true");
    conf.set("spark.driver.maxResultSize", "50m");
    conf.set("spark.executor.memory", "200m");
    conf.set("spark.eventLog.enabled", "true");
    conf.set("spark.eventLog.dir", "/tmp/");
    conf.set("spark.executor.extraClassPath", "/home/enlighted/ebd.jar");
    conf.set("spark.cores.max", "1");
    JavaSparkContext sc = new JavaSparkContext(conf);

    JavaRDD<String> cassandraRowsRDD =
        CassandraJavaUtil.javaFunctions(sc).cassandraTable("testing", "ec")
            .map(new Function<CassandraRow, String>() {
                private static final long serialVersionUID = -6263533266898869895L;
                @Override
                public String call(CassandraRow cassandraRow) throws Exception {
                    return cassandraRow.toString();
                }
            });
    System.out.println("Data as CassandraRows: \n"
        + StringUtils.join(cassandraRowsRDD.toArray(), "\n"));
    sc.close();



This job is stuck with an insufficient-resources warning. Here are the logs:

    1107 [main] INFO org.apache.spark.SparkContext  - Spark configuration:
    spark.app.name=Testing
    spark.cassandra.connection.host=192.168.4.129
    spark.cores.max=1
    spark.driver.maxResultSize=50m
    spark.eventLog.dir=/tmp/
    spark.eventLog.enabled=true
    spark.executor.extraClassPath=/home/enlighted/ebd.jar
    spark.executor.memory=200m
    spark.logConf=true
    spark.master=spark://192.168.4.134:7077
    1121 [main] INFO org.apache.spark.SecurityManager  - Changing view
    acls to: enlighted
    1122 [main] INFO org.apache.spark.SecurityManager  - Changing modify
    acls to: enlighted
    1123 [main] INFO org.apache.spark.SecurityManager  -
    SecurityManager: authentication disabled; ui acls disabled; users
    with view permissions: Set(enlighted); users with modify
    permissions: Set(enlighted)
    1767 [sparkDriver-akka.actor.default-dispatcher-4] INFO
    akka.event.slf4j.Slf4jLogger  - Slf4jLogger started
    1805 [sparkDriver-akka.actor.default-dispatcher-4] INFO Remoting -
    Starting remoting
    1957 [main] INFO org.apache.spark.util.Utils  - Successfully started
    service 'sparkDriver' on port 54611.
    1958 [sparkDriver-akka.actor.default-dispatcher-4] INFO Remoting -
    Remoting started; listening on addresses
    :[akka.tcp://sparkDriver@192.168.4.134:54611]
    1977 [main] INFO org.apache.spark.SparkEnv  - Registering
    MapOutputTracker
    1989 [main] INFO org.apache.spark.SparkEnv  - Registering
    BlockManagerMaster
    2007 [main] INFO org.apache.spark.storage.DiskBlockManager  -
    Created local directory at
    /tmp/spark-f21125fd-ae9d-460e-884d-563fa8720f09/blockmgr-3e3d54e7-16df-4e97-be48-b0c0fa0389e7
    2012 [main] INFO org.apache.spark.storage.MemoryStore  - MemoryStore
    started with capacity 456.0 MB
    2044 [main] INFO org.apache.spark.HttpFileServer  - HTTP File server
    directory is
    /tmp/spark-f21125fd-ae9d-460e-884d-563fa8720f09/httpd-64b4d92e-cde9-45fb-8b38-edc3cca3933c
    2046 [main] INFO org.apache.spark.HttpServer  - Starting HTTP Server
    2086 [main] INFO org.spark-project.jetty.server.Server  -
    jetty-8.y.z-SNAPSHOT
    2098 [main] INFO org.spark-project.jetty.server.AbstractConnector -
    Started SocketConnector@0.0.0.0:44884
    2099 [main] INFO org.apache.spark.util.Utils  - Successfully started
    service 'HTTP file server' on port 44884.
    2108 [main] INFO org.apache.spark.SparkEnv  - Registering
    OutputCommitCoordinator
    2297 [main] INFO org.spark-project.jetty.server.Server  -
    jetty-8.y.z-SNAPSHOT
    2317 [main] INFO org.spark-project.jetty.server.AbstractConnector -
    Started SelectChannelConnector@0.0.0.0:4040
    2318 [main] INFO org.apache.spark.util.Utils  - Successfully started
    service 'SparkUI' on port 4040.
    2320 [main] INFO org.apache.spark.ui.SparkUI  - Started SparkUI at
    http://192.168.4.134:4040
    2387 [sparkDriver-akka.actor.default-dispatcher-3] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Connecting
    to master akka.tcp://sparkMaster@192.168.4.134:7077/user/Master...
    2662 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
    Connected to Spark cluster with app ID app-20150806054450-0001
    2680 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    added: app-20150806054450-0001/0 on
    worker-20150806053100-192.168.4.129-45566 (192.168.4.129:45566) with
    1 cores
    2682 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
    Granted executor ID app-20150806054450-0001/0 on hostPort
    192.168.4.129:45566 with 1 cores, 200.0 MB RAM
    2696 [main] INFO org.apache.spark.util.Utils  - Successfully started
    service 'org.apache.spark.network.netty.NettyBlockTransferService'
    on port 49150.
    2696 [main] INFO
    org.apache.spark.network.netty.NettyBlockTransferService  - Server
    created on 49150
    2700 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/0 is now LOADING
    2706 [main] INFO org.apache.spark.storage.BlockManagerMaster  -
    Trying to register BlockManager
    2708 [sparkDriver-akka.actor.default-dispatcher-17] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/0 is now RUNNING
    2710 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.storage.BlockManagerMasterEndpoint  - Registering
    block manager 192.168.4.134:49150 with 456.0 MB RAM,
    BlockManagerId(driver, 192.168.4.134, 49150)
    2713 [main] INFO org.apache.spark.storage.BlockManagerMaster  -
    Registered BlockManager
    2922 [main] INFO org.apache.spark.scheduler.EventLoggingListener -
    Logging events to file:/tmp/app-20150806054450-0001
    2939 [main] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
    SchedulerBackend is ready for scheduling beginning after reached
    minRegisteredResourcesRatio: 0.0
    3321 [main] INFO com.datastax.driver.core.Cluster  - New Cassandra
    host /192.168.4.129:9042 added
    3321 [main] INFO com.datastax.driver.core.Cluster  - New Cassandra
    host /192.168.4.130:9042 added
    3322 [main] INFO
    com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
    Added host 192.168.4.130 (DC1)
    3322 [main] INFO com.datastax.driver.core.Cluster  - New Cassandra
    host /192.168.4.131:9042 added
    3323 [main] INFO
    com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
    Added host 192.168.4.131 (DC1)
    3323 [main] INFO com.datastax.driver.core.Cluster  - New Cassandra
    host /192.168.4.132:9042 added
    3323 [main] INFO
    com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
    Added host 192.168.4.132 (DC1)
    3325 [main] INFO
    com.datastax.spark.connector.cql.CassandraConnector  - Connected to
    Cassandra cluster: enldbcluster
    3881 [main] INFO org.apache.spark.SparkContext  - Starting job:
    toArray at Start.java:85
    3898 [pool-18-thread-1] INFO
    com.datastax.spark.connector.cql.CassandraConnector  - Disconnected
    from Cassandra cluster: enldbcluster
    3901 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Got job 0 (toArray at
    Start.java:85) with 6 output partitions (allowLocal=false)
    3902 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Final stage: ResultStage
    0(toArray at Start.java:85)
    3902 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Parents of final stage:
    List()
    3908 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Missing parents: List()
    3925 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Submitting ResultStage 0
    (MapPartitionsRDD[1] at map at Start.java:77), which has no missing
    parents
    4002 [dag-scheduler-event-loop] INFO
    org.apache.spark.storage.MemoryStore  - ensureFreeSpace(7488) called
    with curMem=0, maxMem=478182113
    4004 [dag-scheduler-event-loop] INFO
    org.apache.spark.storage.MemoryStore  - Block broadcast_0 stored as
    values in memory (estimated size 7.3 KB, free 456.0 MB)
    4013 [dag-scheduler-event-loop] INFO
    org.apache.spark.storage.MemoryStore  - ensureFreeSpace(4015) called
    with curMem=7488, maxMem=478182113
    4013 [dag-scheduler-event-loop] INFO
    org.apache.spark.storage.MemoryStore  - Block broadcast_0_piece0
    stored as bytes in memory (estimated size 3.9 KB, free 456.0 MB)
    4015 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.storage.BlockManagerInfo  - Added
    broadcast_0_piece0 in memory on 192.168.4.134:49150 (size: 3.9 KB,
    free: 456.0 MB)
    4017 [dag-scheduler-event-loop] INFO org.apache.spark.SparkContext 
    - Created broadcast 0 from broadcast at DAGScheduler.scala:874
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.driver.core.Cluster  - New Cassandra host
    /192.168.4.129:9042 added
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.driver.core.Cluster  - New Cassandra host
    /192.168.4.130:9042 added
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.driver.core.Cluster  - New Cassandra host
    /192.168.4.131:9042 added
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.driver.core.Cluster  - New Cassandra host
    /192.168.4.132:9042 added
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.spark.connector.cql.CassandraConnector  - Connected to
    Cassandra cluster: enldbcluster
    4394 [pool-18-thread-1] INFO
    com.datastax.spark.connector.cql.CassandraConnector  - Disconnected
    from Cassandra cluster: enldbcluster
    4806 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Submitting 6 missing
    tasks from ResultStage 0 (MapPartitionsRDD[1] at map at Start.java:77)
    4807 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.TaskSchedulerImpl  - Adding task set 0.0
    with 6 tasks
    19822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
    Initial job has not accepted any resources; check your cluster UI to
    ensure that workers are registered and have sufficient resources
    34822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
    Initial job has not accepted any resources; check your cluster UI to
    ensure that workers are registered and have sufficient resources
    49822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
    Initial job has not accepted any resources; check your cluster UI to
    ensure that workers are registered and have sufficient resources
    64822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
    Initial job has not accepted any resources; check your cluster UI to
    ensure that workers are registered and have sufficient resources
    79822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
    Initial job has not accepted any resources; check your cluster UI to
    ensure that workers are registered and have sufficient resources
    94822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
    Initial job has not accepted any resources; check your cluster UI to
    ensure that workers are registered and have sufficient resources
    109822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl 
    - Initial job has not accepted any resources; check your cluster UI
    to ensure that workers are registered and have sufficient resources
    124822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl 
    - Initial job has not accepted any resources; check your cluster UI
    to ensure that workers are registered and have sufficient resources
    124963 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/0 is now EXITED (Command exited
    with code 1)
    124964 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
    Executor app-20150806054450-0001/0 removed: Command exited with code 1
    124968 [sparkDriver-akka.actor.default-dispatcher-17] ERROR
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
    Asked to remove non-existent executor 0
    124969 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    added: app-20150806054450-0001/1 on
    worker-20150806053100-192.168.4.129-45566 (192.168.4.129:45566) with
    1 cores
    124969 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
    Granted executor ID app-20150806054450-0001/1 on hostPort
    192.168.4.129:45566 with 1 cores, 200.0 MB RAM
    124975 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/1 is now RUNNING
    125012 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/1 is now LOADING
    139822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl 
    - Initial job has not accepted any resources; check your cluster UI
    to ensure that workers are registered and have sufficient resources
    154822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl 
    - Initial job has not accepted any resources; check your cluster UI
    to ensure that workers are registered and have sufficient resources
    169823 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl 
    - Initial job has not accepted any resources; check your cluster UI
    to ensure that workers are registered and have sufficient resources
    184822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl 
    - Initial job has not accepted any resources; check your cluster UI
    to ensure that workers are registered and have sufficient resources
    199822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl 
    - Initial job has not accepted any resources; check your cluster UI
    to ensure that workers are registered and have sufficient resources

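In case it's useful, the "check your cluster UI" step in that warning can also be scripted: the standalone master serves a JSON summary of registered workers at /json on its web UI port (8080 by default). A sketch, with the live call shown in a comment and the filtering demonstrated on a sample payload (the field names are what I see from a Spark 1.x master, so double-check against real output):

```shell
# Against a live cluster (assumes the master web UI is on the default port):
#   curl -s http://192.168.4.134:8080/json
# The same filter applied to a sample payload of that shape:
payload='{"workers":[{"host":"192.168.4.129","coresfree":1,"memoryfree":824,"state":"ALIVE"}]}'
# Pull out the free-core counts so you can see what the master thinks is available.
echo "$payload" | grep -o '"coresfree":[0-9]*'
```

If a worker shows zero free cores or less free memory than spark.executor.memory requests, the scheduler will keep printing the warning above.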

Please find attached the spark master UI and pom.xml file with dependencies.

Can anyone please point out what the issue could be here?



