spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sriks <srikanth.kod...@gmail.com>
Subject Kinesis integration with Spark Streaming in EMR cluster - Output is not showing up
Date Thu, 06 Nov 2014 19:51:32 GMT
Hello,

I am new to spark and trying to run the spark program (bundled as jar) in a
EMR cluster.

In one terminal session, i am loading data into kinesis stream.

In another window, i am trying to run the spark streaming program, and
trying to print out the output.

Whenever i run the spark streaming program, i am continuously seeing the
below INFO messages, but not seeing any output (counters).  When I run the
same program with normal Spark RDDs instead of stream RDDs, i am seeing the
output in hdfs files (taking input by reading from a file instead of reading
it from kinesis stream). Pls also note that i ran with
JavaStreamingContext.awaitTermination and when i use this, it is
continuously outputting the below messages, and later i tried
JavaStreamingContext.awaitTermination.stop to see if i can see the output,
but it is not working.

Any help is really appreciated. Thank you.

Here is the main program:
=================

public JavaDStream<byte[]> getDStream()
    {
        int numShards =
kin_client.describeStream(ken_stream_name).getStreamDescription().getShards().size();

        System.out.println("Number of shards are : " + numShards);

        Duration batchInterval = new Duration(2000);

        /* Setup the Spark config. */
        SparkConf sparkConfig = new SparkConf().setAppName("TestJSON");

        /* Kinesis checkpoint interval. Same as batchInterval for this
example. */
        Duration checkpointInterval = batchInterval;

        /* Setup the StreamingContext */
        jssc = new JavaStreamingContext(sparkConfig, batchInterval);

        dstream = KinesisUtils.createStream(jssc, ken_stream_name, endPoint,
checkpointInterval,
                InitialPositionInStream.LATEST,
StorageLevel.MEMORY_AND_DISK_2());

        System.out.println("DStream count is : " + dstream.count());

        return dstream;

    }

    public void startContext()
    {
        jssc.start();
        /*
         * jssc.stop(); jssc.awaitTermination(); try {
         * java.lang.Thread.sleep(20000); } catch (InterruptedException e) {
         * e.printStackTrace(); }
         * 
         * jssc.stop();
         */

        jssc.stop();

    }

    public static void main(String[] args)
    {
        SentinelQueryNRecoCount qnr = new SentinelQueryNRecoCount();

        JavaDStream<byte[]> data = qnr.getDStream();
        JavaDStream<String> lines = data.map(new GetLines());
        System.out.println("Lines D Stream first row is : " +
lines.count());
        JavaDStream<String> filterdata = lines.filter(new GetFilterData());
        System.out.println("Filtered records are: " + filterdata.count());

        JavaDStream<DuoKey> rdd_records = filterdata.map(new GetRecords());
        System.out.println("Filtered records are: " + filterdata.count());

        System.out.println("RDD records are: " + rdd_records.count());
        JavaPairDStream<String, Tuple2&lt;Integer, Integer>>
pair_map_records = rdd_records.mapToPair(new ProcessMapper());
        System.out.println("Pair Mapper Records Count is : " +
pair_map_records.count());

        JavaPairDStream<String, Tuple2&lt;Integer, Integer>>
result_reduce_records = pair_map_records
                .reduceByKey(new ProcessReducer());

        System.out.println("Result Reduce record count is : " +
result_reduce_records.count());

        result_reduce_records.print();

        // result_reduce_records.saveAsHadoopFiles(prefix, suffix);
        qnr.startContext();
    }

 output below.
=========

Spark assembly has been built with Hive, including Datanucleus jars on
classpath
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/home/hadoop/.versions/spark-1.1.0/lib/spark-assembly-1.1.0-hadoop2.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Number of shards are : 2
14/11/06 19:34:26 WARN spark.SparkConf:
SPARK_CLASSPATH was detected (set to
'/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/hadoop-lzo.jar:/home/hadoop/hive/conf/*').
This is deprecated in Spark 1.0+.

Please instead use:
 - ./spark-submit with --driver-class-path to augment the driver classpath
 - spark.executor.extraClassPath to augment the executor classpath

14/11/06 19:34:26 WARN spark.SparkConf: Setting
'spark.executor.extraClassPath' to
'/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/hadoop-lzo.jar:/home/hadoop/hive/conf/*'
as a work-around.
14/11/06 19:34:26 WARN spark.SparkConf: Setting
'spark.driver.extraClassPath' to
'/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/hadoop-lzo.jar:/home/hadoop/hive/conf/*'
as a work-around.
14/11/06 19:34:26 INFO spark.SecurityManager: Changing view acls to: hadoop,
14/11/06 19:34:26 INFO spark.SecurityManager: Changing modify acls to:
hadoop,
14/11/06 19:34:26 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(hadoop, ); users with modify permissions: Set(hadoop, )
14/11/06 19:34:26 INFO slf4j.Slf4jLogger: Slf4jLogger started
14/11/06 19:34:26 INFO Remoting: Starting remoting
14/11/06 19:34:26 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://sparkDriver@ip-172-31-19-47.us-west-2.compute.internal:46907]
14/11/06 19:34:26 INFO Remoting: Remoting now listens on addresses:
[akka.tcp://sparkDriver@ip-172-31-19-47.us-west-2.compute.internal:46907]
14/11/06 19:34:27 INFO util.Utils: Successfully started service
'sparkDriver' on port 46907.
14/11/06 19:34:27 INFO spark.SparkEnv: Registering MapOutputTracker
14/11/06 19:34:27 INFO spark.SparkEnv: Registering BlockManagerMaster
14/11/06 19:34:27 INFO storage.DiskBlockManager: Created local directory at
/mnt/spark/spark-local-20141106193427-cced
14/11/06 19:34:27 INFO util.Utils: Successfully started service 'Connection
manager for block manager' on port 33897.
14/11/06 19:34:27 INFO network.ConnectionManager: Bound socket to port 33897
with id =
ConnectionManagerId(ip-172-31-19-47.us-west-2.compute.internal,33897)
14/11/06 19:34:27 INFO storage.MemoryStore: MemoryStore started with
capacity 265.4 MB
14/11/06 19:34:27 INFO storage.BlockManagerMaster: Trying to register
BlockManager
14/11/06 19:34:27 INFO storage.BlockManagerMasterActor: Registering block
manager ip-172-31-19-47.us-west-2.compute.internal:33897 with 265.4 MB RAM
14/11/06 19:34:27 INFO storage.BlockManagerMaster: Registered BlockManager
14/11/06 19:34:27 INFO spark.HttpFileServer: HTTP File server directory is
/tmp/spark-bb64dc83-84f3-432d-8f9e-1393f255e1f7
14/11/06 19:34:27 INFO spark.HttpServer: Starting HTTP Server
14/11/06 19:34:27 INFO server.Server: jetty-8.y.z-SNAPSHOT
14/11/06 19:34:27 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:48941
14/11/06 19:34:27 INFO util.Utils: Successfully started service 'HTTP file
server' on port 48941.
14/11/06 19:34:27 INFO server.Server: jetty-8.y.z-SNAPSHOT
14/11/06 19:34:27 INFO server.AbstractConnector: Started
SelectChannelConnector@0.0.0.0:4040
14/11/06 19:34:27 INFO util.Utils: Successfully started service 'SparkUI' on
port 4040.
14/11/06 19:34:27 INFO ui.SparkUI: Started SparkUI at
http://ip-172-31-19-47.us-west-2.compute.internal:4040
14/11/06 19:34:28 INFO scheduler.EventLoggingListener: Logging events to
hdfs:///spark-logs/testjson-1415302467888
14/11/06 19:34:29 INFO spark.SparkContext: Added JAR
file:/home/hadoop/StreamingSentinel-0.0.1-StreamingSentinel.jar at
http://172.31.19.47:48941/jars/StreamingSentinel-0.0.1-StreamingSentinel.jar
with timestamp 1415302469023
--args is deprecated. Use --arg instead.
14/11/06 19:34:29 INFO client.RMProxy: Connecting to ResourceManager at
/172.31.19.47:9022
14/11/06 19:34:29 INFO yarn.Client: Got cluster metric info from
ResourceManager, number of NodeManagers: 4
14/11/06 19:34:29 INFO yarn.Client: Max mem capabililty of a single resource
in this cluster 3072
14/11/06 19:34:29 INFO yarn.Client: Preparing Local resources
14/11/06 19:34:29 INFO yarn.Client: Uploading
file:/home/hadoop/.versions/spark-1.1.0/lib/spark-assembly-1.1.0-hadoop2.4.0.jar
to
hdfs://172.31.19.47:9000/user/hadoop/.sparkStaging/application_1415287081424_0010/spark-assembly-1.1.0-hadoop2.4.0.jar
14/11/06 19:34:30 INFO yarn.Client: Prepared Local resources
Map(__spark__.jar -> resource { scheme: "hdfs" host: "172.31.19.47" port:
9000 file:
"/user/hadoop/.sparkStaging/application_1415287081424_0010/spark-assembly-1.1.0-hadoop2.4.0.jar"
} size: 138832383 timestamp: 1415302470654 type: FILE visibility: PRIVATE)
14/11/06 19:34:30 INFO yarn.Client: Setting up the launch environment
14/11/06 19:34:30 INFO yarn.Client: Setting up container launch context
14/11/06 19:34:30 INFO yarn.Client: Yarn AM launch context:
14/11/06 19:34:30 INFO yarn.Client:   class:  
org.apache.spark.deploy.yarn.ExecutorLauncher
14/11/06 19:34:30 INFO yarn.Client:   env:     Map(CLASSPATH ->
/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/hadoop-lzo.jar:/home/hadoop/hive/conf/*:$PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$HADOOP_YARN_HOME/share/hadoop/yarn/*:$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/lib/*:$PWD/__app__.jar:$PWD/*,
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 138832383, SPARK_YARN_STAGING_DIR ->
.sparkStaging/application_1415287081424_0010/,
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE, SPARK_USER -> hadoop,
SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS ->
1415302470654, SPARK_YARN_CACHE_FILES ->
hdfs://172.31.19.47:9000/user/hadoop/.sparkStaging/application_1415287081424_0010/spark-assembly-1.1.0-hadoop2.4.0.jar#__spark__.jar)
14/11/06 19:34:30 INFO yarn.Client:   command: $JAVA_HOME/bin/java -server
-Xmx512m -Djava.io.tmpdir=$PWD/tmp
'-Dspark.tachyonStore.folderName=spark-8e5fbe9f-8f9f-4e4b-a185-66130143a5ed'
'-Dspark.eventLog.enabled=true' '-Dspark.yarn.secondary.jars='
'-Dspark.driver.host=ip-172-31-19-47.us-west-2.compute.internal'
'-Dspark.driver.appUIHistoryAddress=' '-Dspark.app.name=TestJSON'
'-Dspark.driver.appUIAddress=ip-172-31-19-47.us-west-2.compute.internal:4040'
'-Dspark.jars=file:/home/hadoop/StreamingSentinel-0.0.1-StreamingSentinel.jar'
'-Dspark.fileserver.uri=http://172.31.19.47:48941'
'-Dspark.eventLog.dir=hdfs:///spark-logs'
'-Dspark.executor.extraClassPath=/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/hadoop-lzo.jar:/home/hadoop/hive/conf/*'
'-Dspark.master=yarn-client' '-Dspark.driver.port=46907'
'-Dspark.driver.extraClassPath=/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/.versions/2.4.0/share/hadoop/common/lib/hadoop-lzo.jar:/home/hadoop/hive/conf/*'
org.apache.spark.deploy.yarn.ExecutorLauncher --class 'notused' --jar  null 
--arg  'ip-172-31-19-47.us-west-2.compute.internal:46907' --executor-memory
1024 --executor-cores 1 --num-executors  2 1> <LOG_DIR>/stdout 2>
<LOG_DIR>/stderr
14/11/06 19:34:30 INFO spark.SecurityManager: Changing view acls to: hadoop,
14/11/06 19:34:30 INFO spark.SecurityManager: Changing modify acls to:
hadoop,
14/11/06 19:34:30 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(hadoop, ); users with modify permissions: Set(hadoop, )
14/11/06 19:34:30 INFO yarn.Client: Submitting application to
ResourceManager
14/11/06 19:34:30 INFO impl.YarnClientImpl: Submitted application
application_1415287081424_0010
14/11/06 19:34:30 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:31 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:32 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:33 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:34 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:35 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:36 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:37 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:38 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: -1
         appStartTime: 1415302470803
         yarnAppState: ACCEPTED

14/11/06 19:34:39 INFO cluster.YarnClientSchedulerBackend: Application
report from ASM:
         appMasterRpcPort: 0
         appStartTime: 1415302470803
         yarnAppState: RUNNING

14/11/06 19:34:39 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter.
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter,
PROXY_HOST=172.31.19.47,PROXY_URI_BASE=http://172.31.19.47:9046/proxy/application_1415287081424_0010,
/proxy/application_1415287081424_0010
14/11/06 19:34:39 INFO ui.JettyUtils: Adding filter:
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
14/11/06 19:34:50 INFO cluster.YarnClientSchedulerBackend: Registered
executor:
Actor[akka.tcp://sparkExecutor@ip-172-31-24-43.us-west-2.compute.internal:46197/user/Executor#370820828]
with ID 1
14/11/06 19:34:50 INFO util.RackResolver: Resolved
ip-172-31-24-43.us-west-2.compute.internal to /default-rack
14/11/06 19:34:50 INFO cluster.YarnClientSchedulerBackend: Registered
executor:
Actor[akka.tcp://sparkExecutor@ip-172-31-24-44.us-west-2.compute.internal:40753/user/Executor#-865158881]
with ID 2
14/11/06 19:34:50 INFO util.RackResolver: Resolved
ip-172-31-24-44.us-west-2.compute.internal to /default-rack
14/11/06 19:34:50 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend
is ready for scheduling beginning after reached minRegisteredResourcesRatio:
0.8
DStream count is : org.apache.spark.streaming.api.java.JavaDStream@7928b21c
14/11/06 19:34:50 INFO storage.BlockManagerMasterActor: Registering block
manager ip-172-31-24-43.us-west-2.compute.internal:51054 with 534.5 MB RAM
Lines D Stream first row is :
org.apache.spark.streaming.api.java.JavaDStream@75bd8d05
Filtered records are:
org.apache.spark.streaming.api.java.JavaDStream@69c8176d
Filtered records are:
org.apache.spark.streaming.api.java.JavaDStream@3a3e8350
RDD records are: org.apache.spark.streaming.api.java.JavaDStream@2b120064
Pair Mapper Records Count is :
org.apache.spark.streaming.api.java.JavaDStream@375582b9
Result Reduce record count is :
org.apache.spark.streaming.api.java.JavaDStream@9a0b84d
14/11/06 19:34:50 INFO scheduler.ReceiverTracker: ReceiverTracker started
14/11/06 19:34:50 INFO dstream.ForEachDStream: metadataCleanupDelay = -1
14/11/06 19:34:50 INFO dstream.ShuffledDStream: metadataCleanupDelay = -1
14/11/06 19:34:50 INFO dstream.MappedDStream: metadataCleanupDelay = -1
14/11/06 19:34:50 INFO dstream.MappedDStream: metadataCleanupDelay = -1
14/11/06 19:34:50 INFO dstream.FilteredDStream: metadataCleanupDelay = -1
14/11/06 19:34:50 INFO dstream.MappedDStream: metadataCleanupDelay = -1
14/11/06 19:34:50 INFO dstream.PluggableInputDStream: metadataCleanupDelay =
-1
14/11/06 19:34:50 INFO dstream.PluggableInputDStream: Slide time = 2000 ms
14/11/06 19:34:50 INFO dstream.PluggableInputDStream: Storage level =
StorageLevel(false, false, false, false, 1)
14/11/06 19:34:50 INFO dstream.PluggableInputDStream: Checkpoint interval =
null
14/11/06 19:34:50 INFO dstream.PluggableInputDStream: Remember duration =
2000 ms
14/11/06 19:34:50 INFO dstream.PluggableInputDStream: Initialized and
validated org.apache.spark.streaming.dstream.PluggableInputDStream@7cc403f3
14/11/06 19:34:50 INFO dstream.MappedDStream: Slide time = 2000 ms
14/11/06 19:34:50 INFO dstream.MappedDStream: Storage level =
StorageLevel(false, false, false, false, 1)
14/11/06 19:34:50 INFO dstream.MappedDStream: Checkpoint interval = null
14/11/06 19:34:50 INFO dstream.MappedDStream: Remember duration = 2000 ms
14/11/06 19:34:50 INFO dstream.MappedDStream: Initialized and validated
org.apache.spark.streaming.dstream.MappedDStream@84f171a
14/11/06 19:34:50 INFO dstream.FilteredDStream: Slide time = 2000 ms
14/11/06 19:34:50 INFO dstream.FilteredDStream: Storage level =
StorageLevel(false, false, false, false, 1)
14/11/06 19:34:50 INFO dstream.FilteredDStream: Checkpoint interval = null
14/11/06 19:34:50 INFO dstream.FilteredDStream: Remember duration = 2000 ms
14/11/06 19:34:50 INFO dstream.FilteredDStream: Initialized and validated
org.apache.spark.streaming.dstream.FilteredDStream@75b039
14/11/06 19:34:50 INFO dstream.MappedDStream: Slide time = 2000 ms
14/11/06 19:34:50 INFO dstream.MappedDStream: Storage level =
StorageLevel(false, false, false, false, 1)
14/11/06 19:34:50 INFO dstream.MappedDStream: Checkpoint interval = null
14/11/06 19:34:50 INFO dstream.MappedDStream: Remember duration = 2000 ms
14/11/06 19:34:50 INFO dstream.MappedDStream: Initialized and validated
org.apache.spark.streaming.dstream.MappedDStream@5e18e5c7
14/11/06 19:34:50 INFO dstream.MappedDStream: Slide time = 2000 ms
14/11/06 19:34:50 INFO dstream.MappedDStream: Storage level =
StorageLevel(false, false, false, false, 1)
14/11/06 19:34:50 INFO dstream.MappedDStream: Checkpoint interval = null
14/11/06 19:34:50 INFO dstream.MappedDStream: Remember duration = 2000 ms
14/11/06 19:34:50 INFO dstream.MappedDStream: Initialized and validated
org.apache.spark.streaming.dstream.MappedDStream@34959c14
14/11/06 19:34:50 INFO dstream.ShuffledDStream: Slide time = 2000 ms
14/11/06 19:34:50 INFO dstream.ShuffledDStream: Storage level =
StorageLevel(false, false, false, false, 1)
14/11/06 19:34:50 INFO dstream.ShuffledDStream: Checkpoint interval = null
14/11/06 19:34:50 INFO dstream.ShuffledDStream: Remember duration = 2000 ms
14/11/06 19:34:50 INFO dstream.ShuffledDStream: Initialized and validated
org.apache.spark.streaming.dstream.ShuffledDStream@2166ba93
14/11/06 19:34:50 INFO dstream.ForEachDStream: Slide time = 2000 ms
14/11/06 19:34:50 INFO dstream.ForEachDStream: Storage level =
StorageLevel(false, false, false, false, 1)
14/11/06 19:34:50 INFO dstream.ForEachDStream: Checkpoint interval = null
14/11/06 19:34:50 INFO dstream.ForEachDStream: Remember duration = 2000 ms
14/11/06 19:34:50 INFO dstream.ForEachDStream: Initialized and validated
org.apache.spark.streaming.dstream.ForEachDStream@5f631a06
14/11/06 19:34:50 INFO spark.SparkContext: Starting job: collect at
ReceiverTracker.scala:270
14/11/06 19:34:50 INFO scheduler.DAGScheduler: Registering RDD 2 (map at
ReceiverTracker.scala:270)
14/11/06 19:34:50 INFO scheduler.DAGScheduler: Got job 0 (collect at
ReceiverTracker.scala:270) with 20 output partitions (allowLocal=false)
14/11/06 19:34:50 INFO scheduler.DAGScheduler: Final stage: Stage 0(collect
at ReceiverTracker.scala:270)
14/11/06 19:34:50 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 1)
14/11/06 19:34:50 INFO util.RecurringTimer: Started timer for JobGenerator
at time 1415302492000
14/11/06 19:34:50 INFO scheduler.JobGenerator: Started JobGenerator at
1415302492000 ms
14/11/06 19:34:50 INFO scheduler.JobScheduler: Started JobScheduler
14/11/06 19:34:50 INFO scheduler.DAGScheduler: Missing parents: List(Stage
1)
14/11/06 19:34:50 INFO scheduler.ReceiverTracker: Sent stop signal to all 0
receivers
14/11/06 19:34:50 INFO scheduler.DAGScheduler: Submitting Stage 1
(MappedRDD[2] at map at ReceiverTracker.scala:270), which has no missing
parents
14/11/06 19:34:50 INFO storage.MemoryStore: ensureFreeSpace(2720) called
with curMem=0, maxMem=278302556
14/11/06 19:34:50 INFO storage.MemoryStore: Block broadcast_0 stored as
values in memory (estimated size 2.7 KB, free 265.4 MB)
14/11/06 19:34:50 INFO storage.BlockManagerMasterActor: Registering block
manager ip-172-31-24-44.us-west-2.compute.internal:34697 with 534.5 MB RAM
14/11/06 19:34:51 INFO storage.MemoryStore: ensureFreeSpace(1611) called
with curMem=2720, maxMem=278302556
14/11/06 19:34:51 INFO storage.MemoryStore: Block broadcast_0_piece0 stored
as bytes in memory (estimated size 1611.0 B, free 265.4 MB)
14/11/06 19:34:51 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1611.0 B,
free: 265.4 MB)
14/11/06 19:34:51 INFO storage.BlockManagerMaster: Updated info of block
broadcast_0_piece0
14/11/06 19:34:51 INFO scheduler.DAGScheduler: Submitting 50 missing tasks
from Stage 1 (MappedRDD[2] at map at ReceiverTracker.scala:270)
14/11/06 19:34:51 INFO cluster.YarnClientClusterScheduler: Adding task set
1.0 with 50 tasks
14/11/06 19:34:51 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
1.0 (TID 0, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:51 INFO scheduler.TaskSetManager: Starting task 1.0 in stage
1.0 (TID 1, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:52 INFO scheduler.ReceiverTracker: Stream 0 received 0 blocks
14/11/06 19:34:52 INFO scheduler.JobScheduler: Added jobs for time
1415302492000 ms
14/11/06 19:34:52 INFO scheduler.JobScheduler: Starting job streaming job
1415302492000 ms.0 from job set of time 1415302492000 ms
14/11/06 19:34:52 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:34:52 INFO scheduler.DAGScheduler: Registering RDD 8 (map at
MappedDStream.scala:35)
14/11/06 19:34:52 INFO scheduler.DAGScheduler: Got job 1 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:34:52 INFO scheduler.DAGScheduler: Final stage: Stage 2(take at
DStream.scala:608)
14/11/06 19:34:52 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 3)
14/11/06 19:34:52 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:52 INFO scheduler.DAGScheduler: Submitting Stage 2
(ShuffledRDD[9] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:34:52 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=4331, maxMem=278302556
14/11/06 19:34:52 INFO storage.MemoryStore: Block broadcast_1 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:52 INFO storage.MemoryStore: ensureFreeSpace(1450) called
with curMem=6611, maxMem=278302556
14/11/06 19:34:52 INFO storage.MemoryStore: Block broadcast_1_piece0 stored
as bytes in memory (estimated size 1450.0 B, free 265.4 MB)
14/11/06 19:34:52 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1450.0 B,
free: 265.4 MB)
14/11/06 19:34:52 INFO storage.BlockManagerMaster: Updated info of block
broadcast_1_piece0
14/11/06 19:34:52 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 2 (ShuffledRDD[9] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:34:52 INFO cluster.YarnClientClusterScheduler: Adding task set
2.0 with 1 tasks
14/11/06 19:34:52 INFO network.ConnectionManager: Accepted connection from
[ip-172-31-24-43.us-west-2.compute.internal/172.31.24.43:43656]
14/11/06 19:34:52 INFO network.SendingConnection: Initiating connection to
[ip-172-31-24-43.us-west-2.compute.internal/172.31.24.43:51054]
14/11/06 19:34:52 INFO network.SendingConnection: Connected to
[ip-172-31-24-43.us-west-2.compute.internal/172.31.24.43:51054], 1 messages
pending
14/11/06 19:34:52 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1611.0 B,
free: 534.5 MB)
14/11/06 19:34:53 INFO network.ConnectionManager: Accepted connection from
[ip-172-31-24-44.us-west-2.compute.internal/172.31.24.44:58465]
14/11/06 19:34:53 INFO network.SendingConnection: Initiating connection to
[ip-172-31-24-44.us-west-2.compute.internal/172.31.24.44:34697]
14/11/06 19:34:53 INFO network.SendingConnection: Connected to
[ip-172-31-24-44.us-west-2.compute.internal/172.31.24.44:34697], 1 messages
pending
14/11/06 19:34:53 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in
memory on ip-172-31-24-44.us-west-2.compute.internal:34697 (size: 1611.0 B,
free: 534.5 MB)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 2.0 in stage
1.0 (TID 2, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 1.0 in stage
1.0 (TID 1) in 1914 ms on ip-172-31-24-43.us-west-2.compute.internal (1/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 3.0 in stage
1.0 (TID 3, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 2.0 in stage
1.0 (TID 2) in 31 ms on ip-172-31-24-43.us-west-2.compute.internal (2/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 4.0 in stage
1.0 (TID 4, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 3.0 in stage
1.0 (TID 3) in 41 ms on ip-172-31-24-43.us-west-2.compute.internal (3/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 5.0 in stage
1.0 (TID 5, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 4.0 in stage
1.0 (TID 4) in 50 ms on ip-172-31-24-43.us-west-2.compute.internal (4/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 6.0 in stage
1.0 (TID 6, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 5.0 in stage
1.0 (TID 5) in 26 ms on ip-172-31-24-43.us-west-2.compute.internal (5/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 7.0 in stage
1.0 (TID 7, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 6.0 in stage
1.0 (TID 6) in 23 ms on ip-172-31-24-43.us-west-2.compute.internal (6/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 8.0 in stage
1.0 (TID 8, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 7.0 in stage
1.0 (TID 7) in 34 ms on ip-172-31-24-43.us-west-2.compute.internal (7/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 9.0 in stage
1.0 (TID 9, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 8.0 in stage
1.0 (TID 8) in 23 ms on ip-172-31-24-43.us-west-2.compute.internal (8/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 10.0 in stage
1.0 (TID 10, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 9.0 in stage
1.0 (TID 9) in 19 ms on ip-172-31-24-43.us-west-2.compute.internal (9/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 11.0 in stage
1.0 (TID 11, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 10.0 in stage
1.0 (TID 10) in 22 ms on ip-172-31-24-43.us-west-2.compute.internal (10/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 12.0 in stage
1.0 (TID 12, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 11.0 in stage
1.0 (TID 11) in 34 ms on ip-172-31-24-43.us-west-2.compute.internal (11/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 13.0 in stage
1.0 (TID 13, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 12.0 in stage
1.0 (TID 12) in 22 ms on ip-172-31-24-43.us-west-2.compute.internal (12/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 14.0 in stage
1.0 (TID 14, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 13.0 in stage
1.0 (TID 13) in 25 ms on ip-172-31-24-43.us-west-2.compute.internal (13/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 15.0 in stage
1.0 (TID 15, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
1.0 (TID 0) in 2258 ms on ip-172-31-24-44.us-west-2.compute.internal (14/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 16.0 in stage
1.0 (TID 16, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 14.0 in stage
1.0 (TID 14) in 20 ms on ip-172-31-24-43.us-west-2.compute.internal (15/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 17.0 in stage
1.0 (TID 17, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 15.0 in stage
1.0 (TID 15) in 30 ms on ip-172-31-24-44.us-west-2.compute.internal (16/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 18.0 in stage
1.0 (TID 18, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 16.0 in stage
1.0 (TID 16) in 29 ms on ip-172-31-24-43.us-west-2.compute.internal (17/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 19.0 in stage
1.0 (TID 19, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 17.0 in stage
1.0 (TID 17) in 36 ms on ip-172-31-24-44.us-west-2.compute.internal (18/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 20.0 in stage
1.0 (TID 20, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 18.0 in stage
1.0 (TID 18) in 23 ms on ip-172-31-24-43.us-west-2.compute.internal (19/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 21.0 in stage
1.0 (TID 21, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 20.0 in stage
1.0 (TID 20) in 22 ms on ip-172-31-24-43.us-west-2.compute.internal (20/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 22.0 in stage
1.0 (TID 22, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 19.0 in stage
1.0 (TID 19) in 40 ms on ip-172-31-24-44.us-west-2.compute.internal (21/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 23.0 in stage
1.0 (TID 23, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 21.0 in stage
1.0 (TID 21) in 22 ms on ip-172-31-24-43.us-west-2.compute.internal (22/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 24.0 in stage
1.0 (TID 24, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 23.0 in stage
1.0 (TID 23) in 21 ms on ip-172-31-24-43.us-west-2.compute.internal (23/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 25.0 in stage
1.0 (TID 25, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 22.0 in stage
1.0 (TID 22) in 31 ms on ip-172-31-24-44.us-west-2.compute.internal (24/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 26.0 in stage
1.0 (TID 26, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 24.0 in stage
1.0 (TID 24) in 25 ms on ip-172-31-24-43.us-west-2.compute.internal (25/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 27.0 in stage
1.0 (TID 27, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 25.0 in stage
1.0 (TID 25) in 25 ms on ip-172-31-24-44.us-west-2.compute.internal (26/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 28.0 in stage
1.0 (TID 28, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 26.0 in stage
1.0 (TID 26) in 22 ms on ip-172-31-24-43.us-west-2.compute.internal (27/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 29.0 in stage
1.0 (TID 29, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 27.0 in stage
1.0 (TID 27) in 31 ms on ip-172-31-24-44.us-west-2.compute.internal (28/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 30.0 in stage
1.0 (TID 30, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 28.0 in stage
1.0 (TID 28) in 20 ms on ip-172-31-24-43.us-west-2.compute.internal (29/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 31.0 in stage
1.0 (TID 31, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 29.0 in stage
1.0 (TID 29) in 21 ms on ip-172-31-24-44.us-west-2.compute.internal (30/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 32.0 in stage
1.0 (TID 32, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 30.0 in stage
1.0 (TID 30) in 21 ms on ip-172-31-24-43.us-west-2.compute.internal (31/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 33.0 in stage
1.0 (TID 33, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 31.0 in stage
1.0 (TID 31) in 21 ms on ip-172-31-24-44.us-west-2.compute.internal (32/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 34.0 in stage
1.0 (TID 34, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 32.0 in stage
1.0 (TID 32) in 21 ms on ip-172-31-24-43.us-west-2.compute.internal (33/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 35.0 in stage
1.0 (TID 35, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 33.0 in stage
1.0 (TID 33) in 25 ms on ip-172-31-24-44.us-west-2.compute.internal (34/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 36.0 in stage
1.0 (TID 36, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 34.0 in stage
1.0 (TID 34) in 40 ms on ip-172-31-24-43.us-west-2.compute.internal (35/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 37.0 in stage
1.0 (TID 37, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 35.0 in stage
1.0 (TID 35) in 29 ms on ip-172-31-24-44.us-west-2.compute.internal (36/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 38.0 in stage
1.0 (TID 38, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 37.0 in stage
1.0 (TID 37) in 25 ms on ip-172-31-24-44.us-west-2.compute.internal (37/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 39.0 in stage
1.0 (TID 39, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 36.0 in stage
1.0 (TID 36) in 37 ms on ip-172-31-24-43.us-west-2.compute.internal (38/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 40.0 in stage
1.0 (TID 40, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 38.0 in stage
1.0 (TID 38) in 25 ms on ip-172-31-24-44.us-west-2.compute.internal (39/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 41.0 in stage
1.0 (TID 41, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 39.0 in stage
1.0 (TID 39) in 21 ms on ip-172-31-24-43.us-west-2.compute.internal (40/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 42.0 in stage
1.0 (TID 42, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 41.0 in stage
1.0 (TID 41) in 20 ms on ip-172-31-24-43.us-west-2.compute.internal (41/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 43.0 in stage
1.0 (TID 43, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 40.0 in stage
1.0 (TID 40) in 26 ms on ip-172-31-24-44.us-west-2.compute.internal (42/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 44.0 in stage
1.0 (TID 44, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 42.0 in stage
1.0 (TID 42) in 20 ms on ip-172-31-24-43.us-west-2.compute.internal (43/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 45.0 in stage
1.0 (TID 45, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 43.0 in stage
1.0 (TID 43) in 28 ms on ip-172-31-24-44.us-west-2.compute.internal (44/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 46.0 in stage
1.0 (TID 46, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 44.0 in stage
1.0 (TID 44) in 19 ms on ip-172-31-24-43.us-west-2.compute.internal (45/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 47.0 in stage
1.0 (TID 47, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 46.0 in stage
1.0 (TID 46) in 24 ms on ip-172-31-24-43.us-west-2.compute.internal (46/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 45.0 in stage
1.0 (TID 45) in 35 ms on ip-172-31-24-44.us-west-2.compute.internal (47/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 48.0 in stage
1.0 (TID 48, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 49.0 in stage
1.0 (TID 49, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1227
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 48.0 in stage
1.0 (TID 48) in 21 ms on ip-172-31-24-44.us-west-2.compute.internal (48/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
2.0 (TID 50, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 47.0 in stage
1.0 (TID 47) in 32 ms on ip-172-31-24-43.us-west-2.compute.internal (49/50)
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Finished task 49.0 in stage
1.0 (TID 49) in 21 ms on ip-172-31-24-44.us-west-2.compute.internal (50/50)
14/11/06 19:34:53 INFO scheduler.DAGScheduler: Stage 1 (map at
ReceiverTracker.scala:270) finished in 2.714 s
14/11/06 19:34:53 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
1.0, whose tasks have all completed, from pool
14/11/06 19:34:53 INFO scheduler.DAGScheduler: looking for newly runnable
stages
14/11/06 19:34:53 INFO scheduler.DAGScheduler: running: Set(Stage 2)
14/11/06 19:34:53 INFO scheduler.DAGScheduler: waiting: Set(Stage 0)
14/11/06 19:34:53 INFO scheduler.DAGScheduler: failed: Set()
14/11/06 19:34:53 INFO scheduler.DAGScheduler: Missing parents for Stage 0:
List()
14/11/06 19:34:53 INFO scheduler.DAGScheduler: Submitting Stage 0
(ShuffledRDD[3] at reduceByKey at ReceiverTracker.scala:270), which is now
runnable
14/11/06 19:34:53 INFO storage.MemoryStore: ensureFreeSpace(2232) called
with curMem=8061, maxMem=278302556
14/11/06 19:34:53 INFO storage.MemoryStore: Block broadcast_2 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:53 INFO storage.MemoryStore: ensureFreeSpace(1384) called
with curMem=10293, maxMem=278302556
14/11/06 19:34:53 INFO storage.MemoryStore: Block broadcast_2_piece0 stored
as bytes in memory (estimated size 1384.0 B, free 265.4 MB)
14/11/06 19:34:53 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1384.0 B,
free: 265.4 MB)
14/11/06 19:34:53 INFO storage.BlockManagerMaster: Updated info of block
broadcast_2_piece0
14/11/06 19:34:53 INFO scheduler.DAGScheduler: Submitting 20 missing tasks
from Stage 0 (ShuffledRDD[3] at reduceByKey at ReceiverTracker.scala:270)
14/11/06 19:34:53 INFO cluster.YarnClientClusterScheduler: Adding task set
0.0 with 20 tasks
14/11/06 19:34:53 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
0.0 (TID 51, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:53 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1450.0 B,
free: 534.5 MB)
14/11/06 19:34:53 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in
memory on ip-172-31-24-44.us-west-2.compute.internal:34697 (size: 1384.0 B,
free: 534.5 MB)
14/11/06 19:34:53 INFO spark.MapOutputTrackerMasterActor: Asked to send map
output locations for shuffle 0 to
sparkExecutor@ip-172-31-24-44.us-west-2.compute.internal:33000
14/11/06 19:34:54 INFO scheduler.ReceiverTracker: Stream 0 received 0 blocks
14/11/06 19:34:54 INFO spark.MapOutputTrackerMaster: Size of output statuses
for shuffle 0 is 349 bytes
14/11/06 19:34:54 INFO scheduler.JobScheduler: Added jobs for time
1415302494000 ms
14/11/06 19:34:54 INFO spark.MapOutputTrackerMasterActor: Asked to send map
output locations for shuffle 1 to
sparkExecutor@ip-172-31-24-43.us-west-2.compute.internal:38489
14/11/06 19:34:54 INFO spark.MapOutputTrackerMaster: Size of output statuses
for shuffle 1 is 82 bytes
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 1.0 in stage
0.0 (TID 52, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
2.0 (TID 50) in 185 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Stage 2 (take at
DStream.scala:608) finished in 2.024 s
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
2.0, whose tasks have all completed, from pool
14/11/06 19:34:54 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 2.042555819 s
14/11/06 19:34:54 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Got job 2 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Final stage: Stage 4(take at
DStream.scala:608)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 5)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Submitting Stage 4
(ShuffledRDD[9] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:34:54 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=11677, maxMem=278302556
14/11/06 19:34:54 INFO storage.MemoryStore: Block broadcast_3 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:54 INFO storage.MemoryStore: ensureFreeSpace(1450) called
with curMem=13957, maxMem=278302556
14/11/06 19:34:54 INFO storage.MemoryStore: Block broadcast_3_piece0 stored
as bytes in memory (estimated size 1450.0 B, free 265.4 MB)
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1450.0 B,
free: 265.4 MB)
14/11/06 19:34:54 INFO storage.BlockManagerMaster: Updated info of block
broadcast_3_piece0
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 4 (ShuffledRDD[9] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Adding task set
4.0 with 1 tasks
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1384.0 B,
free: 534.5 MB)
14/11/06 19:34:54 INFO spark.MapOutputTrackerMasterActor: Asked to send map
output locations for shuffle 0 to
sparkExecutor@ip-172-31-24-43.us-west-2.compute.internal:38489
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 2.0 in stage
0.0 (TID 53, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
0.0 (TID 51) in 414 ms on ip-172-31-24-44.us-west-2.compute.internal (1/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 3.0 in stage
0.0 (TID 54, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 1.0 in stage
0.0 (TID 52) in 286 ms on ip-172-31-24-43.us-west-2.compute.internal (2/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 4.0 in stage
0.0 (TID 55, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 2.0 in stage
0.0 (TID 53) in 53 ms on ip-172-31-24-44.us-west-2.compute.internal (3/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 5.0 in stage
0.0 (TID 56, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 3.0 in stage
0.0 (TID 54) in 37 ms on ip-172-31-24-43.us-west-2.compute.internal (4/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 6.0 in stage
0.0 (TID 57, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 4.0 in stage
0.0 (TID 55) in 45 ms on ip-172-31-24-44.us-west-2.compute.internal (5/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 7.0 in stage
0.0 (TID 58, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 5.0 in stage
0.0 (TID 56) in 45 ms on ip-172-31-24-43.us-west-2.compute.internal (6/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 8.0 in stage
0.0 (TID 59, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 7.0 in stage
0.0 (TID 58) in 37 ms on ip-172-31-24-43.us-west-2.compute.internal (7/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 9.0 in stage
0.0 (TID 60, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 6.0 in stage
0.0 (TID 57) in 47 ms on ip-172-31-24-44.us-west-2.compute.internal (8/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 10.0 in stage
0.0 (TID 61, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 8.0 in stage
0.0 (TID 59) in 32 ms on ip-172-31-24-43.us-west-2.compute.internal (9/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 11.0 in stage
0.0 (TID 62, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 9.0 in stage
0.0 (TID 60) in 39 ms on ip-172-31-24-44.us-west-2.compute.internal (10/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 12.0 in stage
0.0 (TID 63, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 10.0 in stage
0.0 (TID 61) in 41 ms on ip-172-31-24-43.us-west-2.compute.internal (11/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 13.0 in stage
0.0 (TID 64, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 11.0 in stage
0.0 (TID 62) in 56 ms on ip-172-31-24-44.us-west-2.compute.internal (12/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 14.0 in stage
0.0 (TID 65, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 12.0 in stage
0.0 (TID 63) in 33 ms on ip-172-31-24-43.us-west-2.compute.internal (13/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 15.0 in stage
0.0 (TID 66, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 13.0 in stage
0.0 (TID 64) in 37 ms on ip-172-31-24-44.us-west-2.compute.internal (14/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 16.0 in stage
0.0 (TID 67, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 14.0 in stage
0.0 (TID 65) in 37 ms on ip-172-31-24-43.us-west-2.compute.internal (15/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 17.0 in stage
0.0 (TID 68, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 15.0 in stage
0.0 (TID 66) in 36 ms on ip-172-31-24-44.us-west-2.compute.internal (16/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 18.0 in stage
0.0 (TID 69, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 16.0 in stage
0.0 (TID 67) in 40 ms on ip-172-31-24-43.us-west-2.compute.internal (17/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 19.0 in stage
0.0 (TID 70, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 17.0 in stage
0.0 (TID 68) in 47 ms on ip-172-31-24-44.us-west-2.compute.internal (18/20)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
4.0 (TID 71, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 18.0 in stage
0.0 (TID 69) in 45 ms on ip-172-31-24-43.us-west-2.compute.internal (19/20)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Stage 0 (collect at
ReceiverTracker.scala:270) finished in 0.794 s
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 19.0 in stage
0.0 (TID 70) in 28 ms on ip-172-31-24-44.us-west-2.compute.internal (20/20)
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
0.0, whose tasks have all completed, from pool
14/11/06 19:34:54 INFO spark.SparkContext: Job finished: collect at
ReceiverTracker.scala:270, took 3.92090434 s
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1450.0 B,
free: 534.5 MB)
14/11/06 19:34:54 INFO scheduler.ReceiverTracker: Starting 1 receivers
14/11/06 19:34:54 INFO spark.SparkContext: Starting job: runJob at
ReceiverTracker.scala:275
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Got job 3 (runJob at
ReceiverTracker.scala:275) with 1 output partitions (allowLocal=false)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Final stage: Stage 6(runJob
at ReceiverTracker.scala:275)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Parents of final stage:
List()
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Submitting Stage 6
(ParallelCollectionRDD[0] at makeRDD at ReceiverTracker.scala:253), which
has no missing parents
14/11/06 19:34:54 INFO storage.MemoryStore: ensureFreeSpace(1216) called
with curMem=15407, maxMem=278302556
14/11/06 19:34:54 INFO storage.MemoryStore: Block broadcast_4 stored as
values in memory (estimated size 1216.0 B, free 265.4 MB)
14/11/06 19:34:54 INFO storage.MemoryStore: ensureFreeSpace(842) called with
curMem=16623, maxMem=278302556
14/11/06 19:34:54 INFO spark.MapOutputTrackerMasterActor: Asked to send map
output locations for shuffle 1 to
sparkExecutor@ip-172-31-24-43.us-west-2.compute.internal:38489
14/11/06 19:34:54 INFO storage.MemoryStore: Block broadcast_4_piece0 stored
as bytes in memory (estimated size 842.0 B, free 265.4 MB)
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 842.0 B,
free: 265.4 MB)
14/11/06 19:34:54 INFO storage.BlockManagerMaster: Updated info of block
broadcast_4_piece0
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 6 (ParallelCollectionRDD[0] at makeRDD at
ReceiverTracker.scala:253)
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Adding task set
6.0 with 1 tasks
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
6.0 (TID 72, ip-172-31-24-44.us-west-2.compute.internal, PROCESS_LOCAL, 2539
bytes)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Stage 4 (take at
DStream.scala:608) finished in 0.644 s
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
4.0 (TID 71) in 40 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
4.0, whose tasks have all completed, from pool
14/11/06 19:34:54 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.654405309 s
-------------------------------------------
Time: 1415302492000 ms
-------------------------------------------

14/11/06 19:34:54 INFO scheduler.JobScheduler: Finished job streaming job
1415302492000 ms.0 from job set of time 1415302492000 ms
14/11/06 19:34:54 INFO scheduler.JobScheduler: Total delay: 2.765 s for time
1415302492000 ms (execution: 2.705 s)
14/11/06 19:34:54 INFO scheduler.JobScheduler: Starting job streaming job
1415302494000 ms.0 from job set of time 1415302494000 ms
14/11/06 19:34:54 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Registering RDD 14 (map at
MappedDStream.scala:35)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Got job 4 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Final stage: Stage 7(take at
DStream.scala:608)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 8)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Submitting Stage 7
(ShuffledRDD[15] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:34:54 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=17465, maxMem=278302556
14/11/06 19:34:54 INFO storage.MemoryStore: Block broadcast_5 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:54 INFO storage.MemoryStore: ensureFreeSpace(1448) called
with curMem=19745, maxMem=278302556
14/11/06 19:34:54 INFO storage.MemoryStore: Block broadcast_5_piece0 stored
as bytes in memory (estimated size 1448.0 B, free 265.4 MB)
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1448.0 B,
free: 265.4 MB)
14/11/06 19:34:54 INFO storage.BlockManagerMaster: Updated info of block
broadcast_5_piece0
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 7 (ShuffledRDD[15] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Adding task set
7.0 with 1 tasks
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
7.0 (TID 73, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1448.0 B,
free: 534.5 MB)
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in
memory on ip-172-31-24-44.us-west-2.compute.internal:34697 (size: 842.0 B,
free: 534.5 MB)
14/11/06 19:34:54 INFO spark.MapOutputTrackerMasterActor: Asked to send map
output locations for shuffle 2 to
sparkExecutor@ip-172-31-24-43.us-west-2.compute.internal:38489
14/11/06 19:34:54 INFO spark.MapOutputTrackerMaster: Size of output statuses
for shuffle 2 is 82 bytes
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
7.0 (TID 73) in 42 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Stage 7 (take at
DStream.scala:608) finished in 0.043 s
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
7.0, whose tasks have all completed, from pool
14/11/06 19:34:54 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.04950038 s
14/11/06 19:34:54 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Got job 5 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Final stage: Stage 9(take at
DStream.scala:608)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 10)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Submitting Stage 9
(ShuffledRDD[15] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:34:54 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=21193, maxMem=278302556
14/11/06 19:34:54 INFO storage.MemoryStore: Block broadcast_6 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:54 INFO storage.MemoryStore: ensureFreeSpace(1448) called
with curMem=23473, maxMem=278302556
14/11/06 19:34:54 INFO storage.MemoryStore: Block broadcast_6_piece0 stored
as bytes in memory (estimated size 1448.0 B, free 265.4 MB)
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1448.0 B,
free: 265.4 MB)
14/11/06 19:34:54 INFO storage.BlockManagerMaster: Updated info of block
broadcast_6_piece0
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 9 (ShuffledRDD[15] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Adding task set
9.0 with 1 tasks
14/11/06 19:34:54 INFO scheduler.ReceiverTracker: Registered receiver for
stream 0 from
akka.tcp://sparkExecutor@ip-172-31-24-44.us-west-2.compute.internal:33000
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
9.0 (TID 74, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL, 1034
bytes)
14/11/06 19:34:54 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1448.0 B,
free: 534.5 MB)
14/11/06 19:34:54 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
9.0 (TID 74) in 41 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:34:54 INFO scheduler.DAGScheduler: Stage 9 (take at
DStream.scala:608) finished in 0.042 s
14/11/06 19:34:54 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
9.0, whose tasks have all completed, from pool
14/11/06 19:34:54 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.049448566 s
-------------------------------------------
Time: 1415302494000 ms
-------------------------------------------

14/11/06 19:34:54 INFO scheduler.JobScheduler: Finished job streaming job
1415302494000 ms.0 from job set of time 1415302494000 ms
14/11/06 19:34:54 INFO scheduler.JobScheduler: Total delay: 0.872 s for time
1415302494000 ms (execution: 0.104 s)
14/11/06 19:34:54 INFO rdd.ShuffledRDD: Removing RDD 9 from persistence list
14/11/06 19:34:54 INFO storage.BlockManager: Removing RDD 9
14/11/06 19:34:54 INFO rdd.MappedRDD: Removing RDD 8 from persistence list
14/11/06 19:34:54 INFO storage.BlockManager: Removing RDD 8
14/11/06 19:34:54 INFO rdd.MappedRDD: Removing RDD 7 from persistence list
14/11/06 19:34:54 INFO storage.BlockManager: Removing RDD 7
14/11/06 19:34:54 INFO rdd.FilteredRDD: Removing RDD 6 from persistence list
14/11/06 19:34:54 INFO storage.BlockManager: Removing RDD 6
14/11/06 19:34:54 INFO rdd.MappedRDD: Removing RDD 5 from persistence list
14/11/06 19:34:54 INFO storage.BlockManager: Removing RDD 5
14/11/06 19:34:54 INFO rdd.BlockRDD: Removing RDD 4 from persistence list
14/11/06 19:34:54 INFO storage.BlockManager: Removing RDD 4
14/11/06 19:34:54 INFO dstream.PluggableInputDStream: Removing blocks of RDD
BlockRDD[4] at BlockRDD at ReceiverInputDStream.scala:69 of time
1415302494000 ms
14/11/06 19:34:56 INFO scheduler.ReceiverTracker: Stream 0 received 0 blocks
14/11/06 19:34:56 INFO scheduler.JobScheduler: Added jobs for time
1415302496000 ms
14/11/06 19:34:56 INFO scheduler.JobScheduler: Starting job streaming job
1415302496000 ms.0 from job set of time 1415302496000 ms
14/11/06 19:34:56 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Registering RDD 20 (map at
MappedDStream.scala:35)
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Got job 6 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Final stage: Stage 11(take at
DStream.scala:608)
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 12)
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Submitting Stage 11
(ShuffledRDD[21] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:34:56 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=24921, maxMem=278302556
14/11/06 19:34:56 INFO storage.MemoryStore: Block broadcast_7 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:56 INFO storage.MemoryStore: ensureFreeSpace(1450) called
with curMem=27201, maxMem=278302556
14/11/06 19:34:56 INFO storage.MemoryStore: Block broadcast_7_piece0 stored
as bytes in memory (estimated size 1450.0 B, free 265.4 MB)
14/11/06 19:34:56 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1450.0 B,
free: 265.4 MB)
14/11/06 19:34:56 INFO storage.BlockManagerMaster: Updated info of block
broadcast_7_piece0
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 11 (ShuffledRDD[21] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:34:56 INFO cluster.YarnClientClusterScheduler: Adding task set
11.0 with 1 tasks
14/11/06 19:34:56 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
11.0 (TID 75, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL,
1034 bytes)
14/11/06 19:34:56 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1450.0 B,
free: 534.5 MB)
14/11/06 19:34:56 INFO spark.MapOutputTrackerMasterActor: Asked to send map
output locations for shuffle 3 to
sparkExecutor@ip-172-31-24-43.us-west-2.compute.internal:38489
14/11/06 19:34:56 INFO spark.MapOutputTrackerMaster: Size of output statuses
for shuffle 3 is 82 bytes
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Stage 11 (take at
DStream.scala:608) finished in 0.041 s
14/11/06 19:34:56 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
11.0 (TID 75) in 41 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:34:56 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
11.0, whose tasks have all completed, from pool
14/11/06 19:34:56 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.048363634 s
14/11/06 19:34:56 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Got job 7 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Final stage: Stage 13(take at
DStream.scala:608)
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 14)
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Submitting Stage 13
(ShuffledRDD[21] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:34:56 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=28651, maxMem=278302556
14/11/06 19:34:56 INFO storage.MemoryStore: Block broadcast_8 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:56 INFO storage.MemoryStore: ensureFreeSpace(1450) called
with curMem=30931, maxMem=278302556
14/11/06 19:34:56 INFO storage.MemoryStore: Block broadcast_8_piece0 stored
as bytes in memory (estimated size 1450.0 B, free 265.4 MB)
14/11/06 19:34:56 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1450.0 B,
free: 265.4 MB)
14/11/06 19:34:56 INFO storage.BlockManagerMaster: Updated info of block
broadcast_8_piece0
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 13 (ShuffledRDD[21] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:34:56 INFO cluster.YarnClientClusterScheduler: Adding task set
13.0 with 1 tasks
14/11/06 19:34:56 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
13.0 (TID 76, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL,
1034 bytes)
14/11/06 19:34:56 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1450.0 B,
free: 534.5 MB)
14/11/06 19:34:56 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
13.0 (TID 76) in 39 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:34:56 INFO scheduler.DAGScheduler: Stage 13 (take at
DStream.scala:608) finished in 0.039 s
14/11/06 19:34:56 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
13.0, whose tasks have all completed, from pool
14/11/06 19:34:56 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.045834556 s
-------------------------------------------
Time: 1415302496000 ms
-------------------------------------------

14/11/06 19:34:56 INFO scheduler.JobScheduler: Finished job streaming job
1415302496000 ms.0 from job set of time 1415302496000 ms
14/11/06 19:34:56 INFO scheduler.JobScheduler: Total delay: 0.108 s for time
1415302496000 ms (execution: 0.099 s)
14/11/06 19:34:56 INFO rdd.ShuffledRDD: Removing RDD 15 from persistence
list
14/11/06 19:34:56 INFO storage.BlockManager: Removing RDD 15
14/11/06 19:34:56 INFO rdd.MappedRDD: Removing RDD 14 from persistence list
14/11/06 19:34:56 INFO storage.BlockManager: Removing RDD 14
14/11/06 19:34:56 INFO rdd.MappedRDD: Removing RDD 13 from persistence list
14/11/06 19:34:56 INFO storage.BlockManager: Removing RDD 13
14/11/06 19:34:56 INFO rdd.FilteredRDD: Removing RDD 12 from persistence
list
14/11/06 19:34:56 INFO storage.BlockManager: Removing RDD 12
14/11/06 19:34:56 INFO rdd.MappedRDD: Removing RDD 11 from persistence list
14/11/06 19:34:56 INFO storage.BlockManager: Removing RDD 11
14/11/06 19:34:56 INFO rdd.BlockRDD: Removing RDD 10 from persistence list
14/11/06 19:34:56 INFO storage.BlockManager: Removing RDD 10
14/11/06 19:34:56 INFO dstream.PluggableInputDStream: Removing blocks of RDD
BlockRDD[10] at BlockRDD at ReceiverInputDStream.scala:69 of time
1415302496000 ms
14/11/06 19:34:58 INFO scheduler.ReceiverTracker: Stream 0 received 0 blocks
14/11/06 19:34:58 INFO scheduler.JobScheduler: Added jobs for time
1415302498000 ms
14/11/06 19:34:58 INFO scheduler.JobScheduler: Starting job streaming job
1415302498000 ms.0 from job set of time 1415302498000 ms
14/11/06 19:34:58 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Registering RDD 26 (map at
MappedDStream.scala:35)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Got job 8 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Final stage: Stage 15(take at
DStream.scala:608)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 16)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Submitting Stage 15
(ShuffledRDD[27] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:34:58 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=32381, maxMem=278302556
14/11/06 19:34:58 INFO storage.MemoryStore: Block broadcast_9 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:58 INFO storage.MemoryStore: ensureFreeSpace(1452) called
with curMem=34661, maxMem=278302556
14/11/06 19:34:58 INFO storage.MemoryStore: Block broadcast_9_piece0 stored
as bytes in memory (estimated size 1452.0 B, free 265.4 MB)
14/11/06 19:34:58 INFO storage.BlockManagerInfo: Added broadcast_9_piece0 in
memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1452.0 B,
free: 265.4 MB)
14/11/06 19:34:58 INFO storage.BlockManagerMaster: Updated info of block
broadcast_9_piece0
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 15 (ShuffledRDD[27] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:34:58 INFO cluster.YarnClientClusterScheduler: Adding task set
15.0 with 1 tasks
14/11/06 19:34:58 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
15.0 (TID 77, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL,
1034 bytes)
14/11/06 19:34:58 INFO storage.BlockManagerInfo: Added broadcast_9_piece0 in
memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1452.0 B,
free: 534.5 MB)
14/11/06 19:34:58 INFO spark.MapOutputTrackerMasterActor: Asked to send map
output locations for shuffle 4 to
sparkExecutor@ip-172-31-24-43.us-west-2.compute.internal:38489
14/11/06 19:34:58 INFO spark.MapOutputTrackerMaster: Size of output statuses
for shuffle 4 is 82 bytes
14/11/06 19:34:58 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
15.0 (TID 77) in 37 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Stage 15 (take at
DStream.scala:608) finished in 0.038 s
14/11/06 19:34:58 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
15.0, whose tasks have all completed, from pool
14/11/06 19:34:58 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.044597389 s
14/11/06 19:34:58 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Got job 9 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Final stage: Stage 17(take at
DStream.scala:608)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 18)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Submitting Stage 17
(ShuffledRDD[27] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:34:58 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=36113, maxMem=278302556
14/11/06 19:34:58 INFO storage.MemoryStore: Block broadcast_10 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:34:58 INFO storage.MemoryStore: ensureFreeSpace(1452) called
with curMem=38393, maxMem=278302556
14/11/06 19:34:58 INFO storage.MemoryStore: Block broadcast_10_piece0 stored
as bytes in memory (estimated size 1452.0 B, free 265.4 MB)
14/11/06 19:34:58 INFO storage.BlockManagerInfo: Added broadcast_10_piece0
in memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1452.0
B, free: 265.4 MB)
14/11/06 19:34:58 INFO storage.BlockManagerMaster: Updated info of block
broadcast_10_piece0
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 17 (ShuffledRDD[27] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:34:58 INFO cluster.YarnClientClusterScheduler: Adding task set
17.0 with 1 tasks
14/11/06 19:34:58 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
17.0 (TID 78, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL,
1034 bytes)
14/11/06 19:34:58 INFO storage.BlockManagerInfo: Added broadcast_10_piece0
in memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1452.0
B, free: 534.5 MB)
14/11/06 19:34:58 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
17.0 (TID 78) in 30 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:34:58 INFO scheduler.DAGScheduler: Stage 17 (take at
DStream.scala:608) finished in 0.031 s
14/11/06 19:34:58 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
17.0, whose tasks have all completed, from pool
14/11/06 19:34:58 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.037478645 s
-------------------------------------------
Time: 1415302498000 ms
-------------------------------------------

14/11/06 19:34:58 INFO scheduler.JobScheduler: Finished job streaming job
1415302498000 ms.0 from job set of time 1415302498000 ms
14/11/06 19:34:58 INFO rdd.ShuffledRDD: Removing RDD 21 from persistence
list
14/11/06 19:34:58 INFO scheduler.JobScheduler: Total delay: 0.095 s for time
1415302498000 ms (execution: 0.086 s)
14/11/06 19:34:58 INFO storage.BlockManager: Removing RDD 21
14/11/06 19:34:58 INFO rdd.MappedRDD: Removing RDD 20 from persistence list
14/11/06 19:34:58 INFO storage.BlockManager: Removing RDD 20
14/11/06 19:34:58 INFO rdd.MappedRDD: Removing RDD 19 from persistence list
14/11/06 19:34:58 INFO storage.BlockManager: Removing RDD 19
14/11/06 19:34:58 INFO rdd.FilteredRDD: Removing RDD 18 from persistence
list
14/11/06 19:34:58 INFO rdd.MappedRDD: Removing RDD 17 from persistence list
14/11/06 19:34:58 INFO storage.BlockManager: Removing RDD 18
14/11/06 19:34:58 INFO rdd.BlockRDD: Removing RDD 16 from persistence list
14/11/06 19:34:58 INFO storage.BlockManager: Removing RDD 17
14/11/06 19:34:58 INFO storage.BlockManager: Removing RDD 16
14/11/06 19:34:58 INFO dstream.PluggableInputDStream: Removing blocks of RDD
BlockRDD[16] at BlockRDD at ReceiverInputDStream.scala:69 of time
1415302498000 ms
14/11/06 19:35:00 INFO scheduler.ReceiverTracker: Stream 0 received 0 blocks
14/11/06 19:35:00 INFO scheduler.JobScheduler: Added jobs for time
1415302500000 ms
14/11/06 19:35:00 INFO scheduler.JobScheduler: Starting job streaming job
1415302500000 ms.0 from job set of time 1415302500000 ms
14/11/06 19:35:00 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Registering RDD 32 (map at
MappedDStream.scala:35)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Got job 10 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Final stage: Stage 19(take at
DStream.scala:608)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 20)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Submitting Stage 19
(ShuffledRDD[33] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:35:00 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=39845, maxMem=278302556
14/11/06 19:35:00 INFO storage.MemoryStore: Block broadcast_11 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:35:00 INFO storage.MemoryStore: ensureFreeSpace(1451) called
with curMem=42125, maxMem=278302556
14/11/06 19:35:00 INFO storage.MemoryStore: Block broadcast_11_piece0 stored
as bytes in memory (estimated size 1451.0 B, free 265.4 MB)
14/11/06 19:35:00 INFO storage.BlockManagerInfo: Added broadcast_11_piece0
in memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1451.0
B, free: 265.4 MB)
14/11/06 19:35:00 INFO storage.BlockManagerMaster: Updated info of block
broadcast_11_piece0
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 19 (ShuffledRDD[33] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:35:00 INFO cluster.YarnClientClusterScheduler: Adding task set
19.0 with 1 tasks
14/11/06 19:35:00 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
19.0 (TID 79, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL,
1034 bytes)
14/11/06 19:35:00 INFO storage.BlockManagerInfo: Added broadcast_11_piece0
in memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1451.0
B, free: 534.5 MB)
14/11/06 19:35:00 INFO spark.MapOutputTrackerMasterActor: Asked to send map
output locations for shuffle 5 to
sparkExecutor@ip-172-31-24-43.us-west-2.compute.internal:38489
14/11/06 19:35:00 INFO spark.MapOutputTrackerMaster: Size of output statuses
for shuffle 5 is 82 bytes
14/11/06 19:35:00 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
19.0 (TID 79) in 40 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Stage 19 (take at
DStream.scala:608) finished in 0.040 s
14/11/06 19:35:00 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
19.0, whose tasks have all completed, from pool
14/11/06 19:35:00 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.04872445 s
14/11/06 19:35:00 INFO spark.SparkContext: Starting job: take at
DStream.scala:608
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Got job 11 (take at
DStream.scala:608) with 1 output partitions (allowLocal=true)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Final stage: Stage 21(take at
DStream.scala:608)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Parents of final stage:
List(Stage 22)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Missing parents: List()
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Submitting Stage 21
(ShuffledRDD[33] at combineByKey at ShuffledDStream.scala:42), which has no
missing parents
14/11/06 19:35:00 INFO storage.MemoryStore: ensureFreeSpace(2280) called
with curMem=43576, maxMem=278302556
14/11/06 19:35:00 INFO storage.MemoryStore: Block broadcast_12 stored as
values in memory (estimated size 2.2 KB, free 265.4 MB)
14/11/06 19:35:00 INFO storage.MemoryStore: ensureFreeSpace(1451) called
with curMem=45856, maxMem=278302556
14/11/06 19:35:00 INFO storage.MemoryStore: Block broadcast_12_piece0 stored
as bytes in memory (estimated size 1451.0 B, free 265.4 MB)
14/11/06 19:35:00 INFO storage.BlockManagerInfo: Added broadcast_12_piece0
in memory on ip-172-31-19-47.us-west-2.compute.internal:33897 (size: 1451.0
B, free: 265.4 MB)
14/11/06 19:35:00 INFO storage.BlockManagerMaster: Updated info of block
broadcast_12_piece0
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
from Stage 21 (ShuffledRDD[33] at combineByKey at ShuffledDStream.scala:42)
14/11/06 19:35:00 INFO cluster.YarnClientClusterScheduler: Adding task set
21.0 with 1 tasks
14/11/06 19:35:00 INFO scheduler.TaskSetManager: Starting task 0.0 in stage
21.0 (TID 80, ip-172-31-24-43.us-west-2.compute.internal, PROCESS_LOCAL,
1034 bytes)
14/11/06 19:35:00 INFO storage.BlockManagerInfo: Added broadcast_12_piece0
in memory on ip-172-31-24-43.us-west-2.compute.internal:51054 (size: 1451.0
B, free: 534.5 MB)
14/11/06 19:35:00 INFO scheduler.TaskSetManager: Finished task 0.0 in stage
21.0 (TID 80) in 38 ms on ip-172-31-24-43.us-west-2.compute.internal (1/1)
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Stage 21 (take at
DStream.scala:608) finished in 0.038 s
14/11/06 19:35:00 INFO cluster.YarnClientClusterScheduler: Removed TaskSet
21.0, whose tasks have all completed, from pool
14/11/06 19:35:00 INFO spark.SparkContext: Job finished: take at
DStream.scala:608, took 0.045527151 s
-------------------------------------------
Time: 1415302500000 ms
-------------------------------------------

14/11/06 19:35:00 INFO scheduler.JobScheduler: Finished job streaming job
1415302500000 ms.0 from job set of time 1415302500000 ms
14/11/06 19:35:00 INFO rdd.ShuffledRDD: Removing RDD 27 from persistence
list
14/11/06 19:35:00 INFO scheduler.JobScheduler: Total delay: 0.108 s for time
1415302500000 ms (execution: 0.099 s)
14/11/06 19:35:00 INFO storage.BlockManager: Removing RDD 27
14/11/06 19:35:00 INFO rdd.MappedRDD: Removing RDD 26 from persistence list
14/11/06 19:35:00 INFO storage.BlockManager: Removing RDD 26
14/11/06 19:35:00 INFO rdd.MappedRDD: Removing RDD 25 from persistence list
14/11/06 19:35:00 INFO storage.BlockManager: Removing RDD 25
14/11/06 19:35:00 INFO rdd.FilteredRDD: Removing RDD 24 from persistence
list
14/11/06 19:35:00 INFO storage.BlockManager: Removing RDD 24
14/11/06 19:35:00 INFO rdd.MappedRDD: Removing RDD 23 from persistence list
14/11/06 19:35:00 INFO storage.BlockManager: Removing RDD 23
14/11/06 19:35:00 INFO rdd.BlockRDD: Removing RDD 22 from persistence list
14/11/06 19:35:00 INFO storage.BlockManager: Removing RDD 22
14/11/06 19:35:00 INFO dstream.PluggableInputDStream: Removing blocks of RDD
BlockRDD[22] at BlockRDD at ReceiverInputDStream.scala:69 of time
1415302500000 ms
14/11/06 19:35:00 WARN scheduler.ReceiverTracker: All of the receivers have
not deregistered, Map(0 ->
ReceiverInfo(0,KinesisReceiver-0,Actor[akka.tcp://sparkExecutor@ip-172-31-24-44.us-west-2.compute.internal:33000/user/Receiver-0-1415302494806#-1634354492],true,ip-172-31-24-44.us-west-2.compute.internal,,))
14/11/06 19:35:00 INFO scheduler.ReceiverTracker: ReceiverTracker stopped
14/11/06 19:35:00 INFO scheduler.JobGenerator: Stopping JobGenerator
immediately
14/11/06 19:35:00 INFO util.RecurringTimer: Stopped timer for JobGenerator
after time 1415302500000
14/11/06 19:35:00 INFO scheduler.JobGenerator: Stopped JobGenerator
14/11/06 19:35:00 INFO scheduler.JobScheduler: Stopped JobScheduler
14/11/06 19:35:00 INFO streaming.StreamingContext: StreamingContext stopped
successfully
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/streaming/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/streaming,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/metrics/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/static,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/executors/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/executors,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/environment/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/environment,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/storage/rdd,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/storage/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/storage,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/pool/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/pool,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/stage/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/stage,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages/json,null}
14/11/06 19:35:00 INFO handler.ContextHandler: stopped
o.e.j.s.ServletContextHandler{/stages,null}
14/11/06 19:35:00 INFO ui.SparkUI: Stopped Spark web UI at
http://ip-172-31-19-47.us-west-2.compute.internal:4040
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Stopping DAGScheduler
14/11/06 19:35:00 INFO cluster.YarnClientSchedulerBackend: Shutting down all
executors
14/11/06 19:35:00 INFO cluster.YarnClientSchedulerBackend: Asking each
executor to shut down
14/11/06 19:35:00 INFO scheduler.DAGScheduler: Failed to run runJob at
ReceiverTracker.scala:275
Exception in thread "Thread-42" org.apache.spark.SparkException: Job
cancelled because SparkContext was shut down
        at
org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:694)
        at
org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:693)
        at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
        at
org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:693)
        at
org.apache.spark.scheduler.DAGSchedulerEventProcessActor.postStop(DAGScheduler.scala:1399)
        at
akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:201)
        at
akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:163)
        at akka.actor.ActorCell.terminate(ActorCell.scala:338)
        at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:431)
        at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447)
        at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262)
        at akka.dispatch.Mailbox.run(Mailbox.scala:218)
        at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/11/06 19:35:00 INFO cluster.YarnClientSchedulerBackend: Stopped
14/11/06 19:35:01 INFO network.ConnectionManager: Removing
ReceivingConnection to
ConnectionManagerId(ip-172-31-24-43.us-west-2.compute.internal,51054)
14/11/06 19:35:01 INFO network.ConnectionManager: Key not valid ?
sun.nio.ch.SelectionKeyImpl@3fcc8fc7
14/11/06 19:35:01 INFO network.ConnectionManager: Removing SendingConnection
to ConnectionManagerId(ip-172-31-24-43.us-west-2.compute.internal,51054)
14/11/06 19:35:01 INFO network.ConnectionManager: Removing SendingConnection
to ConnectionManagerId(ip-172-31-24-43.us-west-2.compute.internal,51054)
14/11/06 19:35:01 INFO network.ConnectionManager: key already cancelled ?
sun.nio.ch.SelectionKeyImpl@3fcc8fc7
java.nio.channels.CancelledKeyException
        at
org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
        at
org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
14/11/06 19:35:01 INFO spark.MapOutputTrackerMasterActor:
MapOutputTrackerActor stopped!
14/11/06 19:35:01 INFO network.ConnectionManager: Selector thread was
interrupted!
14/11/06 19:35:01 INFO network.ConnectionManager: Removing
ReceivingConnection to
ConnectionManagerId(ip-172-31-24-43.us-west-2.compute.internal,51054)
14/11/06 19:35:01 ERROR network.ConnectionManager: Corresponding
SendingConnection to
ConnectionManagerId(ip-172-31-24-43.us-west-2.compute.internal,51054) not
found
14/11/06 19:35:01 INFO network.ConnectionManager: Removing
ReceivingConnection to
ConnectionManagerId(ip-172-31-24-44.us-west-2.compute.internal,34697)
14/11/06 19:35:01 INFO network.ConnectionManager: Removing SendingConnection
to ConnectionManagerId(ip-172-31-24-44.us-west-2.compute.internal,34697)
14/11/06 19:35:01 INFO network.ConnectionManager: Removing SendingConnection
to ConnectionManagerId(ip-172-31-24-44.us-west-2.compute.internal,34697)
14/11/06 19:35:01 WARN network.ConnectionManager: All connections not
cleaned up
14/11/06 19:35:02 INFO network.ConnectionManager: ConnectionManager stopped
14/11/06 19:35:02 INFO storage.MemoryStore: MemoryStore cleared
14/11/06 19:35:02 INFO storage.BlockManager: BlockManager stopped
14/11/06 19:35:02 INFO storage.BlockManagerMaster: BlockManagerMaster
stopped
14/11/06 19:35:02 INFO remote.RemoteActorRefProvider$RemotingTerminator:
Shutting down remote daemon.
14/11/06 19:35:02 INFO remote.RemoteActorRefProvider$RemotingTerminator:
Remote daemon shut down; proceeding with flushing remote transports.
14/11/06 19:35:02 INFO Remoting: Remoting shut down
14/11/06 19:35:02 INFO remote.RemoteActorRefProvider$RemotingTerminator:
Remoting shut down.
14/11/06 19:35:02 INFO spark.SparkContext: Successfully stopped SparkContext
[hadoop@ip-172-31-19-47 ~]$



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kinesis-integration-with-Spark-Streaming-in-EMR-cluster-Output-is-not-showing-up-tp18293.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Mime
View raw message