spark-user mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: kafkaDirectStream usage error
Date Fri, 05 Feb 2016 16:44:50 GMT
Two things:

- You're only attempting to read from a single TopicAndPartition. Since
your topic has multiple partitions, this probably isn't what you want.

- You're getting an offset out of range exception because the offset you're
asking for doesn't exist in Kafka.

Use the other createDirectStream method (the one that takes a set of
topics, not a map of TopicAndPartition to offset), at least until you get an
understanding of how things work.
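
For illustration, here is a minimal sketch of the topic-set variant, assuming the Spark 1.x spark-streaming-kafka (Kafka 0.8) artifact is on the classpath. The topic name "request5" comes from the question below; the object name, broker addresses, batch interval, and the choice of "smallest" for auto.offset.reset are assumptions made for the example, not details from the original job.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectStreamSketch {
  // Topic name taken from the question; every partition of it is consumed.
  val topics: Set[String] = Set("request5")

  // Builds the Kafka parameters for the direct stream.
  // The broker list is supplied by the caller (placeholder values below).
  def kafkaParams(brokers: String): Map[String, String] = Map(
    "metadata.broker.list" -> brokers,
    // Assumption: start from the earliest offset each partition still
    // retains. Omit this entry to start from the latest offset instead.
    "auto.offset.reset" -> "smallest"
  )

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("DirectStreamSketch")
    // Assumption: a 5 second batch interval, chosen only for the example.
    val ssc = new StreamingContext(conf, Seconds(5))

    // No explicit offsets are passed: the stream looks up the starting
    // offset of every partition itself, so it cannot ask Kafka for an
    // offset that no longer exists at startup.
    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams("broker1:9092,broker2:9092"), topics)

    // Each record is a (key, message) pair; print a few messages per batch.
    stream.map { case (_, message) => message }.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because this overload takes only topic names, it covers all 8 partitions without building a per-partition offset map by hand.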



On Thu, Feb 4, 2016 at 7:58 PM, Diwakar Dhanuskodi <
diwakar.dhanuskodi@gmail.com> wrote:

> I am using the direct stream below to consume messages from Kafka. The
> topic has 8 partitions.
>
>     val topicAndPart =
>       OffsetRange.create("request5", 0, 1, 100000).topicAndPartition()
>     val fromOffsets =
>       Map[kafka.common.TopicAndPartition, Long](topicAndPart -> 0)
>     val messageHandler = (mmd: MessageAndMetadata[String, String]) =>
>       (mmd.key(), mmd.message())
>     val k1 = KafkaUtils.createDirectStream[String, String, StringDecoder,
>       StringDecoder, (String, String)](ssc, kafkaParams, fromOffsets,
>       messageHandler)
>
> I am getting the error below. Any idea what I am doing wrong? Please
> help.
>
> 16/02/04 21:04:38 WARN scheduler.TaskSetManager: Lost task 0.1 in stage 0.0
> (TID 2, datanode3.isdp.com): UnknownReason
> 16/02/04 21:04:38 INFO scheduler.TaskSetManager: Starting task 0.2 in
> stage 0.0 (TID 3, datanode3.isdp.com, RACK_LOCAL, 2273 bytes)
> 16/02/04 21:04:38 WARN spark.ThrowableSerializationWrapper: Task exception
> could not be deserialized
> java.lang.ClassNotFoundException: kafka.common.OffsetOutOfRangeException
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:278)
>     at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>     at
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:167)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>     at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 16/02/04 21:04:38 ERROR scheduler.TaskResultGetter: Could not deserialize
> TaskEndReason: ClassNotFound with classloader
> org.apache.spark.util.MutableURLClassLoader@7202dc8c
> 16/02/04 21:04:38 WARN scheduler.TaskSetManager: Lost task 0.2 in stage
> 0.0 (TID 3, datanode3.isdp.com): UnknownReason
> 16/02/04 21:04:38 INFO scheduler.TaskSetManager: Starting task 0.3 in
> stage 0.0 (TID 4, datanode3.isdp.com, RACK_LOCAL, 2273 bytes)
> 16/02/04 21:04:38 WARN spark.ThrowableSerializationWrapper: Task exception
> could not be deserialized
> java.lang.ClassNotFoundException: kafka.common.OffsetOutOfRangeException
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:278)
>     at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>     at
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:167)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>     at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 16/02/04 21:04:38 ERROR scheduler.TaskResultGetter: Could not deserialize
> TaskEndReason: ClassNotFound with classloader
> org.apache.spark.util.MutableURLClassLoader@7202dc8c
> 16/02/04 21:04:38 WARN scheduler.TaskSetManager: Lost task 0.3 in stage
> 0.0 (TID 4, datanode3.isdp.com): UnknownReason
> 16/02/04 21:04:38 ERROR scheduler.TaskSetManager: Task 0 in stage 0.0
> failed 4 times; aborting job
> 16/02/04 21:04:38 INFO cluster.YarnScheduler: Removed TaskSet 0.0, whose
> tasks have all completed, from pool
> 16/02/04 21:04:38 INFO cluster.YarnScheduler: Cancelling stage 0
> 16/02/04 21:04:38 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (count at
> SparkReceiver.scala:71) failed in 6.213 s
> 16/02/04 21:04:38 INFO scheduler.DAGScheduler: Job 0 failed: print at
> SparkReceiver.scala:71, took 6.725788 s
> 16/02/04 21:04:38 INFO scheduler.JobScheduler: Finished job streaming job
> 1454600072000 ms.0 from job set of time 1454600072000 ms
> 16/02/04 21:04:38 INFO scheduler.JobScheduler: Starting job streaming job
> 1454600072000 ms.1 from job set of time 1454600072000 ms
> 16/02/04 21:04:38 ERROR scheduler.JobScheduler: Error running job
> streaming job 1454600072000 ms.0
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
> in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage
> 0.0 (TID 4, datanode3.isdp.com): UnknownReason
> Driver stacktrace:
>     at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
>     at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
>     at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
>     at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
>     at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>     at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>     at scala.Option.foreach(Option.scala:236)
>     at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>     at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
>     at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
>     at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>     at
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1835)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1848)
>     at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1298)
>     at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>     at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
>     at org.apache.spark.rdd.RDD.take(RDD.scala:1272)
>     at
> org.apache.spark.streaming.dstream.DStream$$anonfun$print$2$$anonfun$foreachFunc$5$1.apply(DStream.scala:722)
>     at
> org.apache.spark.streaming.dstream.DStream$$anonfun$print$2$$anonfun$foreachFunc$5$1.apply(DStream.scala:721)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:42)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:399)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>     at scala.util.Try$.apply(Try.scala:161)
>     at org.apache.spark.streaming.scheduler.Job.run(Job.scala:34)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:218)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:218)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:218)
>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:217)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Exception in thread "main" org.apache.spark.SparkException: Job aborted
> due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent
> failure: Lost task 0.3 in stage 0.0 (TID 4, datanode3.isdp.com):
> UnknownReason
> Driver stacktrace:
>     at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
>     at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
>     at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
>     at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
>     at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>     at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>     at scala.Option.foreach(Option.scala:236)
>     at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>     at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
>     at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
>     at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>     at
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1835)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1848)
>     at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1298)
>     at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>     at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
>     at org.apache.spark.rdd.RDD.take(RDD.scala:1272)
>     at
> org.apache.spark.streaming.dstream.DStream$$anonfun$print$2$$anonfun$foreachFunc$5$1.apply(DStream.scala:722)
>     at
> org.apache.spark.streaming.dstream.DStream$$anonfun$print$2$$anonfun$foreachFunc$5$1.apply(DStream.scala:721)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:42)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:399)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>     at scala.util.Try$.apply(Try.scala:161)
>     at org.apache.spark.streaming.scheduler.Job.run(Job.scala:34)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:218)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:218)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:218)
>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:217)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 16/02/04 21:04:38 INFO streaming.StreamingContext: Invoking
> stop(stopGracefully=false) from shutdown hook
> 16/02/04 21:04:38 INFO scheduler.JobGenerator: Stopping JobGenerator
> immediately
> 16/02/04 21:04:39 INFO util.RecurringTimer: Stopped timer for JobGenerator
> after time 1454600078000
> 16/02/04 21:04:39 INFO scheduler.JobGenerator: Stopped JobGenerator
> 16/02/04 21:04:39 INFO spark.SparkContext: Starting job: foreachRDD at
> SparkReceiver.scala:74
> 16/02/04 21:04:39 INFO scheduler.DAGScheduler: Registering RDD 30
> (foreachRDD at SparkReceiver.scala:74)
> 16/02/04 21:04:39 INFO scheduler.DAGScheduler: Got job 1 (foreachRDD at
> SparkReceiver.scala:74) with 1 output partitions
> 16/02/04 21:04:39 INFO scheduler.DAGScheduler: Final stage: ResultStage
> 3(foreachRDD at SparkReceiver.scala:74)
> 16/02/04 21:04:39 INFO scheduler.DAGScheduler: Parents of final stage:
> List(ShuffleMapStage 2)
> 16/02/04 21:04:39 INFO scheduler.DAGScheduler: Missing parents:
> List(ShuffleMapStage 2)
> 16/02/04 21:04:39 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage
> 2 (MapPartitionsRDD[30] at foreachRDD at SparkReceiver.scala:74), which has
> no missing parents
> 16/02/04 21:04:39 INFO storage.MemoryStore: ensureFreeSpace(26608) called
> with curMem=6900, maxMem=1111794647
> 16/02/04 21:04:39 INFO storage.MemoryStore: Block broadcast_1 stored as
> values in memory (estimated size 26.0 KB, free 1060.3 MB)
> 16/02/04 21:04:39 INFO storage.MemoryStore: ensureFreeSpace(9951) called
> with curMem=33508, maxMem=1111794647
> 16/02/04 21:04:39 INFO storage.MemoryStore: Block broadcast_1_piece0
> stored as bytes in memory (estimated size 9.7 KB, free 1060.2 MB)
> 16/02/04 21:04:39 INFO storage.BlockManagerInfo: Added broadcast_1_piece0
> in memory on 10.132.117.208:39732 (size: 9.7 KB, free: 1060.3 MB)
> 16/02/04 21:04:39 INFO spark.SparkContext: Created broadcast 1 from
> broadcast at DAGScheduler.scala:861
> 16/02/04 21:04:39 INFO scheduler.DAGScheduler: Submitting 1 missing tasks
> from ShuffleMapStage 2 (MapPartitionsRDD[30] at foreachRDD at
> SparkReceiver.scala:74)
> 16/02/04 21:04:39 INFO cluster.YarnScheduler: Adding task set 2.0 with 1
> tasks
> 16/02/04 21:04:39 INFO scheduler.TaskSetManager: Starting task 0.0 in
> stage 2.0 (TID 5, datanode3.isdp.com, RACK_LOCAL, 2164 bytes)
> 16/02/04 21:04:39 INFO storage.BlockManagerInfo: Added broadcast_1_piece0
> in memory on datanode3.isdp.com:42736 (size: 9.7 KB, free: 530.3 MB)
> 16/02/04 21:04:40 WARN spark.ThrowableSerializationWrapper: Task exception
> could not be deserialized
> java.lang.ClassNotFoundException: kafka.common.OffsetOutOfRangeException
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:278)
>     at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>     at
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:167)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>     at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 16/02/04 21:04:40 ERROR scheduler.TaskResultGetter: Could not deserialize
> TaskEndReason: ClassNotFound with classloader
> org.apache.spark.util.MutableURLClassLoader@7202dc8c
> 16/02/04 21:04:40 WARN scheduler.TaskSetManager: Lost task 0.0 in stage
> 2.0 (TID 5, datanode3.isdp.com): UnknownReason
> 16/02/04 21:04:40 INFO scheduler.TaskSetManager: Starting task 0.1 in
> stage 2.0 (TID 6, datanode3.isdp.com, RACK_LOCAL, 2164 bytes)
> 16/02/04 21:04:40 WARN spark.ThrowableSerializationWrapper: Task exception
> could not be deserialized
> java.lang.ClassNotFoundException: kafka.common.OffsetOutOfRangeException
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:278)
>     at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>     at
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:167)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>     at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 16/02/04 21:04:40 ERROR scheduler.TaskResultGetter: Could not deserialize
> TaskEndReason: ClassNotFound with classloader
> org.apache.spark.util.MutableURLClassLoader@7202dc8c
> 16/02/04 21:04:40 WARN scheduler.TaskSetManager: Lost task 0.1 in stage
> 2.0 (TID 6, datanode3.isdp.com): UnknownReason
> 16/02/04 21:04:40 INFO scheduler.TaskSetManager: Starting task 0.2 in
> stage 2.0 (TID 7, datanode3.isdp.com, RACK_LOCAL, 2164 bytes)
> 16/02/04 21:04:40 WARN spark.ThrowableSerializationWrapper: Task exception
> could not be deserialized
> java.lang.ClassNotFoundException: kafka.common.OffsetOutOfRangeException
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:278)
>     at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>     at
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:167)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>     at
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>     at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>     at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>     at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>     at
> org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 16/02/04 21:04:40 ERROR scheduler.TaskResultGetter: Could not deserialize
> TaskEndReason: ClassNotFound with classloader
> org.apache.spark.util.MutableURLClassLoader@7202dc8c
> 16/02/04 21:04:40 WARN scheduler.TaskSetManager: Lost task 0.2 in stage
> 2.0 (TID 7, datanode3.isdp.com): UnknownReason
> 16/02/04 21:04:40 INFO scheduler.TaskSetManager: Starting task 0.3 in
> stage 2.0 (TID 8, datanode3.isdp.com, RACK_LOCAL, 2164 bytes)
> 16/02/04 21:04:40 INFO storage.BlockManagerInfo: Added broadcast_1_piece0
> in memory on datanode3.isdp.com:42221 (size: 9.7 KB, free: 530.3 MB)
> 16/02/04 21:04:41 INFO scheduler.JobScheduler: Stopped JobScheduler
> Exception in thread "streaming-job-executor-0" java.lang.Error:
> java.lang.InterruptedException
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1151)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.InterruptedException
>     at java.lang.Object.wait(Native Method)
>     at java.lang.Object.wait(Object.java:503)
>     at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
>     at
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:559)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1835)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1848)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1919)
>     at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:905)
>     at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>     at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
>     at org.apache.spark.rdd.RDD.collect(RDD.scala:904)
>     at
> org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:177)
>     at
> org.apache.spark.sql.DataFrame$$anonfun$collect$1.apply(DataFrame.scala:1385)
>     at
> org.apache.spark.sql.DataFrame$$anonfun$collect$1.apply(DataFrame.scala:1385)
>     at
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
>     at
> org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:1903)
>     at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1384)
>     at org.apache.spark.sql.DataFrame.count(DataFrame.scala:1402)
>     at
> com.tcs.dime.spark.SparkReceiver$$anonfun$main$1.apply(SparkReceiver.scala:148)
>     at
> com.tcs.dime.spark.SparkReceiver$$anonfun$main$1.apply(SparkReceiver.scala:74)
>     at
> org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:631)
>     at
> org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:631)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:42)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:399)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>     at
> org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>     at scala.util.Try$.apply(Try.scala:161)
>     at org.apache.spark.streaming.scheduler.Job.run(Job.scala:34)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:218)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:218)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:218)
>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>     at
> org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:217)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     ... 2 more
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/streaming,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/streaming/batch,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/static/streaming,null}
> 16/02/04 21:04:41 INFO streaming.StreamingContext: StreamingContext
> stopped successfully
> 16/02/04 21:04:41 INFO spark.SparkContext: Invoking stop() from shutdown
> hook
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/streaming/batch/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/streaming/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/static/sql,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/SQL/execution/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/SQL/execution,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/SQL/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/SQL,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/metrics/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/api,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/static,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/executors/threadDump,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/executors/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/executors,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/environment/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/environment,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/storage/rdd,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/storage/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/storage,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/stages/pool/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/stages/pool,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/stages/stage/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/stages/stage,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/stages/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/stages,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/jobs/job/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/jobs/job,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/jobs/json,null}
> 16/02/04 21:04:41 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/jobs,null}
> 16/02/04 21:04:41 INFO storage.BlockManagerInfo: Removed
> broadcast_0_piece0 on 10.132.117.208:39732 in memory (size: 2.4 KB, free:
> 1060.3 MB)
> 16/02/04 21:04:41 INFO storage.BlockManagerInfo: Removed
> broadcast_0_piece0 on datanode3.isdp.com:42221 in memory (size: 2.4 KB,
> free: 530.3 MB)
> 16/02/04 21:04:41 INFO storage.BlockManagerInfo: Removed
> broadcast_0_piece0 on datanode3.isdp.com:42736 in memory (size: 2.4 KB,
> free: 530.3 MB)
> 16/02/04 21:04:41 INFO spark.ContextCleaner: Cleaned accumulator 1
> 16/02/04 21:04:41 INFO ui.SparkUI: Stopped Spark web UI at
> http://10.132.117.208:4040
> 16/02/04 21:04:41 INFO scheduler.DAGScheduler: Stopping DAGScheduler
> 16/02/04 21:04:41 INFO scheduler.DAGScheduler: ShuffleMapStage 2
> (foreachRDD at SparkReceiver.scala:74) failed in 1.512 s
> 16/02/04 21:04:41 INFO cluster.YarnClientSchedulerBackend: Shutting down
> all executors
> 16/02/04 21:04:41 INFO cluster.YarnClientSchedulerBackend: Interrupting
> monitor thread
> 16/02/04 21:04:41 INFO cluster.YarnClientSchedulerBackend: Asking each
> executor to shut down
> 16/02/04 21:04:41 INFO cluster.YarnClientSchedulerBackend: Stopped
> 16/02/04 21:04:41 WARN spark.ExecutorAllocationManager: No stages are
> running, but numRunningTasks != 0
> 16/02/04 21:04:41 INFO spark.MapOutputTrackerMasterEndpoint:
> MapOutputTrackerMasterEndpoint stopped!
> 16/02/04 21:04:41 INFO storage.MemoryStore: MemoryStore cleared
> 16/02/04 21:04:41 INFO storage.BlockManager: BlockManager stopped
> 16/02/04 21:04:41 INFO storage.BlockManagerMaster: BlockManagerMaster
> stopped
> 16/02/04 21:04:41 INFO spark.SparkContext: Successfully stopped
> SparkContext
> 16/02/04 21:04:41 INFO
> scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
> OutputCommitCoordinator stopped!
> 16/02/04 21:04:41 INFO util.ShutdownHookManager: Shutdown hook called
> 16/02/04 21:04:41 INFO util.ShutdownHookManager: Deleting directory
> /tmp/spark-2014834b-fb03-4418-99e0-f35171acc67e
> 16/02/04 21:04:41 INFO util.ShutdownHookManager: Deleting directory
> /tmp/spark-9afdd351-d7a6-44cc-a47f-cce3170293e9
> 16/02/04 21:04:41 INFO remote.RemoteActorRefProvider$RemotingTerminator:
> Shutting down remote daemon.
> 16/02/04 21:04:41 INFO remote.RemoteActorRefProvider$RemotingTerminator:
> Remote daemon shut down; proceeding with flushing remote transports.
> [root@Datanode4 bin]#
>
>
> //    val k  = KafkaUtils.createStream(ssc, "datanode4.isdp.com:2181","DIME",topicMap, StorageLevel.MEMORY_ONLY)
> //Changes for DirectStream - START
>     val brokers = "datanode4.isdp.com:9092"+","+"datanode5.isdp.com:9093"
>     val topicSet = topics.split(",").toSet
>     val kafkaParams = Map[String,String]("bootstrap.servers" -> "datanode4.isdp.com:9092")
>     //when executing DirectStream , comment out createStream statement
>     val k = KafkaUtils.createDirectStream[String,String,StringDecoder,StringDecoder](ssc, kafkaParams, topicSet)
>     val topicAndPart = OffsetRange.create("request5",0, 1,100000).topicAndPartition()
>     val fromOffsets = Map[kafka.common.TopicAndPartition,Long](topicAndPart->0)
>     val messageHandler = (mmd : MessageAndMetadata[String,String]) => (mmd.key(),mmd.message())
>     val k1 = KafkaUtils.createDirectStream[String,String,StringDecoder,StringDecoder,(String,String)](ssc, kafkaParams, fromOffsets,messageHandler)
>
> ./spark-submit --master yarn-client  --conf
> "spark.dynamicAllocation.enabled=true" --conf
> "spark.shuffle.service.enabled=true"  --conf
> "spark.sql.tungsten.enabled=false" --conf "spark.sql.codegen=false" --conf
> "spark.sql.unsafe.enabled=false" --conf
> "spark.streaming.backpressure.enabled=true" --conf "spark.locality.wait=1s"
> --conf "spark.shuffle.consolidateFiles=true"  --conf
> "spark.streaming.kafka.maxRatePerPartition=1000000" --driver-memory 2g
> --executor-memory 1g --class com.tcs.dime.spark.SparkReceiver   --files
> /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml,/etc/hadoop/conf/mapred-site.xml,/etc/hadoop/conf/yarn-site.xml,/etc/hive/conf/hive-site.xml
> --jars
> /root/Downloads/dime/jars/spark-streaming-kafka-assembly_2.10-1.5.1.jar,/root/Desktop/Jars/sparkreceiver.jar
> /root/Desktop/Jars/sparkreceiver.jar > log.txt
>
> ./spark-submit --master yarn --deploy-mode client  --num-executors 10
> --executor-cores 1  --conf
> "spark.streaming.kafka.maxRatePerPartition=10000000" --conf
> "spark.streaming.blockInterval=50ms" --conf
> "spark.ui.showConsoleProgress=false" --conf
> "spark.sql.tungsten.enabled=false" --conf "spark.sql.codegen=false" --conf
> "spark.sql.unsafe.enabled=false" --conf
> "spark.streaming.backpressure.enabled=true" --conf "spark.locality.wait=1s"
> --conf "spark.shuffle.consolidateFiles=true"   --driver-memory 2g
> --executor-memory 1g --class com.tcs.dime.spark.SparkReceiver   --files
> /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml,/etc/hadoop/conf/mapred-site.xml,/etc/hadoop/conf/yarn-site.xml,/etc/hive/conf/hive-site.xml
> --jars
> hdfs://Namenode:8020/user/root/kafka-assembly.jar,hdfs://Namenode:8020/user/root/sparkreceiver.jar
> /root/Desktop/Jars/sparkreceiver.jar  &> log.txt
>
>
> Regards
> Diwakar Dhanuskodi
> Technical Architect Hadoop ,
> Analytics Bigdata  Information Management ,
> Digital   Enterprise Unit ,
> Tata Consultancy Services Limited
> Unit-VIII, Think Campus,
> KIADB Industrial Estate,
> Electronics City, Phase-II,
> Bangalore - 560100,Karnataka
> India
> Cell:- +91-8884066464
> Mailto: d.diwakar@tcs.com
> Website: http://www.tcs.com
> ____________________________________________
> Experience certainty. IT Services
> Business Solutions
> Consulting
> ____________________________________________
> 02/04/2016 09:59PM
>
> Sent from Samsung Mobile.
>
