spark-user mailing list archives

From Ascot Moss <ascot.m...@gmail.com>
Subject Re: DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s Killed
Date Tue, 26 Jul 2016 11:17:38 GMT
It is a YARN cluster:

/bin/spark-submit \
--conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:+PrintGCTimeStamps -XX:+PrintGCDetails" \
--driver-memory 64G \
--executor-memory 16g \


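A bare "Killed" printed by the shell, with no Spark WARN or ERROR before it, means the process received SIGKILL. One common sender is the Linux kernel's OOM killer, which records its actions in the kernel log. Whether that is what happened here is an assumption; the thread does not confirm it. A minimal sketch for checking, where the helper name `oom_kills` is hypothetical:

```shell
# Hypothetical helper: filter kernel log text for OOM-killer entries.
# Reads from stdin, so it works with `dmesg` output or a saved log file.
oom_kills() {
  grep -i -E 'out of memory|killed process'
}

# Typical use on the host where spark-submit was started (may need root):
#   dmesg | oom_kills
```

If this prints an entry naming the java process at about the time the job stopped, the driver was most likely killed for memory reasons rather than by a timeout.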
On Tue, Jul 26, 2016 at 7:00 PM, Jacek Laskowski <jacek@japila.pl> wrote:

> Hi,
>
> What's the cluster manager? Is this YARN perhaps? Do you have any
> other apps on the cluster? How do you submit your app? What are the
> properties?
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Tue, Jul 26, 2016 at 1:27 AM, Ascot Moss <ascot.moss@gmail.com> wrote:
> > Hi,
> >
> > spark: 1.6.1
> > java: java 1.8_u40
> > I tried the random forest training phase. The same code works well with 20
> > trees (lower accuracy, about 68%). When I ran the training phase with more
> > trees (I set it to 200), it returned:
> >
> > "DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651,
> > took 19.556700 s Killed". There is no WARN or ERROR on the console; the
> > task just stops at the end.
> >
> > Any idea how to resolve it? Should the timeout parameter be set to a
> > longer value?
> >
> > regards
> >
> >
> > (below is the log from console)
> >
> > 16/07/26 00:02:47 INFO DAGScheduler: looking for newly runnable stages
> > 16/07/26 00:02:47 INFO DAGScheduler: running: Set()
> > 16/07/26 00:02:47 INFO DAGScheduler: waiting: Set(ResultStage 32)
> > 16/07/26 00:02:47 INFO DAGScheduler: failed: Set()
> > 16/07/26 00:02:47 INFO DAGScheduler: Submitting ResultStage 32 (MapPartitionsRDD[75] at map at DecisionTree.scala:642), which has no missing parents
> > 16/07/26 00:02:47 INFO MemoryStore: Block broadcast_48 stored as values in memory (estimated size 2.2 MB, free 18.2 MB)
> > 16/07/26 00:02:47 INFO MemoryStore: Block broadcast_48_piece0 stored as bytes in memory (estimated size 436.9 KB, free 18.7 MB)
> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:35450 (size: 436.9 KB, free: 45.8 GB)
> > 16/07/26 00:02:47 INFO SparkContext: Created broadcast 48 from broadcast at DAGScheduler.scala:1006
> > 16/07/26 00:02:47 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 32 (MapPartitionsRDD[75] at map at DecisionTree.scala:642)
> > 16/07/26 00:02:47 INFO TaskSchedulerImpl: Adding task set 32.0 with 4 tasks
> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 0.0 in stage 32.0 (TID 185, x.x.x.x, partition 0,NODE_LOCAL, 1956 bytes)
> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 1.0 in stage 32.0 (TID 186, x.x.x.x, partition 1,NODE_LOCAL, 1956 bytes)
> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 2.0 in stage 32.0 (TID 187, x.x.x.x, partition 2,NODE_LOCAL, 1956 bytes)
> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 3.0 in stage 32.0 (TID 188, x.x.x.x, partition 3,NODE_LOCAL, 1956 bytes)
> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:58784 (size: 436.9 KB, free: 5.1 GB)
> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:44434
> > 16/07/26 00:02:47 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 12 is 180 bytes
> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:46186 (size: 436.9 KB, free: 2.2 GB)
> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:50132 (size: 436.9 KB, free: 5.0 GB)
> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:47272
> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:46802
> > 16/07/26 00:02:49 INFO TaskSetManager: Finished task 2.0 in stage 32.0 (TID 187) in 2265 ms on x.x.x.x (1/4)
> > 16/07/26 00:02:49 INFO TaskSetManager: Finished task 1.0 in stage 32.0 (TID 186) in 2266 ms on x.x.x.x (2/4)
> > 16/07/26 00:02:50 INFO TaskSetManager: Finished task 0.0 in stage 32.0 (TID 185) in 2794 ms on x.x.x.x (3/4)
> > 16/07/26 00:02:50 INFO TaskSetManager: Finished task 3.0 in stage 32.0 (TID 188) in 3738 ms on x.x.x.x (4/4)
> > 16/07/26 00:02:50 INFO TaskSchedulerImpl: Removed TaskSet 32.0, whose tasks have all completed, from pool
> > 16/07/26 00:02:50 INFO DAGScheduler: ResultStage 32 (collectAsMap at DecisionTree.scala:651) finished in 3.738 s
> > 16/07/26 00:02:50 INFO DAGScheduler: Job 19 finished: collectAsMap at DecisionTree.scala:651, took 19.493917 s
> >
> > 16/07/26 00:02:51 INFO MemoryStore: Block broadcast_49 stored as values in memory (estimated size 1053.9 KB, free 19.7 MB)
> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_49_piece0 stored as bytes in memory (estimated size 626.7 KB, free 20.3 MB)
> > 16/07/26 00:02:52 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:35450 (size: 626.7 KB, free: 45.8 GB)
> > 16/07/26 00:02:52 INFO SparkContext: Created broadcast 49 from broadcast at DecisionTree.scala:601
> > 16/07/26 00:02:52 INFO SparkContext: Starting job: collectAsMap at DecisionTree.scala:651
> > 16/07/26 00:02:52 INFO DAGScheduler: Registering RDD 76 (mapPartitions at DecisionTree.scala:622)
> > 16/07/26 00:02:52 INFO DAGScheduler: Got job 20 (collectAsMap at DecisionTree.scala:651) with 4 output partitions
> > 16/07/26 00:02:52 INFO DAGScheduler: Final stage: ResultStage 34 (collectAsMap at DecisionTree.scala:651)
> > 16/07/26 00:02:52 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 33)
> > 16/07/26 00:02:52 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 33)
> > 16/07/26 00:02:52 INFO DAGScheduler: Submitting ShuffleMapStage 33 (MapPartitionsRDD[76] at mapPartitions at DecisionTree.scala:622), which has no missing parents
> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_50 stored as values in memory (estimated size 10.0 MB, free 30.3 MB)
> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_50_piece0 stored as bytes in memory (estimated size 2.9 MB, free 33.2 MB)
> > 16/07/26 00:02:52 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:35450 (size: 2.9 MB, free: 45.8 GB)
> > 16/07/26 00:02:52 INFO SparkContext: Created broadcast 50 from broadcast at DAGScheduler.scala:1006
> > 16/07/26 00:02:52 INFO DAGScheduler: Submitting 4 missing tasks from ShuffleMapStage 33 (MapPartitionsRDD[76] at mapPartitions at DecisionTree.scala:622)
> > 16/07/26 00:02:52 INFO TaskSchedulerImpl: Adding task set 33.0 with 4 tasks
> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 1.0 in stage 33.0 (TID 189, x.x.x.x, partition 1,PROCESS_LOCAL, 2333 bytes)
> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 0.0 in stage 33.0 (TID 190, x.x.x.x, partition 0,PROCESS_LOCAL, 2333 bytes)
> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 2.0 in stage 33.0 (TID 191, x.x.x.x, partition 2,PROCESS_LOCAL, 2333 bytes)
> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 3.0 in stage 33.0 (TID 192, x.x.x.x, partition 3,PROCESS_LOCAL, 2333 bytes)
> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:58784 (size: 2.9 MB, free: 5.0 GB)
> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:58784 (size: 626.7 KB, free: 5.0 GB)
> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:46186 (size: 2.9 MB, free: 2.2 GB)
> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:50132 (size: 2.9 MB, free: 5.0 GB)
> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:46186 (size: 626.7 KB, free: 2.2 GB)
> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:50132 (size: 626.7 KB, free: 5.0 GB)
> > 16/07/26 00:02:57 INFO TaskSetManager: Finished task 0.0 in stage 33.0 (TID 190) in 4212 ms on x.x.x.x (1/4)
> > 16/07/26 00:02:57 INFO TaskSetManager: Finished task 1.0 in stage 33.0 (TID 189) in 4989 ms on x.x.x.x (2/4)
> > 16/07/26 00:03:07 INFO TaskSetManager: Finished task 2.0 in stage 33.0 (TID 191) in 14934 ms on x.x.x.x (3/4)
> > 16/07/26 00:03:07 INFO TaskSetManager: Finished task 3.0 in stage 33.0 (TID 192) in 15172 ms on x.x.x.x (4/4)
> > 16/07/26 00:03:07 INFO TaskSchedulerImpl: Removed TaskSet 33.0, whose tasks have all completed, from pool
> > 16/07/26 00:03:07 INFO DAGScheduler: ShuffleMapStage 33 (mapPartitions at DecisionTree.scala:622) finished in 15.173 s
> >
> > 16/07/26 00:03:07 INFO DAGScheduler: looking for newly runnable stages
> > 16/07/26 00:03:07 INFO DAGScheduler: running: Set()
> > 16/07/26 00:03:07 INFO DAGScheduler: waiting: Set(ResultStage 34)
> > 16/07/26 00:03:07 INFO DAGScheduler: failed: Set()
> > 16/07/26 00:03:07 INFO DAGScheduler: Submitting ResultStage 34 (MapPartitionsRDD[78] at map at DecisionTree.scala:642), which has no missing parents
> > 16/07/26 00:03:08 INFO MemoryStore: Block broadcast_51 stored as values in memory (estimated size 2.2 MB, free 35.4 MB)
> > 16/07/26 00:03:08 INFO MemoryStore: Block broadcast_51_piece0 stored as bytes in memory (estimated size 444.7 KB, free 35.8 MB)
> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:35450 (size: 444.7 KB, free: 45.8 GB)
> > 16/07/26 00:03:08 INFO SparkContext: Created broadcast 51 from broadcast at DAGScheduler.scala:1006
> > 16/07/26 00:03:08 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 34 (MapPartitionsRDD[78] at map at DecisionTree.scala:642)
> > 16/07/26 00:03:08 INFO TaskSchedulerImpl: Adding task set 34.0 with 4 tasks
> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 0.0 in stage 34.0 (TID 193, x.x.x.x, partition 0,NODE_LOCAL, 1956 bytes)
> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 1.0 in stage 34.0 (TID 194, x.x.x.x, partition 1,NODE_LOCAL, 1956 bytes)
> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 2.0 in stage 34.0 (TID 195, x.x.x.x, partition 2,NODE_LOCAL, 1956 bytes)
> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 3.0 in stage 34.0 (TID 196, x.x.x.x, partition 3,NODE_LOCAL, 1956 bytes)
> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:58784 (size: 444.7 KB, free: 5.0 GB)
> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:44434
> > 16/07/26 00:03:08 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 13 is 180 bytes
> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:46186 (size: 444.7 KB, free: 2.2 GB)
> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:47272
> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:50132 (size: 444.7 KB, free: 5.0 GB)
> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:46802
> > 16/07/26 00:03:10 INFO TaskSetManager: Finished task 1.0 in stage 34.0 (TID 194) in 2240 ms on x.x.x.x (1/4)
> > 16/07/26 00:03:10 INFO TaskSetManager: Finished task 0.0 in stage 34.0 (TID 193) in 2749 ms on x.x.x.x (2/4)
> > 16/07/26 00:03:11 INFO TaskSetManager: Finished task 2.0 in stage 34.0 (TID 195) in 3818 ms on x.x.x.x (3/4)
> > 16/07/26 00:03:11 INFO TaskSetManager: Finished task 3.0 in stage 34.0 (TID 196) in 3901 ms on x.x.x.x (4/4)
> > 16/07/26 00:03:11 INFO DAGScheduler: ResultStage 34 (collectAsMap at DecisionTree.scala:651) finished in 3.902 s
> > 16/07/26 00:03:11 INFO TaskSchedulerImpl: Removed TaskSet 34.0, whose tasks have all completed, from pool
> > 16/07/26 00:03:11 INFO DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s
> >
> > Killed
> >
> >
> >
> >
>

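The quoted console log reports a duration for every finished task, and in stage 33 two of the four tasks take about 15 s while the others take under 5 s. As a rough sketch for spotting such stragglers in a saved driver log, the following lists task durations, slowest first. The function name `slowest_tasks` and the idea of a saved log file are assumptions for illustration; the helper expects one log entry per line, as Spark writes them, not the wrapped form seen in quoted email.

```shell
# Hypothetical helper: list "Finished task" durations from a Spark driver log,
# slowest first. Argument: path to a log file with one entry per line.
slowest_tasks() {
  grep 'Finished task' "$1" \
    | sed -E 's/.*Finished task ([0-9.]+) in stage ([0-9.]+) \(TID ([0-9]+)\) in ([0-9]+) ms.*/\4 ms  stage \2  task \1  TID \3/' \
    | sort -rn
}
```

Each output line starts with the duration in milliseconds, followed by the stage, task, and TID taken from the matching log entry.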