spark-user mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Re: DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s Killed
Date Tue, 26 Jul 2016 12:01:38 GMT
Hi,

Is there anything relevant in the ApplicationMaster's log? What about the
executors? You should have 2 (the default), so review the logs of each
executor.
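
To make the sizes involved concrete, here is a rough back-of-the-envelope
sketch of the memory that the submit command quoted below asks for. It
assumes Spark 1.6 defaults on YARN that the thread does not confirm: 2
executors, and a per-container overhead of max(384 MB, 10% of the heap).
The helper name container_mb is made up for illustration.

```python
# Rough estimate of the memory requested by the quoted spark-submit flags.
# Assumptions (not confirmed in the thread): Spark 1.6 defaults on YARN,
# i.e. 2 executors and an overhead of max(384 MB, 10% of the heap).

def container_mb(heap_mb, overhead_fraction=0.10, min_overhead_mb=384):
    """Return heap plus the assumed default off-heap overhead, in MB."""
    overhead = max(min_overhead_mb, int(heap_mb * overhead_fraction))
    return heap_mb + overhead

executor_heap_mb = 16 * 1024   # --executor-memory 16g
driver_heap_mb = 64 * 1024     # --driver-memory 64G
num_executors = 2              # assumed default, no --num-executors given

per_executor = container_mb(executor_heap_mb)
total_executors = num_executors * per_executor

print("per executor container (MB):", per_executor)
print("all executors (MB):", total_executors)
print("driver heap (MB):", driver_heap_mb)
```

If the machine running the driver cannot actually provide a 64 GB heap, a
bare "Killed" with no Spark error could come from the operating system
rather than from Spark, so the system logs on that host may be worth a look
as well. That is only a guess until the logs say otherwise.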

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Tue, Jul 26, 2016 at 1:17 PM, Ascot Moss <ascot.moss@gmail.com> wrote:
> It is a YARN cluster:
>
> /bin/spark-submit \
> --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:+PrintGCTimeStamps -XX:+PrintGCDetails" \
> --driver-memory 64G \
> --executor-memory 16g \
>
> On Tue, Jul 26, 2016 at 7:00 PM, Jacek Laskowski <jacek@japila.pl> wrote:
>>
>> Hi,
>>
>> What's the cluster manager? Is this YARN perhaps? Do you have any
>> other apps on the cluster? How do you submit your app? What are the
>> properties?
>>
>> Regards,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Tue, Jul 26, 2016 at 1:27 AM, Ascot Moss <ascot.moss@gmail.com> wrote:
>> > Hi,
>> >
>> > spark: 1.6.1
>> > java: java 1.8_u40
>> > I am running the random forest training phase. The same code works well
>> > with 20 trees (lower accuracy, about 68%). When I increased it to 200
>> > trees, the console output ended with:
>> >
>> > "DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651,
>> > took 19.556700 s Killed".  There is no WARN or ERROR on the console; the
>> > job just stops at the end.
>> >
>> > Any idea how to resolve it? Should the timeout parameter be set to a
>> > longer value?
>> >
>> > regards
>> >
>> >
>> > (below is the log from console)
>> >
>> > 16/07/26 00:02:47 INFO DAGScheduler: looking for newly runnable stages
>> > 16/07/26 00:02:47 INFO DAGScheduler: running: Set()
>> > 16/07/26 00:02:47 INFO DAGScheduler: waiting: Set(ResultStage 32)
>> > 16/07/26 00:02:47 INFO DAGScheduler: failed: Set()
>> > 16/07/26 00:02:47 INFO DAGScheduler: Submitting ResultStage 32 (MapPartitionsRDD[75] at map at DecisionTree.scala:642), which has no missing parents
>> > 16/07/26 00:02:47 INFO MemoryStore: Block broadcast_48 stored as values in memory (estimated size 2.2 MB, free 18.2 MB)
>> > 16/07/26 00:02:47 INFO MemoryStore: Block broadcast_48_piece0 stored as bytes in memory (estimated size 436.9 KB, free 18.7 MB)
>> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:35450 (size: 436.9 KB, free: 45.8 GB)
>> > 16/07/26 00:02:47 INFO SparkContext: Created broadcast 48 from broadcast at DAGScheduler.scala:1006
>> > 16/07/26 00:02:47 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 32 (MapPartitionsRDD[75] at map at DecisionTree.scala:642)
>> > 16/07/26 00:02:47 INFO TaskSchedulerImpl: Adding task set 32.0 with 4 tasks
>> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 0.0 in stage 32.0 (TID 185, x.x.x.x, partition 0,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 1.0 in stage 32.0 (TID 186, x.x.x.x, partition 1,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 2.0 in stage 32.0 (TID 187, x.x.x.x, partition 2,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 3.0 in stage 32.0 (TID 188, x.x.x.x, partition 3,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:58784 (size: 436.9 KB, free: 5.1 GB)
>> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:44434
>> > 16/07/26 00:02:47 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 12 is 180 bytes
>> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:46186 (size: 436.9 KB, free: 2.2 GB)
>> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:50132 (size: 436.9 KB, free: 5.0 GB)
>> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:47272
>> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:46802
>> > 16/07/26 00:02:49 INFO TaskSetManager: Finished task 2.0 in stage 32.0 (TID 187) in 2265 ms on x.x.x.x (1/4)
>> > 16/07/26 00:02:49 INFO TaskSetManager: Finished task 1.0 in stage 32.0 (TID 186) in 2266 ms on x.x.x.x (2/4)
>> > 16/07/26 00:02:50 INFO TaskSetManager: Finished task 0.0 in stage 32.0 (TID 185) in 2794 ms on x.x.x.x (3/4)
>> > 16/07/26 00:02:50 INFO TaskSetManager: Finished task 3.0 in stage 32.0 (TID 188) in 3738 ms on x.x.x.x (4/4)
>> > 16/07/26 00:02:50 INFO TaskSchedulerImpl: Removed TaskSet 32.0, whose tasks have all completed, from pool
>> > 16/07/26 00:02:50 INFO DAGScheduler: ResultStage 32 (collectAsMap at DecisionTree.scala:651) finished in 3.738 s
>> > 16/07/26 00:02:50 INFO DAGScheduler: Job 19 finished: collectAsMap at DecisionTree.scala:651, took 19.493917 s
>> >
>> > 16/07/26 00:02:51 INFO MemoryStore: Block broadcast_49 stored as values in memory (estimated size 1053.9 KB, free 19.7 MB)
>> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_49_piece0 stored as bytes in memory (estimated size 626.7 KB, free 20.3 MB)
>> > 16/07/26 00:02:52 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:35450 (size: 626.7 KB, free: 45.8 GB)
>> > 16/07/26 00:02:52 INFO SparkContext: Created broadcast 49 from broadcast at DecisionTree.scala:601
>> > 16/07/26 00:02:52 INFO SparkContext: Starting job: collectAsMap at DecisionTree.scala:651
>> > 16/07/26 00:02:52 INFO DAGScheduler: Registering RDD 76 (mapPartitions at DecisionTree.scala:622)
>> > 16/07/26 00:02:52 INFO DAGScheduler: Got job 20 (collectAsMap at DecisionTree.scala:651) with 4 output partitions
>> > 16/07/26 00:02:52 INFO DAGScheduler: Final stage: ResultStage 34 (collectAsMap at DecisionTree.scala:651)
>> > 16/07/26 00:02:52 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 33)
>> > 16/07/26 00:02:52 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 33)
>> > 16/07/26 00:02:52 INFO DAGScheduler: Submitting ShuffleMapStage 33 (MapPartitionsRDD[76] at mapPartitions at DecisionTree.scala:622), which has no missing parents
>> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_50 stored as values in memory (estimated size 10.0 MB, free 30.3 MB)
>> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_50_piece0 stored as bytes in memory (estimated size 2.9 MB, free 33.2 MB)
>> > 16/07/26 00:02:52 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:35450 (size: 2.9 MB, free: 45.8 GB)
>> > 16/07/26 00:02:52 INFO SparkContext: Created broadcast 50 from broadcast at DAGScheduler.scala:1006
>> > 16/07/26 00:02:52 INFO DAGScheduler: Submitting 4 missing tasks from ShuffleMapStage 33 (MapPartitionsRDD[76] at mapPartitions at DecisionTree.scala:622)
>> > 16/07/26 00:02:52 INFO TaskSchedulerImpl: Adding task set 33.0 with 4 tasks
>> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 1.0 in stage 33.0 (TID 189, x.x.x.x, partition 1,PROCESS_LOCAL, 2333 bytes)
>> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 0.0 in stage 33.0 (TID 190, x.x.x.x, partition 0,PROCESS_LOCAL, 2333 bytes)
>> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 2.0 in stage 33.0 (TID 191, x.x.x.x, partition 2,PROCESS_LOCAL, 2333 bytes)
>> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 3.0 in stage 33.0 (TID 192, x.x.x.x, partition 3,PROCESS_LOCAL, 2333 bytes)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:58784 (size: 2.9 MB, free: 5.0 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:58784 (size: 626.7 KB, free: 5.0 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:46186 (size: 2.9 MB, free: 2.2 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:50132 (size: 2.9 MB, free: 5.0 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:46186 (size: 626.7 KB, free: 2.2 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:50132 (size: 626.7 KB, free: 5.0 GB)
>> > 16/07/26 00:02:57 INFO TaskSetManager: Finished task 0.0 in stage 33.0 (TID 190) in 4212 ms on x.x.x.x (1/4)
>> > 16/07/26 00:02:57 INFO TaskSetManager: Finished task 1.0 in stage 33.0 (TID 189) in 4989 ms on x.x.x.x (2/4)
>> > 16/07/26 00:03:07 INFO TaskSetManager: Finished task 2.0 in stage 33.0 (TID 191) in 14934 ms on x.x.x.x (3/4)
>> > 16/07/26 00:03:07 INFO TaskSetManager: Finished task 3.0 in stage 33.0 (TID 192) in 15172 ms on x.x.x.x (4/4)
>> > 16/07/26 00:03:07 INFO TaskSchedulerImpl: Removed TaskSet 33.0, whose tasks have all completed, from pool
>> > 16/07/26 00:03:07 INFO DAGScheduler: ShuffleMapStage 33 (mapPartitions at DecisionTree.scala:622) finished in 15.173 s
>> >
>> > 16/07/26 00:03:07 INFO DAGScheduler: looking for newly runnable stages
>> > 16/07/26 00:03:07 INFO DAGScheduler: running: Set()
>> > 16/07/26 00:03:07 INFO DAGScheduler: waiting: Set(ResultStage 34)
>> > 16/07/26 00:03:07 INFO DAGScheduler: failed: Set()
>> > 16/07/26 00:03:07 INFO DAGScheduler: Submitting ResultStage 34 (MapPartitionsRDD[78] at map at DecisionTree.scala:642), which has no missing parents
>> > 16/07/26 00:03:08 INFO MemoryStore: Block broadcast_51 stored as values in memory (estimated size 2.2 MB, free 35.4 MB)
>> > 16/07/26 00:03:08 INFO MemoryStore: Block broadcast_51_piece0 stored as bytes in memory (estimated size 444.7 KB, free 35.8 MB)
>> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:35450 (size: 444.7 KB, free: 45.8 GB)
>> > 16/07/26 00:03:08 INFO SparkContext: Created broadcast 51 from broadcast at DAGScheduler.scala:1006
>> > 16/07/26 00:03:08 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 34 (MapPartitionsRDD[78] at map at DecisionTree.scala:642)
>> > 16/07/26 00:03:08 INFO TaskSchedulerImpl: Adding task set 34.0 with 4 tasks
>> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 0.0 in stage 34.0 (TID 193, x.x.x.x, partition 0,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 1.0 in stage 34.0 (TID 194, x.x.x.x, partition 1,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 2.0 in stage 34.0 (TID 195, x.x.x.x, partition 2,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 3.0 in stage 34.0 (TID 196, x.x.x.x, partition 3,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:58784 (size: 444.7 KB, free: 5.0 GB)
>> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:44434
>> > 16/07/26 00:03:08 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 13 is 180 bytes
>> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:46186 (size: 444.7 KB, free: 2.2 GB)
>> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:47272
>> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:50132 (size: 444.7 KB, free: 5.0 GB)
>> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:46802
>> > 16/07/26 00:03:10 INFO TaskSetManager: Finished task 1.0 in stage 34.0 (TID 194) in 2240 ms on x.x.x.x (1/4)
>> > 16/07/26 00:03:10 INFO TaskSetManager: Finished task 0.0 in stage 34.0 (TID 193) in 2749 ms on x.x.x.x (2/4)
>> > 16/07/26 00:03:11 INFO TaskSetManager: Finished task 2.0 in stage 34.0 (TID 195) in 3818 ms on x.x.x.x (3/4)
>> > 16/07/26 00:03:11 INFO TaskSetManager: Finished task 3.0 in stage 34.0 (TID 196) in 3901 ms on x.x.x.x (4/4)
>> > 16/07/26 00:03:11 INFO DAGScheduler: ResultStage 34 (collectAsMap at DecisionTree.scala:651) finished in 3.902 s
>> > 16/07/26 00:03:11 INFO TaskSchedulerImpl: Removed TaskSet 34.0, whose tasks have all completed, from pool
>> > 16/07/26 00:03:11 INFO DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s
>> > Killed
>> >
>> >
>> >
>> >
>
>
