spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Hamstra <m...@clearstorydata.com>
Subject Re: foreachPartition
Date Fri, 30 Oct 2015 23:45:58 GMT
The closure is sent to and executed an Executor, so you need to be looking
at the stdout of the Executors, not on the Driver.

On Fri, Oct 30, 2015 at 4:42 PM, Alex Nastetsky <
alex.nastetsky@vervemobile.com> wrote:

> I'm just trying to do some operation inside foreachPartition, but I can't
> even get a simple println to work. Nothing gets printed.
>
> scala> val a = sc.parallelize(List(1,2,3))
> a: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[2] at parallelize
> at <console>:21
>
> scala> a.foreachPartition(p => println("foo"))
> 2015-10-30 23:38:54,643 INFO  [main] spark.SparkContext
> (Logging.scala:logInfo(59)) - Starting job: foreachPartition at <console>:24
> 2015-10-30 23:38:54,644 INFO  [dag-scheduler-event-loop]
> scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Got job 9
> (foreachPartition at <console>:24) with 3 output partitions
> (allowLocal=false)
> 2015-10-30 23:38:54,644 INFO  [dag-scheduler-event-loop]
> scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Final stage:
> ResultStage 9(foreachPartition at <console>:24)
> 2015-10-30 23:38:54,645 INFO  [dag-scheduler-event-loop]
> scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Parents of final
> stage: List()
> 2015-10-30 23:38:54,645 INFO  [dag-scheduler-event-loop]
> scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Missing parents: List()
> 2015-10-30 23:38:54,646 INFO  [dag-scheduler-event-loop]
> scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Submitting ResultStage
> 9 (ParallelCollectionRDD[2] at parallelize at <console>:21), which has no
> missing parents
> 2015-10-30 23:38:54,648 INFO  [dag-scheduler-event-loop]
> storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(1224)
> called with curMem=14486, maxMem=280496701
> 2015-10-30 23:38:54,649 INFO  [dag-scheduler-event-loop]
> storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_9 stored
> as values in memory (estimated size 1224.0 B, free 267.5 MB)
> 2015-10-30 23:38:54,680 INFO  [dag-scheduler-event-loop]
> storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(871)
> called with curMem=15710, maxMem=280496701
> 2015-10-30 23:38:54,681 INFO  [dag-scheduler-event-loop]
> storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_9_piece0
> stored as bytes in memory (estimated size 871.0 B, free 267.5 MB)
> 2015-10-30 23:38:54,685 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Added broadcast_9_piece0 in memory on
> 10.170.11.94:35814 (size: 871.0 B, free: 267.5 MB)
> 2015-10-30 23:38:54,688 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_4_piece0 on
> 10.170.11.94:35814 in memory (size: 2.0 KB, free: 267.5 MB)
> 2015-10-30 23:38:54,691 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_4_piece0 on
> ip-10-51-144-180.ec2.internal:35111 in memory (size: 2.0 KB, free: 535.0 MB)
> 2015-10-30 23:38:54,691 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_4_piece0 on
> ip-10-51-144-180.ec2.internal:49833 in memory (size: 2.0 KB, free: 535.0 MB)
> 2015-10-30 23:38:54,694 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_4_piece0 on
> ip-10-51-144-180.ec2.internal:34776 in memory (size: 2.0 KB, free: 535.0 MB)
> 2015-10-30 23:38:54,702 INFO  [dag-scheduler-event-loop]
> spark.SparkContext (Logging.scala:logInfo(59)) - Created broadcast 9 from
> broadcast at DAGScheduler.scala:874
> 2015-10-30 23:38:54,703 INFO  [dag-scheduler-event-loop]
> scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Submitting 3 missing
> tasks from ResultStage 9 (ParallelCollectionRDD[2] at parallelize at
> <console>:21)
> 2015-10-30 23:38:54,703 INFO  [dag-scheduler-event-loop]
> cluster.YarnScheduler (Logging.scala:logInfo(59)) - Adding task set 9.0
> with 3 tasks
> 2015-10-30 23:38:54,708 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] scheduler.TaskSetManager
> (Logging.scala:logInfo(59)) - Starting task 0.0 in stage 9.0 (TID 27,
> ip-10-51-144-180.ec2.internal, PROCESS_LOCAL, 1313 bytes)
> 2015-10-30 23:38:54,711 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] scheduler.TaskSetManager
> (Logging.scala:logInfo(59)) - Starting task 1.0 in stage 9.0 (TID 28,
> ip-10-51-144-180.ec2.internal, PROCESS_LOCAL, 1313 bytes)
> 2015-10-30 23:38:54,713 INFO
>  [sparkDriver-akka.actor.default-dispatcher-2] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_5_piece0 on
> 10.170.11.94:35814 in memory (size: 802.0 B, free: 267.5 MB)
> 2015-10-30 23:38:54,714 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] scheduler.TaskSetManager
> (Logging.scala:logInfo(59)) - Starting task 2.0 in stage 9.0 (TID 29,
> ip-10-51-144-180.ec2.internal, PROCESS_LOCAL, 1313 bytes)
> 2015-10-30 23:38:54,716 INFO
>  [sparkDriver-akka.actor.default-dispatcher-2] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_5_piece0 on
> ip-10-51-144-180.ec2.internal:34776 in memory (size: 802.0 B, free: 535.0
> MB)
> 2015-10-30 23:38:54,719 INFO
>  [sparkDriver-akka.actor.default-dispatcher-2] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_5_piece0 on
> ip-10-51-144-180.ec2.internal:35111 in memory (size: 802.0 B, free: 535.0
> MB)
> 2015-10-30 23:38:54,723 INFO
>  [sparkDriver-akka.actor.default-dispatcher-2] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_5_piece0 on
> ip-10-51-144-180.ec2.internal:49833 in memory (size: 802.0 B, free: 535.0
> MB)
> 2015-10-30 23:38:54,743 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_6_piece0 on
> 10.170.11.94:35814 in memory (size: 755.0 B, free: 267.5 MB)
> 2015-10-30 23:38:54,750 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Added broadcast_9_piece0 in memory on
> ip-10-51-144-180.ec2.internal:35111 (size: 871.0 B, free: 535.0 MB)
> 2015-10-30 23:38:54,754 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Added broadcast_9_piece0 in memory on
> ip-10-51-144-180.ec2.internal:34776 (size: 871.0 B, free: 535.0 MB)
> 2015-10-30 23:38:54,756 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Added broadcast_9_piece0 in memory on
> ip-10-51-144-180.ec2.internal:49833 (size: 871.0 B, free: 535.0 MB)
> 2015-10-30 23:38:54,758 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_6_piece0 on
> ip-10-51-144-180.ec2.internal:35111 in memory (size: 755.0 B, free: 535.0
> MB)
> 2015-10-30 23:38:54,759 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_6_piece0 on
> ip-10-51-144-180.ec2.internal:49833 in memory (size: 755.0 B, free: 535.0
> MB)
> 2015-10-30 23:38:54,760 INFO
>  [sparkDriver-akka.actor.default-dispatcher-15] storage.BlockManagerInfo
> (Logging.scala:logInfo(59)) - Removed broadcast_6_piece0 on
> ip-10-51-144-180.ec2.internal:34776 in memory (size: 755.0 B, free: 535.0
> MB)
> 2015-10-30 23:38:54,777 INFO  [task-result-getter-1]
> scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Finished task 1.0 in
> stage 9.0 (TID 28) in 68 ms on ip-10-51-144-180.ec2.internal (1/3)
> 2015-10-30 23:38:54,783 INFO  [task-result-getter-3]
> scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Finished task 2.0 in
> stage 9.0 (TID 29) in 70 ms on ip-10-51-144-180.ec2.internal (2/3)
> 2015-10-30 23:38:54,785 INFO  [task-result-getter-2]
> scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Finished task 0.0 in
> stage 9.0 (TID 27) in 81 ms on ip-10-51-144-180.ec2.internal (3/3)
> 2015-10-30 23:38:54,785 INFO  [task-result-getter-2] cluster.YarnScheduler
> (Logging.scala:logInfo(59)) - Removed TaskSet 9.0, whose tasks have all
> completed, from pool
> 2015-10-30 23:38:54,786 INFO  [dag-scheduler-event-loop]
> scheduler.DAGScheduler (Logging.scala:logInfo(59)) - ResultStage 9
> (foreachPartition at <console>:24) finished in 0.083 s
> 2015-10-30 23:38:54,786 INFO  [main] scheduler.DAGScheduler
> (Logging.scala:logInfo(59)) - Job 9 finished: foreachPartition at
> <console>:24, took 0.143089 s
>
>

Mime
View raw message