spark-user mailing list archives

From Takeshi Yamamuro <linguin....@gmail.com>
Subject Re: GC overhead limit exceeded
Date Mon, 16 May 2016 15:21:24 GMT
To understand the issue, we need more details about your case:
which version of Spark are you using, and what does your job do?
Also, what happens if you use the Scala interfaces directly instead of the Python ones?
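
As a rough illustration of the earlier suggestion in this thread to increase the number of partitions, here is a minimal sketch of how one might pick a partition count from the data size. The helper name `suggested_partitions` and the 128 MiB target partition size are assumptions made for this example, not values taken from the job or recommended by Spark for this case.

```python
import math

# Assumed target size per partition (hypothetical; tune for the actual job).
DEFAULT_TARGET_BYTES = 128 * 1024**2  # 128 MiB


def suggested_partitions(data_bytes, target_partition_bytes=DEFAULT_TARGET_BYTES):
    """Return a partition count such that each partition holds at most
    target_partition_bytes of input, with a minimum of one partition."""
    if data_bytes < 0:
        raise ValueError("data_bytes must be non-negative")
    if target_partition_bytes <= 0:
        raise ValueError("target_partition_bytes must be positive")
    return max(1, math.ceil(data_bytes / target_partition_bytes))


# About 5 GiB of input, as mentioned in the thread.
n = suggested_partitions(5 * 1024**3)
print(n)

# With a PySpark DataFrame `df` (not defined here), the count could then be
# applied with the existing DataFrame.repartition method:
# df = df.repartition(n)
```

Smaller partitions mean each task holds fewer rows in memory at once, which may reduce GC pressure; whether it helps here depends on what the job actually does.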

On Mon, May 16, 2016 at 11:56 PM, Aleksandr Modestov <
aleksandrmodestov@gmail.com> wrote:

> Hi,
>
> "Why did you think you have enough memory for your task? Did you check the
> task statistics in your WebUI?" I mean that I have only about 5 GB of data,
> but spark.driver.memory is 60 GB. I did check the task statistics in the web UI.
> But Spark actually reports:
> *"05-16 17:50:06.254 127.0.0.1:54321       1534
> #e Thread WARN: Swapping!  GC CALLBACK, (K/V:29.74 GB + POJO:18.97 GB +
> FREE:8.79 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM! Exception in
> thread "Heartbeat" java.lang.OutOfMemoryError: Java heap space"*
> So why doesn't Spark spill the data to disk?
>
> On Mon, May 16, 2016 at 5:11 PM, Takeshi Yamamuro <linguin.m.s@gmail.com>
> wrote:
>
>> Hi,
>>
>> Why did you think you have enough memory for your task? Did you check the
>> task statistics in your WebUI?
>> Anyway, if you get stuck with the GC issue, you'd be better off increasing
>> the number of partitions.
>>
>> // maropu
>>
>> On Mon, May 16, 2016 at 10:00 PM, AlexModestov <
>> AleksandrModestov@gmail.com> wrote:
>>
>>> I get the following error in Apache Spark...
>>>
>>> "spark.driver.memory 60g
>>> spark.python.worker.memory 60g
>>> spark.master local[*]"
>>>
>>> The amount of data is about 5 GB, but Spark reports "GC overhead limit
>>> exceeded". I believe my conf file provides enough resources.
>>>
>>> "16/05/16 15:13:02 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@87576f9,BlockManagerId(driver, localhost, 59407))] in 1 attempts
>>> org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
>>>         at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
>>>         at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
>>>         at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
>>>         at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
>>>         at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
>>>         at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
>>>         at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449)
>>>         at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
>>>         at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>>>         at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>>>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
>>>         at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470)
>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
>>>         at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>>>         at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>>>         at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>>>         at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>>>         at scala.concurrent.Await$.result(package.scala:107)
>>>         at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
>>>         ... 14 more
>>> 16/05/16 15:13:02 WARN NettyRpcEnv: Ignored message: HeartbeatResponse(false)
>>> 05-16 15:13:26.398 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.74 GB + FREE:11.03 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> 05-16 15:13:44.528 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.86 GB + FREE:10.90 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> 05-16 15:13:56.847 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.88 GB + FREE:10.88 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> 05-16 15:14:10.215 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.90 GB + FREE:10.86 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> 05-16 15:14:33.622 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.91 GB + FREE:10.85 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> 05-16 15:14:47.075 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> 05-16 15:15:10.555 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.92 GB + FREE:10.84 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> 05-16 15:15:25.520 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> 05-16 15:15:39.087 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.84 GB == MEM_MAX:57.50 GB), desiredKV=7.19 GB OOM!
>>> Exception in thread "HashSessionScavenger-0" java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>         at java.util.concurrent.ConcurrentHashMap$ValuesView.iterator(ConcurrentHashMap.java:4683)
>>>         at org.eclipse.jetty.server.session.HashSessionManager.scavenge(HashSessionManager.java:314)
>>>         at org.eclipse.jetty.server.session.HashSessionManager$2.run(HashSessionManager.java:285)
>>>         at java.util.TimerThread.mainLoop(Timer.java:555)
>>>         at java.util.TimerThread.run(Timer.java:505)
>>> 16/05/16 15:22:26 ERROR Executor: Exception in task 0.0 in stage 10.0 (TID 107)
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>         at java.lang.Double.valueOf(Double.java:519)
>>>         at scala.runtime.BoxesRunTime.boxToDouble(BoxesRunTime.java:84)
>>>         at org.apache.spark.sql.catalyst.expressions.MutableRow.setDouble(rows.scala:176)
>>>         at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
>>>         at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>>         at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>>         at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
>>>         at org.spark-project.guava.collect.Ordering.leastOf(Ordering.java:665)
>>>         at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37)
>>>         at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1391)
>>>         at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1388)
>>>         at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>>>         at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>>>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> 05-16 15:22:26.947 127.0.0.1:54321       2059   #e Thread WARN: Unblock allocations; cache below desired, but also OOM: GC CALLBACK, (K/V:29.74 GB + POJO:16.93 GB + FREE:10.83 GB == MEM_MAX:57.50 GB), desiredKV=38.52 GB OOM!
>>> 05-16 15:22:26.948 127.0.0.1:54321       2059   #e Thread WARN: Swapping! GC CALLBACK, (K/V:29.74 GB + POJO:14.94 GB + FREE:12.83 GB == MEM_MAX:57.50 GB), desiredKV=8.65 GB OOM!
>>> 16/05/16 15:22:26 WARN HeartbeatReceiver: Removing executor driver with
>>> no
>>> recent heartbeats: 144662 ms exceeds timeout 120000 ms
>>> 16/05/16 15:22:26 ERROR ActorSystemImpl: exception on LARS’ timer thread
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>         at
>>> akka.dispatch.AbstractNodeQueue.<init>(AbstractNodeQueue.java:22)
>>>         at
>>>
>>> akka.actor.LightArrayRevolverScheduler$TaskQueue.<init>(Scheduler.scala:443)
>>>         at
>>>
>>> akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
>>>         at
>>> akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> 16/05/16 15:22:26 INFO ActorSystemImpl: starting new LARS thread
>>> 16/05/16 15:22:26 ERROR TaskSchedulerImpl: Lost executor driver on
>>> localhost: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 3.0 in stage 10.0 (TID
>>> 110,
>>> localhost): ExecutorLostFailure (executor driver exited caused by one of
>>> the
>>> running tasks) Reason: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 ERROR TaskSetManager: Task 3 in stage 10.0 failed 1
>>> times;
>>> aborting job
>>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 6.0 in stage 10.0 (TID
>>> 113,
>>> localhost): ExecutorLostFailure (executor driver exited caused by one of
>>> the
>>> running tasks) Reason: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 0.0 in stage 10.0 (TID
>>> 107,
>>> localhost): ExecutorLostFailure (executor driver exited caused by one of
>>> the
>>> running tasks) Reason: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 2.0 in stage 10.0 (TID
>>> 109,
>>> localhost): ExecutorLostFailure (executor driver exited caused by one of
>>> the
>>> running tasks) Reason: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 5.0 in stage 10.0 (TID
>>> 112,
>>> localhost): ExecutorLostFailure (executor driver exited caused by one of
>>> the
>>> running tasks) Reason: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 7.0 in stage 10.0 (TID
>>> 114,
>>> localhost): ExecutorLostFailure (executor driver exited caused by one of
>>> the
>>> running tasks) Reason: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 1.0 in stage 10.0 (TID
>>> 108,
>>> localhost): ExecutorLostFailure (executor driver exited caused by one of
>>> the
>>> running tasks) Reason: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 WARN TaskSetManager: Lost task 4.0 in stage 10.0 (TID
>>> 111,
>>> localhost): ExecutorLostFailure (executor driver exited caused by one of
>>> the
>>> running tasks) Reason: Executor heartbeat timed out after 144662 ms
>>> 16/05/16 15:22:26 INFO TaskSchedulerImpl: Removed TaskSet 10.0, whose
>>> tasks
>>> have all completed, from pool
>>> 16/05/16 15:22:26 ERROR ActorSystemImpl: Uncaught fatal error from thread
>>> [sparkDriverActorSystem-scheduler-1] shutting down ActorSystem
>>> [sparkDriverActorSystem]
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>         at
>>> akka.dispatch.AbstractNodeQueue.<init>(AbstractNodeQueue.java:22)
>>>         at
>>>
>>> akka.actor.LightArrayRevolverScheduler$TaskQueue.<init>(Scheduler.scala:443)
>>>         at
>>>
>>> akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
>>>         at
>>> akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> 16/05/16 15:22:27 INFO RemoteActorRefProvider$RemotingTerminator:
>>> Shutting
>>> down remote daemon.
>>> 16/05/16 15:22:27 INFO RemoteActorRefProvider$RemotingTerminator: Remote
>>> daemon shut down; proceeding with flushing remote transports.
>>> 16/05/16 15:22:27 WARN NettyRpcEnv: Ignored message: true
>>> 16/05/16 15:22:27 WARN NettyRpcEnv: Ignored message: true
>>> 16/05/16 15:22:27 ERROR SparkUncaughtExceptionHandler: Uncaught
>>> exception in
>>> thread Thread[Executor task launch worker-14,5,main]
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>         at java.lang.Double.valueOf(Double.java:519)
>>>         at scala.runtime.BoxesRunTime.boxToDouble(BoxesRunTime.java:84)
>>>         at
>>>
>>> org.apache.spark.sql.catalyst.expressions.MutableRow.setDouble(rows.scala:176)
>>>         at
>>>
>>> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown
>>> Source)
>>>         at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>>         at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>>         at
>>> scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
>>>         at
>>> org.spark-project.guava.collect.Ordering.leastOf(Ordering.java:665)
>>>         at
>>> org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37)
>>>         at
>>>
>>> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1391)
>>>         at
>>>
>>> org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$29.apply(RDD.scala:1388)
>>>         at
>>>
>>> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>>>         at
>>>
>>> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>>>         at
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>>         at
>>> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>>         at
>>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>>         at
>>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>>         at
>>>
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>         at
>>>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> 16/05/16 15:22:27 INFO TaskSchedulerImpl: Cancelling stage 10
>>> 16/05/16 15:22:27 WARN SparkContext: Killing executors is only supported
>>> in
>>> coarse-grained mode
>>> 16/05/16 15:22:27 INFO DAGScheduler: ResultStage 10 (head at
>>> <ipython-input-13-f753ebdb6b0f>:13) failed in 667.824 s
>>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat
>>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with
>>> master
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register
>>> BlockManager
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager
>>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master.
>>> 16/05/16 15:22:27 INFO DAGScheduler: Executor lost: driver (epoch 2)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Trying to remove
>>> executor
>>> driver from BlockManagerMaster.
>>> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Removing block manager
>>> BlockManagerId(driver, localhost, 59407)
>>> 16/05/16 15:22:27 INFO DAGScheduler: Job 8 failed: head at
>>> <ipython-input-13-f753ebdb6b0f>:13, took 667.845630 s
>>> 16/05/16 15:22:27 ERROR BlockManager: Failed to report
>>> broadcast_15_piece0
>>> to master; giving up.
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Removed driver successfully in
>>> removeExecutor
>>> 16/05/16 15:22:27 INFO DAGScheduler: Host added was in lost list earlier:
>>> localhost
>>> 16/05/16 15:22:27 INFO SparkContext: Invoking stop() from shutdown hook
>>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat
>>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with
>>> master
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register
>>> BlockManager
>>> 16/05/16 15:22:27 INFO BlockManagerMasterEndpoint: Registering block
>>> manager
>>> localhost:59407 with 51.5 GB RAM, BlockManagerId(driver, localhost,
>>> 59407)
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager
>>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master.
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in
>>> memory
>>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat
>>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with
>>> master
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register
>>> BlockManager
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager
>>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master.
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in
>>> memory
>>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat
>>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with
>>> master
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register
>>> BlockManager
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager
>>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master.
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in
>>> memory
>>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat
>>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with
>>> master
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register
>>> BlockManager
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager
>>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master.
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in
>>> memory
>>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO Executor: Told to re-register on heartbeat
>>> 16/05/16 15:22:27 INFO BlockManager: BlockManager re-registering with
>>> master
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Trying to register
>>> BlockManager
>>> 16/05/16 15:22:27 INFO BlockManagerMaster: Registered BlockManager
>>> 16/05/16 15:22:27 INFO BlockManager: Reporting 8 blocks to the master.
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_7_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO SparkUI: Stopped Spark web UI at
>>> http://192.168.107.30:4040
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_15_piece0 in
>>> memory
>>> on localhost:59407 (size: 8.2 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_13_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:27 INFO BlockManagerInfo: Added broadcast_14_piece0 in
>>> memory
>>> on localhost:59407 (size: 19.3 KB, free: 51.5 GB)
>>> 16/05/16 15:22:56 INFO MapOutputTrackerMasterEndpoint:
>>> MapOutputTrackerMasterEndpoint stopped!
>>> 05-16 15:22:56.111 127.0.0.1:54321       2059   #e Thread WARN:
>>> Swapping!
>>> GC CALLBACK, (K/V:29.74 GB + POJO:15.20 GB + FREE:12.56 GB ==
>>> MEM_MAX:57.50
>>> GB), desiredKV=8.12 GB OOM!
>>> 16/05/16 15:22:56 INFO RemoteActorRefProvider$RemotingTerminator:
>>> Remoting
>>> shut down.
>>> 16/05/16 15:22:56 WARN NettyRpcEndpointRef: Error sending message
>>> [message =
>>> Heartbeat(driver,[Lscala.Tuple2;@797268e9,BlockManagerId(driver,
>>> localhost,
>>> 59407))] in 1 attempts
>>> org.apache.spark.SparkException: Could not find HeartbeatReceiver or it
>>> has
>>> been stopped.
>>>         at
>>> org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:161)
>>>         at
>>>
>>> org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:126)
>>>         at
>>> org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:227)
>>>         at
>>> org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:511)
>>>         at
>>>
>>> org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:100)
>>>         at
>>> org.apache.spark.executor.Executor.org
>>> $apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:449)
>>>         at
>>>
>>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
>>>         at
>>>
>>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>>>         at
>>>
>>> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>>>         at
>>> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
>>>         at
>>> org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470)
>>>         at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>         at
>>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>>         at
>>>
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>>         at
>>>
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>>         at
>>>
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>         at
>>>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>         at java.lang.Thread.run(Thread.java:745)"
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/GC-overhead-limit-exceeded-tp26966.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>>
>>>
>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>
>


-- 
---
Takeshi Yamamuro
