spark-issues mailing list archives

From "Thomas Graves (JIRA)" <>
Subject [jira] [Commented] (SPARK-6449) Driver OOM results in reported application result SUCCESS
Date Mon, 23 Mar 2015 13:56:11 GMT


Thomas Graves commented on SPARK-6449:

[~rdub] Was there an exception higher up in the log? I'm wondering whether it shows the entire
exception for the out-of-memory error.

> Driver OOM results in reported application result SUCCESS
> ---------------------------------------------------------
>                 Key: SPARK-6449
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.3.0
>            Reporter: Ryan Williams
> I ran a job yesterday that, according to the History Server and YARN RM, finished with status {{SUCCESS}}.
> Clicking around on the History Server UI, I saw that too few stages had run, and I couldn't figure out why.
> Finally, inspecting the end of the driver's logs, I saw:
> {code}
> 15/03/20 15:08:13 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
> 15/03/20 15:08:13 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
> 15/03/20 15:08:13 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
> 15/03/20 15:08:13 INFO spark.SparkContext: Successfully stopped SparkContext
> Exception in thread "Driver" scala.MatchError: java.lang.OutOfMemoryError: GC overhead limit exceeded (of class java.lang.OutOfMemoryError)
>         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$
> 15/03/20 15:08:13 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0, (reason: Shutdown hook called before final status was reported.)
> 15/03/20 15:08:13 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED (diag message: Shutdown hook called before final status was reported.)
> 15/03/20 15:08:13 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut
> 15/03/20 15:08:13 INFO impl.AMRMClientImpl: Waiting for application to be successfully
> 15/03/20 15:08:13 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1426705269584_0055
> {code}
> The driver OOM'd, [the {{catch}} block that presumably should have caught it|] threw a {{MatchError}}, and then {{SUCCESS}} was returned to YARN and written to the event
> This should be logged as a failed job and reported as such to YARN.
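
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the Spark source: the object name MatchErrorSketch and both classify functions are hypothetical, made up for illustration. It assumes the handler inspected the thrown value with an explicit {{match}} that only had cases for exceptions, so an {{OutOfMemoryError}} (an {{Error}}, not an {{Exception}}) fell through and raised {{scala.MatchError}} instead of being handled.

```scala
object MatchErrorSketch {
  // Hypothetical handler: the match covers InterruptedException and Exception
  // only, so any java.lang.Error (e.g. OutOfMemoryError) matches no case and
  // the match expression itself throws scala.MatchError.
  def classifyNonExhaustive(t: Throwable): String = t match {
    case _: InterruptedException => "interrupted"
    case _: Exception            => "failed"
  }

  // Same handler with a catch-all case, so Errors are classified as failures
  // instead of escaping as a MatchError.
  def classifyExhaustive(t: Throwable): String = t match {
    case _: InterruptedException => "interrupted"
    case _: Exception            => "failed"
    case _: Throwable            => "failed"
  }

  def main(args: Array[String]): Unit = {
    // Constructing the error object does not exhaust memory; it only
    // simulates the value the handler would receive.
    val oom = new OutOfMemoryError("GC overhead limit exceeded")

    val threwMatchError =
      try { classifyNonExhaustive(oom); false }
      catch { case _: MatchError => true }

    println(s"non-exhaustive match threw MatchError: $threwMatchError")
    println(s"exhaustive match result: ${classifyExhaustive(oom)}")
  }
}
```

Note that a bare {{catch { case ... }}} with no matching case simply rethrows the original throwable; a {{MatchError}} only appears when an explicit {{match}} inside the handler is not exhaustive, which is why the sketch matches on the value directly.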
