spark-issues mailing list archives

From "Apache Spark (JIRA)" <>
Subject [jira] [Commented] (SPARK-24912) Broadcast join OutOfMemory stack trace obscures actual cause of OOM
Date Wed, 01 Aug 2018 22:33:00 GMT


Apache Spark commented on SPARK-24912:

User 'bersprockets' has created a pull request for this issue:

> Broadcast join OutOfMemory stack trace obscures actual cause of OOM
> -------------------------------------------------------------------
>                 Key: SPARK-24912
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Bruce Robbins
>            Priority: Minor
> When the Spark driver suffers an OutOfMemoryError while attempting to broadcast a table for a broadcast join, the resulting stack trace obscures the actual cause of the OOM. For example:
> {noformat}
> [GC (Allocation Failure)  585453K->585453K(928768K), 0.0060025 secs]
> [Full GC (Allocation Failure)  585453K->582524K(928768K), 0.4019639 secs]
> java.lang.OutOfMemoryError: Java heap space
> Dumping heap to java_pid12446.hprof ...
> Heap dump file created [632701033 bytes in 1.016 secs]
> Exception in thread "main" java.lang.OutOfMemoryError: Not enough memory to build and
broadcast the table to all worker nodes. As a workaround, you can either disable broadcast
by setting spark.sql.autoBroadcastJoinThreshold to -1 or increase the spark driver memory
by setting spark.driver.memory to a higher value
> 	at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:122)
> 	at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:76)
> 	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withExecutionId$1.apply(SQLExecution.scala:101)
> 	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
> 	at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:98)
> 	at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:75)
> 	at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:75)
> 	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> 	at scala.concurrent.impl.Future$
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(
> 	at java.util.concurrent.ThreadPoolExecutor$
> 	at
> 18/07/24 14:29:58 INFO ContextCleaner: Cleaned accumulator 30
> 18/07/24 14:29:58 INFO ContextCleaner: Cleaned accumulator 35
> {noformat}
> The above stack trace blames BroadcastExchangeExec. However, the given line is actually
where the original OutOfMemoryError was caught and a new one was created and wrapped by a
SparkException. The actual location where the OOM occurred was in LongToUnsafeRowMap#grow,
at this line:
> {noformat}
> val newPage = new Array[Long](newNumWords.toInt)
> {noformat}
> Sometimes it is helpful to know the actual location from which an OOM is thrown. In the
above case, the location indicated that Spark underestimated the size of a large-ish table
and ran out of memory while loading it.
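The allocation pattern described in the quoted report can be sketched as follows. This is a hypothetical illustration (in Java, since the failure is plain JVM array allocation), not Spark's actual LongToUnsafeRowMap code; the class and method names here are invented for the example.

```java
// Hypothetical sketch of the doubling-growth pattern behind
// LongToUnsafeRowMap#grow: when the backing page fills up, a new array
// of (at least) twice the size is allocated and the old words copied over.
public class LongPage {
    private long[] page = new long[8];
    private int used = 0;

    private void grow(int needed) {
        long newNumWords = page.length;
        while (newNumWords < used + needed) {
            newNumWords *= 2;
        }
        // Analogue of `val newPage = new Array[Long](newNumWords.toInt)`:
        // if the table's size was underestimated, it is this allocation
        // that throws java.lang.OutOfMemoryError: Java heap space.
        long[] newPage = new long[(int) newNumWords];
        System.arraycopy(page, 0, newPage, 0, used);
        page = newPage;
    }

    public void append(long word) {
        if (used >= page.length) {
            grow(1);
        }
        page[used++] = word;
    }

    public int size() {
        return used;
    }
}
```

Because the OOM surfaces at the `new long[...]` line, a stack trace that preserves this frame points directly at the growth of the hash map, rather than at the code that later caught the error.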
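One common way to keep the original location visible (a sketch of the general technique, not necessarily what the linked pull request implements) is to attach the caught error as the cause of the wrapping error, so the real allocation frame survives in the "Caused by:" section of the printed trace. The method names below (buildBroadcast, relationFuture) are illustrative, not Spark's actual code.

```java
// Hypothetical sketch: re-throwing a caught OutOfMemoryError with the
// original error attached as the cause via Throwable#initCause.
public class WrapOomExample {
    static void buildBroadcast() {
        // Simulate the allocation failing deep inside hash-map growth.
        throw new OutOfMemoryError("Java heap space");
    }

    static void relationFuture() {
        try {
            buildBroadcast();
        } catch (OutOfMemoryError oom) {
            OutOfMemoryError wrapped = new OutOfMemoryError(
                "Not enough memory to build and broadcast the table to "
                + "all worker nodes.");
            // initCause preserves the frames where the allocation actually
            // failed (e.g. LongToUnsafeRowMap#grow), so printStackTrace()
            // shows them under "Caused by:" instead of discarding them.
            wrapped.initCause(oom);
            throw wrapped;
        }
    }
}
```

With this chaining in place, the trace still begins at the catch site in the broadcast code, but the "Caused by: java.lang.OutOfMemoryError" section identifies where the memory actually ran out.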

This message was sent by Atlassian JIRA

