spark-issues mailing list archives

From "Viacheslav Tradunsky (Jira)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-29244) ArrayIndexOutOfBoundsException on TaskCompletionListener during releasing of memory blocks
Date Tue, 01 Oct 2019 20:28:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-29244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942295#comment-16942295 ]

Viacheslav Tradunsky commented on SPARK-29244:
----------------------------------------------

Thank you! 

> ArrayIndexOutOfBoundsException on TaskCompletionListener during releasing of memory blocks
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-29244
>                 URL: https://issues.apache.org/jira/browse/SPARK-29244
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.0
>         Environment: Release label:emr-5.20.0
> Hadoop distribution:Amazon 2.8.5
> Applications:Livy 0.5.0, Spark 2.4.0
>            Reporter: Viacheslav Tradunsky
>            Assignee: L. C. Hsieh
>            Priority: Major
>             Fix For: 2.4.5, 3.0.0
>
>         Attachments: executor_oom.txt
>
>
> At the end of task completion, an exception occurred:
> {code:java}
> 19/09/25 09:03:58 ERROR TaskContextImpl: Error in TaskCompletionListener
> java.lang.ArrayIndexOutOfBoundsException: -3
>     at org.apache.spark.memory.TaskMemoryManager.freePage(TaskMemoryManager.java:333)
>     at org.apache.spark.memory.MemoryConsumer.freePage(MemoryConsumer.java:130)
>     at org.apache.spark.memory.MemoryConsumer.freeArray(MemoryConsumer.java:108)
>     at org.apache.spark.unsafe.map.BytesToBytesMap.free(BytesToBytesMap.java:803)
>     at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.free(UnsafeFixedWidthAggregationMap.java:225)
>     at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.lambda$new$0(UnsafeFixedWidthAggregationMap.java:111)
>     at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
>     at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
>     at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
>     at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
>     at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
>     at org.apache.spark.scheduler.Task.run(Task.scala:131)
>     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>  
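Not part of the original report, but some context on the first trace: an index of -3 into the executor's page table suggests the memory block being released already carried a sentinel "freed" page number, i.e. the TaskCompletionListener is releasing a page a second time. Below is a minimal, self-contained Java sketch of that double-release pattern; the class and field names are hypothetical and this is not Spark's actual implementation.

{code:java}
// Hypothetical sketch of a double release: the second freePage() call looks up
// the page table with a sentinel page number and fails with
// ArrayIndexOutOfBoundsException: -3, similar to the trace above.
public class DoubleFreeSketch {
    static final int FREED_PAGE_NUMBER = -3;          // sentinel set on first free (assumption)

    static class MemoryBlock {
        int pageNumber;
        MemoryBlock(int pageNumber) { this.pageNumber = pageNumber; }
    }

    static final MemoryBlock[] pageTable = new MemoryBlock[8192];

    static void freePage(MemoryBlock page) {
        pageTable[page.pageNumber] = null;             // throws if pageNumber is already -3
        page.pageNumber = FREED_PAGE_NUMBER;           // mark the block as released
    }

    public static void main(String[] args) {
        MemoryBlock page = new MemoryBlock(0);
        pageTable[0] = page;
        freePage(page);   // e.g. error-path cleanup after the earlier OOM
        freePage(page);   // e.g. the completion listener releasing the same block again
    }
}
{code}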
> It is important to note that before this one, there was an OOM while allocating pages. It looks like the two are related: on OOM the whole flow aborts abnormally, so resources are not freed correctly.
> {code:java}
> java.lang.NullPointerException
>     at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.getMemoryUsage(UnsafeInMemorySorter.java:208)
>     at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.getMemoryUsage(UnsafeExternalSorter.java:249)
>     at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.updatePeakMemoryUsed(UnsafeExternalSorter.java:253)
>     at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.freeMemory(UnsafeExternalSorter.java:296)
>     at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.cleanupResources(UnsafeExternalSorter.java:328)
>     at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.lambda$new$0(UnsafeExternalSorter.java:178)
>     at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
>     at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
>     at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
>     at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
>     at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
>     at org.apache.spark.scheduler.Task.run(Task.scala:131)
>     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>  
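The NullPointerException in getMemoryUsage points the same way: memory-usage accounting appears to run over a sorter whose internal pointer array has already been released during the OOM error path. A common defensive pattern for this, shown here only as an assumption about the failure mode and not as the committed fix, is to null-guard the accessor:

{code:java}
// Hypothetical sketch of a null-safe memory-usage accessor; names are
// illustrative and this is not Spark's actual code.
public class SorterSketch {
    private long[] array = new long[1024];             // in-memory pointer array

    long getMemoryUsage() {
        // After cleanup the array may already have been released; guard
        // against the NullPointerException seen in the trace above.
        if (array == null) {
            return 0L;
        }
        return array.length * 8L;
    }

    void free() {
        array = null;                                   // releasing twice is now harmless
    }

    public static void main(String[] args) {
        SorterSketch sorter = new SorterSketch();
        sorter.free();                                  // resources released after OOM
        System.out.println(sorter.getMemoryUsage());    // prints 0 instead of throwing
    }
}
{code}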
> This must be something related to job planning, but with so many exceptions stacked on top of each other it is hard to narrow down. I would be happy to provide more details.
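Since both traces show cleanup running twice over the same resources, one general mitigation for this class of failure, offered as a sketch rather than as the fix that shipped in 2.4.5/3.0.0, is to make the release path idempotent so that whichever of the OOM error path or the TaskCompletionListener runs second becomes a no-op:

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical idempotent-release wrapper: whichever path runs first
// (error-path cleanup or the completion listener) actually frees the
// resources; any later call returns immediately.
public class IdempotentCleanup {
    private final AtomicBoolean released = new AtomicBoolean(false);

    void release() {
        if (!released.compareAndSet(false, true)) {
            return;                        // already released; skip the second free
        }
        // ... free pages / arrays exactly once here ...
    }

    public static void main(String[] args) {
        IdempotentCleanup cleanup = new IdempotentCleanup();
        cleanup.release();                 // e.g. error-path cleanup after OOM
        cleanup.release();                 // e.g. completion listener; no-op now
    }
}
{code}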



