[ https://issues.apache.org/jira/browse/SPARK-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341962#comment-14341962 ]
Aaron Davidson commented on SPARK-3889:
---------------------------------------
This may be a new issue. I would open a new ticket, especially because the "ConnectionManager
failed ACK" thing shouldn't be happening in 1.2.1; there should be different symptoms and
perhaps a different cause as well.
As a last-ditch thing to try, raise spark.storage.memoryMapThreshold to a very
large number (e.g., 1 GB in bytes) and see if the crash still occurs. If it does, please
report more details about your workload and any other symptoms you see.
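For reference, here is a minimal sketch of how that setting could be applied. It assumes the property takes a plain byte count, as described above; the application class and jar names in the commented-out spark-submit line are placeholders, not anything from this ticket.

```shell
# 1 GB expressed in bytes (assumes the property takes a plain byte count).
THRESHOLD_BYTES=$((1024 * 1024 * 1024))

# Entry that could be added to conf/spark-defaults.conf:
CONF_LINE="spark.storage.memoryMapThreshold ${THRESHOLD_BYTES}"
echo "${CONF_LINE}"

# Or set it for a single job on the command line
# (com.example.MyApp and my-app.jar are hypothetical placeholders):
# spark-submit --conf "spark.storage.memoryMapThreshold=${THRESHOLD_BYTES}" \
#   --class com.example.MyApp my-app.jar
```

With the threshold this high, blocks smaller than 1 GB should be read normally rather than memory-mapped, which helps tell whether mmap is involved in the crash.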
> JVM dies with SIGBUS, resulting in ConnectionManager failed ACK
> ---------------------------------------------------------------
>
> Key: SPARK-3889
> URL: https://issues.apache.org/jira/browse/SPARK-3889
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.2.0
> Reporter: Aaron Davidson
> Assignee: Aaron Davidson
> Priority: Critical
> Fix For: 1.2.0
>
>
> Here's the first part of the core dump, possibly caused by a job which shuffles a lot of very small partitions.
> {code}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGBUS (0x7) at pc=0x00007fa5885fcdb0, pid=488, tid=140343502632704
> #
> # JRE version: 7.0_25-b30
> # Java VM: OpenJDK 64-Bit Server VM (23.7-b01 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # v ~StubRoutines::jbyte_disjoint_arraycopy
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # If you would like to submit a bug report, please include
> # instructions on how to reproduce the bug and visit:
> # https://bugs.launchpad.net/ubuntu/+source/openjdk-7/
> #
> --------------- T H R E A D ---------------
> Current thread (0x00007fa4b0631000): JavaThread "Executor task launch worker-170" daemon [_thread_in_Java, id=6783, stack(0x00007fa4448ef000,0x00007fa4449f0000)]
> siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), si_addr=0x00007fa428f79000
> {code}
> Here is the only useful content I can find related to JVM and SIGBUS from Google: https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=976664
> It appears it may be related to disposing byte buffers, which we do in the ConnectionManager -- we mmap shuffle files via ManagedBuffer and dispose of them in BufferMessage.