nifi-dev mailing list archives

From Oleksi Derkatch <oderka...@vitalimages.com>
Subject Re: ForkJoinPool.commonPool() in Nifi
Date Wed, 28 Mar 2018 13:29:17 GMT
Anyone have any thoughts on this? Should I make a JIRA ticket?

________________________________
From: Oleksi Derkatch <oderkatch@vitalimages.com>
Sent: Thursday, March 8, 2018 4:36:51 PM
To: dev@nifi.apache.org
Subject: ForkJoinPool.commonPool() in Nifi

Hi,


A few weeks ago we encountered a problem with one of our custom processors that seemed to
deadlock all processing in Nifi under load. We believe the issue is that our processors were
relying on ForkJoinPool.commonPool(), but so was the Nifi engine during its scheduling (both
via CompletableFuture). When we took a thread dump, we saw something like this:


"ForkJoinPool.commonPool-worker-6" #381 daemon prio=5 os_prio=0 tid=0x00007f300d934000 nid=0x4be4
waiting on condition [0x00007f2fd53e7000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006c8b00568> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3313)
    at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at customcode(customcode.java:83)
    at customcode.lambda$null$23(customcode.java:320)
    at customcode$$Lambda$261/442205945.call(Unknown Source)
    at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
    - locked <0x00000006c8b007f0> (a com.google.common.cache.LocalCache$StrongWriteEntry)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3932)
    at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721)
    at customcode.lambda$customethod$24(customcode.java:309)
    at customcode$$Lambda$258/1540137328.get(Unknown Source)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)


I think what happened here is that, because both layers were using ForkJoinPool.commonPool()
this way, we exhausted its threads and deadlocked: at the time of the deadlock, the Nifi
processor was blocked waiting on a future that had itself been submitted to the same common
pool, so no worker was free to complete it. The solution for us was to use a dedicated thread
pool instead of the shared one.
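
For illustration only (class and method names here are made up, not our actual processor
code), a minimal sketch of the pattern as we understand it, plus the dedicated-pool fix we
applied:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CommonPoolDeadlockSketch {

    // Problematic pattern: supplyAsync without an executor argument runs the
    // task on ForkJoinPool.commonPool(). If this method is itself called from
    // a commonPool worker (as in the cache loader in the dump above), get()
    // parks that worker while waiting for another commonPool worker to run
    // the inner task. Under load, every worker ends up parked and nothing
    // makes progress.
    static String loadValueBlocking(String key) throws Exception {
        CompletableFuture<String> inner =
                CompletableFuture.supplyAsync(() -> expensiveLookup(key));
        return inner.get(); // blocks a commonPool worker
    }

    // The fix: submit the work to a dedicated pool instead, so blocking in
    // the processor can never starve the shared common pool.
    private static final ExecutorService DEDICATED_POOL =
            Executors.newFixedThreadPool(8); // size is a tuning choice

    static CompletableFuture<String> loadValueDedicated(String key) {
        return CompletableFuture.supplyAsync(
                () -> expensiveLookup(key), DEDICATED_POOL);
    }

    private static String expensiveLookup(String key) {
        return "value-for-" + key; // stand-in for the real work
    }
}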


It might be worth changing this in Nifi going forward, in case other custom processors rely
on the same pattern.


This also raises another question. By default, the parallelism of this thread pool is
(# of CPUs - 1). How does that affect processing when we set the maximum number of threads
in the Nifi UI much higher than that? Shouldn't this thread pool be configured to the same
size? The parallelism is tunable with the -Djava.util.concurrent.ForkJoinPool.common.parallelism
JVM flag (which we also adjusted while troubleshooting this).
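
If it helps anyone reproduce or verify this, here is a tiny sketch (class name made up) for
checking the effective parallelism of the common pool:

import java.util.concurrent.ForkJoinPool;

public class CommonPoolInfo {
    public static void main(String[] args) {
        // Defaults to (available processors - 1), minimum 1; overridden by
        // -Djava.util.concurrent.ForkJoinPool.common.parallelism=N
        System.out.println("commonPool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println("available processors: "
                + Runtime.getRuntime().availableProcessors());
    }
}

Running it with, e.g., java -Djava.util.concurrent.ForkJoinPool.common.parallelism=16
CommonPoolInfo confirms whether the override takes effect.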
