    [ https://issues.apache.org/jira/browse/OAK-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608112#comment-15608112 ]

Tomek Rękawek commented on OAK-4882:
------------------------------------

[~mmarth]:
bq. But the way you describe it, it seems that the file maintenance is actually the problem (or rather the fact that maintenance blocks put operations). I thought that the pers-cache is a) append-only and b) uses rotating files. If so, it seems possible to keep adding entries even while files get rotated or old files removed...

That is my understanding of the initial cause of OAK-2761. [~tmueller], could you elaborate on the cause of the hangs described in OAK-2761? The whole async queue idea was introduced to fix this very issue, so if it can be fixed at the persistence layer, that would be even better.

In the meantime, I'll carry on with the async queue work. It's optional (the async queue is disabled by default and can be enabled via a system property), so we can try to tackle the issue from two sides.

[~chetanm]:
bq. We can improve a bit here by applying some heuristics

Thanks for the ideas. I submitted a draft patch. The heuristics have been put into {{NodeCache#qualifiesToPersist()}}.

> Bottleneck in the asynchronous persistent cache
> -----------------------------------------------
>
>                 Key: OAK-4882
>                 URL: https://issues.apache.org/jira/browse/OAK-4882
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: cache, documentmk
>    Affects Versions: 1.5.10, 1.4.8
>            Reporter: Tomek Rękawek
>            Assignee: Tomek Rękawek
>             Fix For: 1.6
>
>         Attachments: OAK-4882.patch
>
>
> The class responsible for accepting new cache operations which will be handled asynchronously is [CacheActionDispatcher|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/persistentCache/async/CacheActionDispatcher.java].
> Under high load, when the queue is full (1024 entries), the [add()|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/persistentCache/async/CacheActionDispatcher.java#L86] method removes the oldest 256 entries. However, we can't afford to lose the updates (that may leave stale entries in the cache), so all the removed entries are compacted into one big invalidate action.
> The compaction action ([CacheActionDispatcher#cleanTheQueue|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/persistentCache/async/CacheActionDispatcher.java#L97]) still holds the lock taken in the add() method, so threads which try to add something to the queue have to wait until cleanTheQueue() ends.
> Maybe we can optimise the CacheActionDispatcher#add->cleanTheQueue part, so it won't hold the lock for the whole time.
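
One possible shape for the optimisation suggested in the description, as a rough sketch only: hold the lock just long enough to detach the oldest entries, then build the combined invalidate action outside the lock. The class below is a hypothetical stand-in and not the real CacheActionDispatcher; the names QueueSketch, Action, maxSize and toBeRemoved are assumptions made for illustration, and keys are plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the dispatcher queue; not the Oak class. */
class QueueSketch {

    /** A queued action: a put for one key, or an invalidation of many keys. */
    static final class Action {
        final boolean invalidate;
        final Set<String> keys;

        Action(boolean invalidate, Set<String> keys) {
            this.invalidate = invalidate;
            this.keys = keys;
        }
    }

    private final int maxSize;      // 1024 in the issue description
    private final int toBeRemoved;  // 256 in the issue description
    private final Deque<Action> queue = new ArrayDeque<>();
    private final Object lock = new Object();

    QueueSketch(int maxSize, int toBeRemoved) {
        this.maxSize = maxSize;
        this.toBeRemoved = toBeRemoved;
    }

    void add(String key) {
        Action put = new Action(false, Collections.singleton(key));
        List<Action> removed = null;
        synchronized (lock) {
            if (queue.size() >= maxSize) {
                // Only detach the oldest entries while holding the lock.
                removed = new ArrayList<>(toBeRemoved);
                for (int i = 0; i < toBeRemoved && !queue.isEmpty(); i++) {
                    removed.add(queue.pollFirst());
                }
            }
            queue.addLast(put);
        }
        if (removed != null) {
            // Compaction runs outside the lock, so other adders are not blocked.
            Set<String> keys = new LinkedHashSet<>();
            for (Action a : removed) {
                keys.addAll(a.keys);
            }
            Action compacted = new Action(true, keys);
            synchronized (lock) {
                queue.addLast(compacted);
            }
        }
    }

    int size() {
        synchronized (lock) {
            return queue.size();
        }
    }

    Action peekLast() {
        synchronized (lock) {
            return queue.peekLast();
        }
    }
}
```

The trade-off in this sketch is ordering: the invalidate action may be queued after newer puts for the same keys. Assuming an invalidation only removes cache entries, that costs an extra cache miss rather than a stale entry, and the queue can briefly exceed its limit by one element. Whether that is acceptable for the real dispatcher would need to be checked against the actual code.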