jackrabbit-oak-issues mailing list archives

From "Joel Richard (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-3154) Improve SimpleNodeAggregator performance with a NodeState cache
Date Mon, 03 Aug 2015 13:07:04 GMT

    [ https://issues.apache.org/jira/browse/OAK-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651837#comment-14651837 ]

Joel Richard commented on OAK-3154:
-----------------------------------

The following methods could benefit from a similar cache:
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.IndexTaskSpliter#split
org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate#updateIndex 
org.apache.jackrabbit.oak.plugins.observation.NodeObserver#contentChanged

However, I think it would make more sense to add a last-child-node cache to one of the following
classes: SecureNodeBuilder, MemoryNodeBuilder or SegmentNodeState. I will describe my idea
in OAK-2758, which already contains a similar suggestion.

Closing this issue now because SimpleNodeAggregator won't be used much longer.

> Improve SimpleNodeAggregator performance with a NodeState cache
> ---------------------------------------------------------------
>
>                 Key: OAK-3154
>                 URL: https://issues.apache.org/jira/browse/OAK-3154
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 1.3.3
>            Reporter: Joel Richard
>              Labels: performance
>
> I have profiled a query where 16% of the query fetching time is spent inside SimpleNodeAggregator.isNodeType.
> In my case, many of the nodes that are read have overlapping paths.
> Because the nodes seem to be iterated alphabetically, it would be possible to cache the
> previous NodeState chain in an array and reuse as much of it as possible when the previous and
> current paths overlap. This would significantly reduce the query fetching time in cases where
> many paths are similar. Since the NodeState cache array can be reused for the whole query
> execution, its overhead should be negligible.
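
The caching idea in the description above could look roughly like the following. This is only a
sketch under stated assumptions, not the code from Oak or from any patch on this issue:
SimpleNode is a hypothetical stand-in for NodeState that models nothing but child lookup,
NodeChainCache and resolve are made-up names, and paths are assumed to be absolute and
separated by "/". The static lookup counter exists only to make the saving visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a NodeState; only child lookup is modelled. */
final class SimpleNode {
    private final Map<String, SimpleNode> children = new HashMap<>();
    /** Counts getChild calls so the effect of the cache can be observed. */
    static int lookups = 0;

    /** Returns the named child, creating it if necessary (not counted as a lookup). */
    SimpleNode addChild(String name) {
        return children.computeIfAbsent(name, n -> new SimpleNode());
    }

    /** Returns the named child, or null if it does not exist. */
    SimpleNode getChild(String name) {
        lookups++;
        return children.get(name);
    }
}

/** Resolves absolute paths from a root and reuses the node chain of the previous path. */
final class NodeChainCache {
    private final SimpleNode root;
    // Segments of the most recently resolved path.
    private final List<String> names = new ArrayList<>();
    // chain.get(i) is the node reached after following names.get(0..i).
    private final List<SimpleNode> chain = new ArrayList<>();

    NodeChainCache(SimpleNode root) {
        this.root = root;
    }

    /** Returns the node at the given path, or null if any segment is missing. */
    SimpleNode resolve(String path) {
        String[] segments = path.equals("/") || path.isEmpty()
                ? new String[0]
                : path.substring(1).split("/");

        // Length of the prefix shared with the previously resolved path.
        int common = 0;
        while (common < segments.length && common < names.size()
                && segments[common].equals(names.get(common))) {
            common++;
        }

        // Drop the cached entries that do not belong to the shared prefix.
        names.subList(common, names.size()).clear();
        chain.subList(common, chain.size()).clear();

        // Continue from the deepest cached node instead of from the root.
        SimpleNode current = common == 0 ? root : chain.get(common - 1);
        for (int i = common; i < segments.length; i++) {
            current = current.getChild(segments[i]);
            if (current == null) {
                return null; // the cache still holds the valid part of the chain
            }
            names.add(segments[i]);
            chain.add(current);
        }
        return current;
    }
}
```

With this sketch, resolving /a/b/c and then /a/b/d needs only one additional child lookup for
the second path, because the chain for /a/b is reused. The benefit depends on paths arriving
in an order where neighbours share long prefixes, which is the alphabetical iteration the
description assumes.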



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
