jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-1312) Bundle nodes into a document
Date Thu, 21 Jul 2016 09:59:20 GMT

    [ https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387459#comment-15387459 ]

Chetan Mehrotra commented on OAK-1312:

[~mmarth] The description would need to be reworded. There are multiple aspects:

# Reduced latency for traversal - If you have a structure like app:Asset and a traversal is
done, it involves lots of queries for child nodes. With bundling all those queries are
avoided. By "query" I mean a call that lists the child nodes of a node, as opposed to a
"find", which looks up one particular node
# Reduced number of Documents in the persistent store - Currently, for a nodetype like app:Asset,
1 app:Asset = 20 JCR Nodes. If we have 10 M assets then we would be consuming 200 M
documents in Mongo. If bundling can reduce this ratio to, say, 1-5 Documents per asset, then
it would reduce the actual number of documents in Mongo. Fewer documents means smaller _id
and {_modified, _id} indexes
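
The arithmetic in point 2 can be checked with a small sketch. The figures (10 M assets, 20 nodes per asset, 1-5 documents per asset after bundling) are the ones quoted above; the function names are made up for illustration and are not part of Oak.

```python
# Back-of-the-envelope check of the document counts quoted in the comment.
# Function names are illustrative only; nothing here is Oak API.

def document_count(assets: int, docs_per_asset: int) -> int:
    """Total documents stored for the given number of assets."""
    return assets * docs_per_asset

def reduction_percent(before: int, after: int) -> int:
    """Whole-number percentage by which the document count shrinks."""
    return (before - after) * 100 // before

ASSETS = 10_000_000                           # 10 M assets
unbundled = document_count(ASSETS, 20)        # 1 app:Asset = 20 JCR nodes
bundled_best = document_count(ASSETS, 1)      # ratio reduced to 1
bundled_worst = document_count(ASSETS, 5)     # ratio reduced to 5

if __name__ == "__main__":
    print(f"unbundled:   {unbundled:,} documents")
    print(f"bundled 1:1: {bundled_best:,} documents "
          f"({reduction_percent(unbundled, bundled_best)}% fewer)")
    print(f"bundled 5:1: {bundled_worst:,} documents "
          f"({reduction_percent(unbundled, bundled_worst)}% fewer)")
```

Since both the _id index and the {_modified, _id} index hold one entry per document, the number of index entries shrinks by the same factor as the document count.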

What's the negative impact? One needs to be careful with the bundling pattern. If it covers
too many nodes you get bulky Documents, which would also impact performance. Also, a
misfired pattern might cover a folder-like structure, which is unbounded, and could then hit
the size limit of a Mongo Document
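
To make the risk concrete, here is a rough sketch of how a bundled document grows with the number of nodes a pattern covers. It assumes a hypothetical layout (relative child path mapped to that child's properties) and uses the length of the JSON encoding as a stand-in for the real BSON size, so the numbers are only indicative. The 16 MB figure is MongoDB's maximum BSON document size.

```python
import json

# MongoDB caps a single BSON document at 16 MB.
MONGO_MAX_DOC_BYTES = 16 * 1024 * 1024

def make_bundle(child_count: int) -> dict:
    """Hypothetical bundled document: relative child path -> properties."""
    return {
        f"child{i}": {"jcr:primaryType": "nt:unstructured", "title": f"node {i}"}
        for i in range(child_count)
    }

def estimate_bundle_bytes(bundle: dict) -> int:
    """Approximate size: bytes of the JSON encoding, not the exact BSON size."""
    return len(json.dumps(bundle).encode("utf-8"))

def fits(bundle: dict, limit: int = MONGO_MAX_DOC_BYTES) -> bool:
    """True if the estimated size stays within the given limit."""
    return estimate_bundle_bytes(bundle) <= limit

# A bounded structure (20 nodes, as in the app:Asset example) stays small.
asset_like = make_bundle(20)
# A folder-like structure has no upper bound on its children.
folder_like = make_bundle(20_000)

if __name__ == "__main__":
    for name, bundle in (("asset-like", asset_like), ("folder-like", folder_like)):
        print(name, estimate_bundle_bytes(bundle), "bytes, fits:", fits(bundle))
```

The estimate grows linearly with the number of covered nodes, so a pattern that matches an unbounded folder will cross any fixed limit once the folder gets large enough, and the document becomes bulky to read and write well before that point.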

It would directly benefit structures like nt:file, where in most cases an extra call is required
to fetch the jcr:content node. All in all, the key thing would be to have the right bundling pattern
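
The nt:file case can be illustrated with a toy in-memory store that counts lookups. This is only a sketch: the `CountingStore` class, the `jcr:content/` key prefix, and the sample properties are assumptions made for the example and do not reflect how Oak actually lays out a bundled document.

```python
class CountingStore:
    """Toy document store that counts how many find() calls are made."""

    def __init__(self, docs: dict):
        self.docs = docs
        self.fetches = 0

    def find(self, doc_id: str):
        self.fetches += 1
        return self.docs.get(doc_id)

PREFIX = "jcr:content/"  # hypothetical key prefix for bundled child properties

def read_file_unbundled(store: CountingStore, path: str):
    """nt:file and its jcr:content child live in two separate documents."""
    file_props = store.find(path)
    content_props = store.find(path + "/jcr:content")
    return file_props, content_props

def read_file_bundled(store: CountingStore, path: str):
    """jcr:content properties are embedded in the nt:file document."""
    doc = store.find(path)
    file_props = {k: v for k, v in doc.items() if not k.startswith(PREFIX)}
    content_props = {k[len(PREFIX):]: v for k, v in doc.items() if k.startswith(PREFIX)}
    return file_props, content_props

unbundled_store = CountingStore({
    "/a.txt": {"jcr:primaryType": "nt:file"},
    "/a.txt/jcr:content": {"jcr:primaryType": "nt:resource", "jcr:mimeType": "text/plain"},
})
bundled_store = CountingStore({
    "/a.txt": {
        "jcr:primaryType": "nt:file",
        "jcr:content/jcr:primaryType": "nt:resource",
        "jcr:content/jcr:mimeType": "text/plain",
    },
})

if __name__ == "__main__":
    read_file_unbundled(unbundled_store, "/a.txt")
    read_file_bundled(bundled_store, "/a.txt")
    print("unbundled fetches:", unbundled_store.fetches)
    print("bundled fetches:  ", bundled_store.fetches)
```

Both readers return the same file and content properties; the bundled variant simply gets there without the second lookup for jcr:content.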

bq. So, what I wonder: do you have specific test cases in mind to evaluate the effect of this

#2 is a logical benefit, i.e. if there are fewer Documents then the indexes are lighter
and Mongo is happy. For the impact on general reads and writes, I would like to run some tests
around such an app:Asset structure and see what the impact on throughput is

> Bundle nodes into a document
> ----------------------------
>                 Key: OAK-1312
>                 URL: https://issues.apache.org/jira/browse/OAK-1312
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, documentmk
>            Reporter: Marcel Reutegger
>            Assignee: Chetan Mehrotra
>              Labels: performance
>             Fix For: 1.6
> For very fine grained content with many nodes and only few properties per node it would
be more efficient to bundle multiple nodes into a single MongoDB document. Mostly reading
would benefit because there are less roundtrips to the backend. At the same time storage footprint
would be lower because metadata overhead is per document.
> Feature branch - https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312

This message was sent by Atlassian JIRA
