jackrabbit-oak-issues mailing list archives

From "angela (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-3933) potential improvements to membership storage
Date Tue, 01 Mar 2016 13:45:18 GMT

    [ https://issues.apache.org/jira/browse/OAK-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173770#comment-15173770 ]

angela commented on OAK-3933:

additional ideas:
- reconsider the current threshold of 100 entries per {{rep:members}} property (=> benchmarks
needed to find the optimal value)
- use {{PropertyState.count()}} to identify the best property to append the new value(s) (=>
this might require some improvements to the {{DocumentPropertyState}} which needs to parse
the property in order to determine the number of values afaik).
- keep track of the newest ref-list-node and only append there (=> ignoring the fact that
the count of some member-properties might shrink due to member-removal)
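
The last two ideas can be compared with a small in-memory model. This is only a sketch and not Oak code: the class MemberPropertyPicker, its threshold parameter and the nested lists (one inner list per rep:members property) are assumptions made for illustration, and size() merely stands in for PropertyState.count().

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: each inner list stands in for one multivalued rep:members property. */
final class MemberPropertyPicker {
    private final int threshold; // hypothetical max number of values per property
    private final List<List<String>> properties = new ArrayList<>();

    MemberPropertyPicker(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    /** Count-based idea: append to the first property that still has room. */
    int addUsingCount(String memberId) {
        for (int i = 0; i < properties.size(); i++) {
            // size() plays the role of PropertyState.count() in this model
            if (properties.get(i).size() < threshold) {
                properties.get(i).add(memberId);
                return i;
            }
        }
        return appendToNew(memberId);
    }

    /** Newest-only idea: look at the last property only, ignoring gaps left by removals. */
    int addToNewest(String memberId) {
        int last = properties.size() - 1;
        if (last >= 0 && properties.get(last).size() < threshold) {
            properties.get(last).add(memberId);
            return last;
        }
        return appendToNew(memberId);
    }

    /** Removes the first occurrence of the id; older properties may shrink as a result. */
    boolean remove(String memberId) {
        for (List<String> values : properties) {
            if (values.remove(memberId)) {
                return true;
            }
        }
        return false;
    }

    int propertyCount() {
        return properties.size();
    }

    private int appendToNew(String memberId) {
        List<String> fresh = new ArrayList<>();
        fresh.add(memberId);
        properties.add(fresh);
        return properties.size() - 1;
    }
}
```

In this model the count-based variant refills slots freed by removals at the price of inspecting every property, while the newest-only variant touches a single property and accepts that older ones may stay partly empty.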

> potential improvements to membership storage
> --------------------------------------------
>                 Key: OAK-3933
>                 URL: https://issues.apache.org/jira/browse/OAK-3933
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>            Reporter: Julian Reschke
> Membership information of a group currently is stored as an unsorted set of IDs, spread
over multiple child nodes, using multivalued properties (of 1000 each).
> This content structure makes certain operations relatively slow:
> 1) Checking for declared membership
> When the authorizable to be checked is not a member, all child nodes need to be read
and examined (in the other case, checking stops when a match is found).
> 2) Checking for inherited membership
> The membership IDs do not reveal the type of authorizable. In order to check inherited
membership as well, the authorizable with the given ID needs to be read from storage in order
to check the type.
> Below are a few ideas for how this might be improved (however, the change of structure would
require a migration step).
> 1) Avoid having to read all child nodes to check declared membership
> Assuming an alphanumeric ID structure, this could be achieved by modifying the structure
like this:
> - as before, start with a single node
> - when a new member needs to be inserted and the candidate node is already full (has
1000 entries), create a new child node named after the first character of the authorizable ID
> - when this "level 1" member is full, start using "level 2" members and so on
> (assuming the ID structure is suitable for that, otherwise a different hash could be used)
> To check for membership, we wouldn't need to read *all* child nodes, but only those where
the node name is a prefix match of the ID.
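
The prefix-based layout described above might look roughly like the following in-memory sketch. It is not Oak code: PrefixShardedMembers, its capacity parameter and the nodesReadByLastLookup counter are hypothetical names, and storing an ID in the deepest reachable node once its characters run out is an assumption added here to keep the model total.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of member nodes that overflow into children named after ID characters. */
final class PrefixShardedMembers {
    private final int capacity; // hypothetical max number of IDs per node
    private final Node root = new Node();
    private int nodesReadByLastLookup;

    private static final class Node {
        final Set<String> ids = new HashSet<>();
        final Map<Character, Node> children = new HashMap<>();
    }

    PrefixShardedMembers(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds the ID to the first non-full node on its prefix path; returns false if present. */
    boolean add(String id) {
        if (contains(id)) {
            return false;
        }
        Node node = root;
        int depth = 0;
        // Descend while the candidate node is full and the ID still has characters left.
        while (node.ids.size() >= capacity && depth < id.length()) {
            node = node.children.computeIfAbsent(id.charAt(depth), c -> new Node());
            depth++;
        }
        node.ids.add(id); // assumption: an exhausted ID is stored even if the node is full
        return true;
    }

    /** Reads only the nodes whose path is a prefix of the ID. */
    boolean contains(String id) {
        nodesReadByLastLookup = 0;
        Node node = root;
        int depth = 0;
        while (node != null) {
            nodesReadByLastLookup++;
            if (node.ids.contains(id)) {
                return true;
            }
            if (depth >= id.length()) {
                return false;
            }
            node = node.children.get(id.charAt(depth));
            depth++;
        }
        return false;
    }

    int nodesReadByLastLookup() {
        return nodesReadByLastLookup;
    }
}
```

A negative lookup in this model stops as soon as no child exists for the next character of the ID, so it reads at most one node per character instead of every child node.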
> 2) Avoid having to instantiate authorizables for declared membership checks
> - put limited type information into the stored IDs, such as "u" and "g" prefixes; that
way the code could identify authorizables that are users and avoid having to instantiate them
> (this assumes that an ID that refers to a user will never refer to a group in the future)
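
A minimal sketch of the type-prefix idea, assuming a "u:" / "g:" encoding: the colon separator, the class TypedMemberIds and all of its method names are hypothetical and not taken from Oak. The inherited-membership check only expands entries marked as groups, so entries marked as users are never looked up.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of membership entries that carry a one-letter type prefix. */
final class TypedMemberIds {
    // group ID -> stored entries such as "u:alice" or "g:editors"
    private final Map<String, Set<String>> storedMembers = new HashMap<>();

    static String userRef(String id) {
        return "u:" + id;
    }

    static String groupRef(String id) {
        return "g:" + id;
    }

    static boolean isGroupRef(String ref) {
        return ref.startsWith("g:");
    }

    static String idOf(String ref) {
        return ref.substring(2); // strips the two-character type prefix
    }

    void addMember(String groupId, String ref) {
        storedMembers.computeIfAbsent(groupId, k -> new HashSet<>()).add(ref);
    }

    /** Declared or inherited membership; only "g:" entries are expanded further. */
    boolean isMember(String groupId, String ref) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> visited = new HashSet<>(); // guards against cyclic group nesting
        pending.push(groupId);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!visited.add(current)) {
                continue;
            }
            Set<String> entries =
                    storedMembers.getOrDefault(current, Collections.<String>emptySet());
            if (entries.contains(ref)) {
                return true;
            }
            for (String entry : entries) {
                if (isGroupRef(entry)) {
                    pending.push(idOf(entry)); // user entries are skipped without any lookup
                }
            }
        }
        return false;
    }
}
```

As the quoted caveat notes, this only holds if an ID that refers to a user never comes to refer to a group later, since the prefix is fixed when the entry is stored.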
