jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-1312) Bundle nodes into a document
Date Thu, 21 Jul 2016 06:26:20 GMT

    [ https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387241#comment-15387241 ]

Chetan Mehrotra commented on OAK-1312:
--------------------------------------

h3. Usage and Configuration

Bundling definitions are stored in the NodeStore as NodeState under '/jcr:system/documentstore/bundlor'.
The definition structure consists of

{noformat}
+ <node type name>
  - pattern - multi
{noformat}

So there is a node whose name is the same as the nodeType for which the bundling rules are to be applied.
That node needs to have a {{pattern}} multi-value string property which
defines the path patterns that need to be bundled. A path pattern can be

* exact - e.g. 'jcr:content' or 'jcr:content/metadata' - Indicates that the child node jcr:content needs to be bundled
* pattern - e.g. 'jcr:content/*' - Indicates that the child node jcr:content and *all* its child nodes need to be bundled

{noformat}
jcr:system
  documentstore
    bundlor
      app:Asset{pattern = [jcr:content/metadata, jcr:content/renditions, jcr:content/renditions/**, jcr:content]}
      nt:file{pattern = [jcr:content]}
{noformat} 

The above configuration defines patterns for nt:file and app:Asset.
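
A minimal sketch of how such patterns could be matched against relative child paths. This is not Oak's implementation: {{matchesPattern}} and {{isBundled}} are hypothetical names, and the wildcard semantics are an assumption (a '*' segment matches exactly one path segment, a trailing '**' matches one or more remaining segments, and a parent path has to be listed separately, as in the app:Asset definition).

```javascript
// Hypothetical helper, not part of Oak.
// Assumed semantics: '*' matches exactly one path segment; a trailing '**'
// matches one or more remaining segments. '**' is only supported as the
// last segment of a pattern in this sketch.
function matchesPattern(pattern, relPath) {
  const pat = pattern.split('/');
  const path = relPath.split('/');
  for (let i = 0; i < pat.length; i++) {
    if (pat[i] === '**') {
      // Must be the last pattern segment and leave at least one path segment
      return i === pat.length - 1 && path.length > i;
    }
    if (i >= path.length) return false;
    if (pat[i] !== '*' && pat[i] !== path[i]) return false;
  }
  // Without '**' the pattern and the path must have the same depth
  return pat.length === path.length;
}

// A relative path is bundled if any pattern of the node type matches it
function isBundled(patterns, relPath) {
  return patterns.some((p) => matchesPattern(p, relPath));
}

const assetPatterns = [
  'jcr:content/metadata',
  'jcr:content/renditions',
  'jcr:content/renditions/**',
  'jcr:content',
];

const candidates = [
  'jcr:content',
  'jcr:content/metadata',
  'jcr:content/metadata/xmp',
  'jcr:content/renditions',
  'jcr:content/renditions/original',
  'jcr:content/renditions/original/jcr:content',
  'jcr:content/comments',
];

const bundled = candidates.filter((p) => isBundled(assetPatterns, p));
console.log(bundled);
```

Under these assumed semantics, jcr:content/metadata/xmp and jcr:content/comments are the only candidates that are not bundled.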

h3. Storage Format

As part of bundling, any node which gets bundled is stored as relative properties in the root bundling
node.

For example, given an nt:file node
{noformat}
+ book.jpg (nt:file)
  + jcr:content
     - jcr:data
{noformat}

and the pattern
{noformat}
+ jcr:system/documentstore/bundlor
  + nt:file
    - pattern - [jcr:content]
{noformat}

the node is stored as
{code:javascript}
{
  "_id": "2:/test/book.jpg",
  "_modified": 1469080015,
  "_commitRoot": {"r1560bfe1650-0-1": "0"},
  "_deleted": {"r1560bfe1650-0-1": "false"},

  ":pattern": {"r1560bfe1650-0-1": "[\"str:jcr:content\"]"},

  "jcr:primaryType": {"r1560bfe1650-0-1": "\"nt:file\""},

  "jcr:content/:self": {"r1560bfe1650-0-1": "true"},
  "jcr:content/jcr:data": {"r1560bfe1650-0-1": "\"bar\""}
}
{code}

In the above format
* {{:pattern}} - Special property which stores the pattern in use at the time of bundling
* {{jcr:content/:self}} - Marker property which records that the jcr:content node is bundled
* {{jcr:content/jcr:data}} - The property at book.jpg/jcr:content/@jcr:data, stored as a relative property
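
The flattening described above can be sketched roughly as follows. This is a simplified illustration, not Oak's code: the {{bundle}} function and the in-memory node shape ({{properties}}, {{children}}) are hypothetical, revision maps and value type prefixes are left out, only exact patterns are handled, and it is assumed that a child is only bundled if its parent is bundled too.

```javascript
// Simplified sketch, not Oak's implementation.
// Flattens bundled child nodes into relative properties of one document.
// Omitted on purpose: revisions, "str:"/"nam:" prefixes, _id, _modified.
function bundle(node, patterns) {
  // Record the pattern in use at bundling time
  const doc = { ':pattern': patterns.slice() };
  // Properties of the bundling root keep their plain names
  Object.assign(doc, node.properties);

  function visit(current, relPath) {
    for (const [name, child] of Object.entries(current.children || {})) {
      const childPath = relPath ? `${relPath}/${name}` : name;
      // Exact patterns only in this sketch; unmatched subtrees are skipped
      if (!patterns.includes(childPath)) continue;
      // Marker property: this relative node exists in the bundle
      doc[`${childPath}/:self`] = true;
      for (const [prop, value] of Object.entries(child.properties || {})) {
        doc[`${childPath}/${prop}`] = value;
      }
      visit(child, childPath);
    }
  }

  visit(node, '');
  return doc;
}

// The nt:file example from above, as a hypothetical in-memory tree
const bookJpg = {
  properties: { 'jcr:primaryType': 'nt:file' },
  children: {
    'jcr:content': { properties: { 'jcr:data': 'bar' } },
  },
};

const bundledDoc = bundle(bookJpg, ['jcr:content']);
console.log(JSON.stringify(bundledDoc, null, 2));
```

The resulting keys mirror those of the stored document: {{:pattern}}, {{jcr:primaryType}}, {{jcr:content/:self}} and {{jcr:content/jcr:data}}.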

The bundling format is captured at the time of addition, i.e. when the nodes are created, and is then reused
for later changes. So if the pattern definition is changed later, it does not affect nodes that are already
bundled. At the time of deletion, all such relative properties are set to null.
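
A rough sketch of these two rules, under stated assumptions: {{effectivePatterns}} and {{removeBundledNode}} are hypothetical helpers, the document is shown without revision maps, and deletion is interpreted here as the removal of one bundled child node together with its bundled descendants.

```javascript
// Hypothetical helpers, not Oak's API.

// The pattern stored in the document at creation time wins over the
// currently configured one, so later config changes do not affect it.
function effectivePatterns(doc, configuredPatterns) {
  return doc[':pattern'] !== undefined ? doc[':pattern'] : configuredPatterns;
}

// Builds the update for deleting a bundled node: every relative property
// under that node (including the ':self' marker) is set to null.
function removeBundledNode(doc, relPath) {
  const prefix = relPath + '/';
  const update = {};
  for (const key of Object.keys(doc)) {
    if (key.startsWith(prefix)) update[key] = null;
  }
  return update;
}

// Simplified stored document (no revisions, no type prefixes)
const storedDoc = {
  ':pattern': ['jcr:content'],
  'jcr:primaryType': 'nt:file',
  'jcr:content/:self': true,
  'jcr:content/jcr:data': 'bar',
};

// The configuration has changed since the node was created
const currentConfig = ['jcr:content', 'jcr:content/metadata'];

console.log(effectivePatterns(storedDoc, currentConfig));
console.log(removeBundledNode(storedDoc, 'jcr:content'));
```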

Another example

Given an app:Asset node
{noformat}
/content//banner.png
  - jcr:primaryType = "app:Asset"
  + jcr:content
    - jcr:primaryType = "app:AssetContent"
    + metadata
      - status = "published"
      + xmp
        + 1
          - softwareAgent = "Adobe Photoshop"
          - author = "David"
    + renditions (nt:folder)
      + original (nt:file)
        + jcr:content
          - jcr:data = ...
    + comments (nt:folder)
{noformat}

and the pattern
{noformat}
+ jcr:system/documentstore/bundlor
  + app:Asset
    - pattern - [jcr:content/metadata, jcr:content/renditions, jcr:content/renditions/**, jcr:content]
{noformat}

the node is stored as
{code:javascript}
{
  
  "_children": true,
  "_modified": 1469081925,
  "_id": "2:/test/book.jpg",
  "_commitRoot": {"r1560c1b3db8-0-1": "0"},
  "_deleted": {"r1560c1b3db8-0-1": "false"},

  ":pattern": {
    "r1560c1b3db8-0-1": "[\"str:jcr:content/metadata\",\"str:jcr:content/renditions\",\"str:jcr:content/renditions/**\",\"str:jcr:content\"]"
  },

  
  "jcr:primaryType": {"r1560c1b3db8-0-1": "\"str:app:Asset\""},

  //Relative node jcr:content
  "jcr:content/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:oak:Unstructured\""},

  //Relative node jcr:content/metadata
  "jcr:content/metadata/:self": {"r1560c1b3db8-0-1": "true" },
  "jcr:content/metadata/status": {"r1560c1b3db8-0-1": "\"published\""},
  "jcr:content/metadata/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:oak:Unstructured\""},
  
  //Relative node jcr:content/renditions
  "jcr:content/renditions/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:folder\""},

  //Relative node jcr:content/renditions/original
  "jcr:content/renditions/original/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/original/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:file\""},

  //Relative node jcr:content/renditions/original/jcr:content
  "jcr:content/renditions/original/jcr:content/:self": {"r1560c1b3db8-0-1": "true"},
  "jcr:content/renditions/original/jcr:content/jcr:primaryType": {"r1560c1b3db8-0-1": "\"nam:nt:resource\""},
  "jcr:content/renditions/original/jcr:content/jcr:data": {"r1560c1b3db8-0-1": "\"<data>\""}
}
{code}
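
To illustrate how the {{:self}} markers and relative property names could be read back, here is a small sketch. The helpers {{bundledNodePaths}} and {{propertiesOf}} are hypothetical, and the sample document is a shortened version of the one above with revision maps and type prefixes stripped.

```javascript
// Hypothetical helpers, not Oak's API.

// Lists the relative paths of all bundled nodes via their ':self' markers
function bundledNodePaths(doc) {
  const suffix = '/:self';
  return Object.keys(doc)
    .filter((key) => key.endsWith(suffix))
    .map((key) => key.slice(0, -suffix.length))
    .sort();
}

// Collects the properties that belong directly to one bundled node
function propertiesOf(doc, relPath) {
  const prefix = relPath + '/';
  const props = {};
  for (const [key, value] of Object.entries(doc)) {
    if (!key.startsWith(prefix)) continue;
    const name = key.slice(prefix.length);
    // Skip the marker and anything that belongs to a deeper node
    if (name === ':self' || name.includes('/')) continue;
    props[name] = value;
  }
  return props;
}

// Shortened, simplified version of the stored app:Asset document
const assetDoc = {
  'jcr:primaryType': 'app:Asset',
  'jcr:content/:self': true,
  'jcr:content/jcr:primaryType': 'oak:Unstructured',
  'jcr:content/metadata/:self': true,
  'jcr:content/metadata/status': 'published',
  'jcr:content/metadata/jcr:primaryType': 'oak:Unstructured',
  'jcr:content/renditions/:self': true,
  'jcr:content/renditions/jcr:primaryType': 'nt:folder',
};

console.log(bundledNodePaths(assetDoc));
console.log(propertiesOf(assetDoc, 'jcr:content/metadata'));
```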


> Bundle nodes into a document
> ----------------------------
>
>                 Key: OAK-1312
>                 URL: https://issues.apache.org/jira/browse/OAK-1312
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, documentmk
>            Reporter: Marcel Reutegger
>            Assignee: Chetan Mehrotra
>              Labels: performance
>             Fix For: 1.6
>
>
> For very fine grained content with many nodes and only few properties per node it would be more efficient to bundle multiple nodes into a single MongoDB document. Mostly reading would benefit because there are fewer roundtrips to the backend. At the same time storage footprint would be lower because metadata overhead is per document.



