jackrabbit-commits mailing list archives

From mreut...@apache.org
Subject svn commit: r1835390 [5/23] - in /jackrabbit/site/live/oak/docs: ./ architecture/ coldstandby/ features/ nodestore/ nodestore/document/ nodestore/segment/ oak-mongo-js/ oak_api/ plugins/ query/ security/ security/accesscontrol/ security/authentication/...
Date Mon, 09 Jul 2018 08:53:19 GMT
Modified: jackrabbit/site/live/oak/docs/nodestore/documentmk.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/documentmk.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/documentmk.html (original)
+++ jackrabbit/site/live/oak/docs/nodestore/documentmk.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Oak Document Storage</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -241,116 +241,92 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-  --><h1>Oak Document Storage</h1>
-
+  -->
+<h1>Oak Document Storage</h1>
 <ul>
-  
+
 <li><a href="#oak-document-storage">Oak Document Storage</a>
-  
 <ul>
-    
+
+<li><a href="#new-1.10">New in 1.10</a></li>
 <li><a href="#new-1.8">New in 1.8</a></li>
-    
 <li><a href="#new-1.6">New in 1.6</a></li>
-    
 <li><a href="#backend-implementations">Backend implementations</a></li>
-    
 <li><a href="#content-model">Content Model</a></li>
-    
 <li><a href="#node-content-model">Node Content Model</a></li>
-    
 <li><a href="#revisions">Revisions</a></li>
-    
 <li><a href="#clock-requirements">Clock requirements</a></li>
-    
 <li><a href="#branches">Branches</a></li>
-    
 <li><a href="#previous-documents">Previous Documents</a></li>
-    
 <li><a href="#sweep-revision">Sweep Revision</a></li>
-    
 <li><a href="#node-bundling">Node Bundling</a></li>
-    
 <li><a href="#background-operations">Background Operations</a>
-    
 <ul>
-      
+
 <li><a href="#renew-cluster-id-lease">Renew Cluster Id Lease</a></li>
-      
 <li><a href="#background-document-split">Background Document Split</a></li>
-      
 <li><a href="#background-writes">Background Writes</a></li>
-      
 <li><a href="#bg-read">Background Reads</a></li>
-    </ul></li>
-    
-<li><a href="#pending-topics">Pending Topics</a>
-    
-<ul>
-      
-<li><a href="#conflict-detection-and-handling">Conflict Detection and Handling</a></li>
-    </ul></li>
-    
+</ul>
+</li>
+<li><a href="#metrics">Metrics and Monitoring</a></li>
 <li><a href="#cluster-node-metadata">Cluster Node Metadata</a>
-    
 <ul>
-      
-<li><a href="#rw-preference">Specifying the Read Preference and Write Concern</a>
-      
-<ul>
-        
-<li><a href="#via-configuration">Via Configuration</a></li>
-        
-<li><a href="#changing-at-runtime">Changing at Runtime</a></li>
-      </ul></li>
-    </ul></li>
-    
+
+<li><a href="#acquire-a-cluster-node-id">Acquire a Cluster Node ID</a></li>
+<li><a href="#recovery-for-a-cluster-node-id">Recovery for a Cluster Node ID</a></li>
+<li><a href="#rw-preference">Specifying the Read Preference and Write Concern</a></li>
+</ul>
+</li>
 <li><a href="#cache">Caching</a>
-    
 <ul>
-      
+
 <li><a href="#cache-invalidation">Cache Invalidation</a></li>
-      
 <li><a href="#cache-configuration">Cache Configuration</a></li>
-    </ul></li>
-    
+</ul>
+</li>
 <li><a href="#unlockUpgrade">Unlock upgrade</a></li>
-    
 <li><a href="#revision-gc">Revision Garbage Collection</a></li>
-  </ul></li>
+<li><a href="#pending-topics">Pending Topics</a>
+<ul>
+
+<li><a href="#conflict-detection-and-handling">Conflict Detection and Handling</a></li>
+</ul>
+</li>
+</ul>
+</li>
 </ul>
 <p>One of the plugins in Oak stores data in a document oriented format. The plugin implements the low level <tt>NodeStore</tt> interface.</p>
 <p>The document storage optionally uses the <a href="persistent-cache.html">persistent cache</a> to reduce read operations on the backend storage.</p>
 <div class="section">
-<h2><a name="New_in_1.8"></a><a name="new-1.8"></a> New in 1.8</h2>
+<h2><a name="New_in_1.10"></a><a name="new-1.10"></a> New in 1.10</h2>
+<ul>
 
+<li>Use of MongoDB client sessions. See also <a href="document/mongo-document-store.html#read-preference">read preference</a>.</li>
+<li><a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-7316">Greedy cluster node info</a>. See also <a href="#acquire-a-cluster-node-id">Acquire a Cluster Node ID</a>.</li>
+</ul></div>
+<div class="section">
+<h2><a name="New_in_1.8"></a><a name="new-1.8"></a> New in 1.8</h2>
 <ul>
-  
+
 <li><a href="#sweep-revision">Sweep Revision</a></li>
-  
 <li><a href="#unlockUpgrade">Unlock upgrade</a></li>
-  
 <li><a href="#revision-gc">Continuous and oak-run triggered Revision GC</a></li>
 </ul></div>
 <div class="section">
 <h2><a name="New_in_1.6"></a><a name="new-1.6"></a> New in 1.6</h2>
-
 <ul>
-  
+
 <li><a href="#node-bundling">Node Bundling</a></li>
-  
 <li><a href="#secondary-store">Secondary Store</a></li>
 </ul></div>
 <div class="section">
 <h2><a name="Backend_implementations"></a><a name="backend-implementations"></a> Backend implementations</h2>
 <p>The DocumentNodeStore supports a number of backends, with a storage abstraction called <tt>DocumentStore</tt>:</p>
-
 <ul>
-  
+
 <li><a href="document/mongo-document-store.html"><tt>MongoDocumentStore</tt></a>: stores documents in MongoDB.</li>
-  
 <li><tt>RDBDocumentStore</tt>: stores documents in a relational database.</li>
-  
 <li><tt>MemoryDocumentStore</tt>: keeps documents in memory. This implementation should only be used for testing purposes.</li>
 </ul>
 <p>The remaining part of the document will focus on the <tt>MongoDocumentStore</tt> to explain and illustrate concepts of the DocumentNodeStore.</p></div>
@@ -361,21 +337,24 @@
 <p>Cluster wide information is stored in the <tt>settings</tt> collection. This includes checkpoints, journal and revision GC status, format version and the current cluster view.</p>
 <p>The data can be viewed using the MongoDB shell:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">&gt; show collections
+<div>
+<div>
+<pre class="source">&gt; show collections
 blobs
 clusterNodes
 journal
 nodes
 settings
-</pre></div></div></div>
+</pre></div></div>
+</div>
 <div class="section">
 <h2><a name="Node_Content_Model"></a><a name="node-content-model"></a> Node Content Model</h2>
 <p>The <tt>DocumentNodeStore</tt> stores each node in a separate MongoDB document and updates to a node are stored by adding new revision/value pairs to the document. This way the previous state of a node is preserved and can still be retrieved by a session looking at a given snapshot (revision) of the repository.</p>
 <p>The basic MongoDB document of a node in Oak looks like this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_id&quot; : &quot;1:/node&quot;,
     &quot;_deleted&quot; : {
         &quot;r13f3875b5d1-0-1&quot; : &quot;false&quot;
@@ -391,6 +370,7 @@ settings
     }
 }
 </pre></div></div>
+
 <p>All fields in the above document are metadata and are not exposed through the Oak API. The DocumentNodeStore has two types of fields. Simple fields are key/value pairs like the <tt>_id</tt> or <tt>_modified</tt> field. Versioned fields are kept in sub-documents where the key is a revision paired with the value at this revision.</p>
 <p>The <tt>_id</tt> field is used as the primary key and consists of a combination of the depth of the path and the path itself. This is an optimization to align sibling keys in the index.</p>
 <p>The <tt>_deleted</tt> sub-document contains the revision this node was created in. In the above example the root node was created in revision <tt>r13f3875b5d1-0-1</tt>. If the node is later deleted, the <tt>_deleted</tt> sub-document will get a new field with the revision the node was deleted in.</p>
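The <tt>_id</tt> layout described above (path depth, a colon, then the path) can be illustrated with a minimal sketch. The function name `document_id` is hypothetical and is not part of the Oak API; the sketch only covers plain absolute paths such as those in the examples (`0:/`, `1:/node`) and ignores any special handling the real implementation may apply.

```python
def document_id(path: str) -> str:
    """Build a depth-prefixed document id such as '1:/node' (illustrative only)."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    # The root path "/" has depth 0; any other path has one level per '/' separator.
    depth = 0 if path == "/" else path.count("/")
    return f"{depth}:{path}"


if __name__ == "__main__":
    for p in ("/", "/node", "/a/b/c"):
        print(p, "->", document_id(p))
```

Because the depth comes first, ids of sibling nodes share a common prefix and sort next to each other in the index, which is the optimization the paragraph above refers to.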
@@ -401,8 +381,9 @@ settings
 <p>Finally, the <tt>_revisions</tt> sub-document contains commit information about changes marked with a revision. E.g. the single entry in the above document tells us that everything marked with revision <tt>r13f3875b5d1-0-1</tt> is committed and therefore valid. In case the change is done in a branch then the value would be the base revision. It is only added for those nodes which happen to be the commit root for any given commit.</p>
 <p>Adding a property <tt>prop</tt> with value <tt>foo</tt> to the node in a next step will result in the following document:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_deleted&quot; : {
         &quot;r13f3875b5d1-0-1&quot; : &quot;false&quot;
     },
@@ -421,11 +402,13 @@ settings
     }
 }
 </pre></div></div>
+
 <p>Now the document contains a new sub-document with the name of the new property. The value of the property is annotated with the revision in which the property was set. With each successful commit to this node, a new field is added to the <tt>_revisions</tt> sub-document. Similarly, the <tt>_lastRev</tt> sub-document and <tt>_modified</tt> field are updated.</p>
 <p>After the node is deleted the document looks like this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_deleted&quot; : {
         &quot;r13f3875b5d1-0-1&quot; : &quot;false&quot;,
         &quot;r13f38835063-2-1&quot; : &quot;true&quot;
@@ -447,18 +430,16 @@ settings
     }
 }
 </pre></div></div>
+
 <p>The <tt>_deleted</tt> sub-document now contains a <tt>r13f38835063-2-1</tt> field marking the node as deleted in this revision.</p>
 <p>Reading the node in previous revisions is still possible, even if it is now marked as deleted as of revision <tt>r13f38835063-2-1</tt>.</p></div>
 <div class="section">
 <h2><a name="Revisions"></a><a name="revisions"></a> Revisions</h2>
 <p>As seen in the examples above, a revision is a String and may look like this: <tt>r13f38835063-2-1</tt>. It consists of three parts:</p>
-
 <ul>
-  
+
 <li>A timestamp derived from the system time of the machine it was generated on: <tt>13f38835063</tt></li>
-  
 <li>A counter to distinguish revisions created with the same timestamp: <tt>-2</tt></li>
-  
 <li>The cluster node id where this revision was created: <tt>-1</tt></li>
 </ul></div>
 <div class="section">
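The three-part revision string described in the Revisions section can be taken apart with a short sketch. The names `Revision` and `parse_revision` are hypothetical helpers, not Oak classes. The sketch assumes all three parts are hexadecimal: the documentation states this for the cluster node ID part, while treating the timestamp and counter as hexadecimal is an assumption based on the example values.

```python
from typing import NamedTuple


class Revision(NamedTuple):
    timestamp_ms: int   # system time of the machine that created the revision
    counter: int        # distinguishes revisions created with the same timestamp
    cluster_id: int     # cluster node id where the revision was created


def parse_revision(rev: str) -> Revision:
    """Split a string like 'r13f38835063-2-1' into its three parts."""
    if not rev.startswith("r"):
        raise ValueError(f"not a revision string: {rev!r}")
    parts = rev[1:].split("-")
    if len(parts) != 3:
        raise ValueError(f"expected three parts: {rev!r}")
    # Assumption: every part is base 16.
    timestamp_ms, counter, cluster_id = (int(p, 16) for p in parts)
    return Revision(timestamp_ms, counter, cluster_id)


if __name__ == "__main__":
    print(parse_revision("r13f38835063-2-1"))
```

Parsing the cluster node ID as base 16 matters when matching a revision to an entry in the <tt>clusterNodes</tt> collection, where the <tt>_id</tt> is written in base 10.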
@@ -469,8 +450,9 @@ settings
 <p>The DocumentNodeStore implementation supports branches, which allow a client to stage multiple commits and make them visible with a single merge call. A branch commit looks very similar to a regular commit, but instead of setting the value of an entry in <tt>_revisions</tt> to <tt>c</tt> (committed), it marks it with the base revision of the branch commit. In contrast to regular commits where the commit root is the common ancestor of all nodes modified in a commit, the commit root of a branch commit is always the root node. This is because a branch will likely have multiple commits and a commit root must already be known when the first commit happens on a branch. To make sure the following branch commits can use the same commit root, the DocumentNodeStore simply picks the root node, which always works in this case.</p>
 <p>A root node may look like this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_deleted&quot; : {
         &quot;r13fcda88ac0-0-1&quot; : &quot;false&quot;,
     },
@@ -489,10 +471,12 @@ settings
     }
 }
 </pre></div></div>
+
 <p>The root node was created in revision <tt>r13fcda88ac0-0-1</tt> and later in revision <tt>r13fcda91720-0-1</tt> property <tt>prop</tt> was set to <tt>foo</tt>. To keep the example simple, we now assume a branch is created based on the revision the root node was last modified and a branch commit is done to modify the existing property. After the branch commit the root node looks like this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_deleted&quot; : {
         &quot;r13fcda88ac0-0-1&quot; : &quot;false&quot;,
     },
@@ -516,12 +500,14 @@ settings
     }
 }
 </pre></div></div>
+
 <p>Note, the <tt>_bc</tt> sub-document was introduced with Oak 1.8 and is not present in older versions. The branch commit revision is added to <tt>_bc</tt> whenever a change is done on a document with a branch commit. This helps the DocumentNodeStore to more easily identify branch commit changes.</p>
 <p>At this point the modified property is only visible to a reader when it reads with the branch revision <tt>r13fcda919eb-0-1</tt> because the revision is marked with the base version of this commit in the <tt>_revisions</tt> sub-document. Note, the <tt>_lastRev</tt> is not updated for branch commits but only when a branch is merged.</p>
 <p>When the branch is later merged, the root node will look like this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_deleted&quot; : {
         &quot;r13fcda88ac0-0-1&quot; : &quot;false&quot;,
     },
@@ -545,14 +531,16 @@ settings
     }
 }
 </pre></div></div>
+
 <p>Now, the changed property is visible to readers with a revision equal to or newer than <tt>r13fcda91b12-0-1</tt>.</p>
 <p>The same logic is used for changes to other nodes that belong to a branch commit. The DocumentNodeStore internally resolves the commit revision for a modification before it decides whether a reader is able to see a given change.</p></div>
 <div class="section">
 <h2><a name="Previous_Documents"></a><a name="previous-documents"></a> Previous Documents</h2>
 <p>Over time the size of a document grows because the DocumentNodeStore adds data to the document with every modification, but never deletes anything to keep the history. Old data is moved when there are 100 commits to be moved or the document is bigger than 1 MB. A document with a reference to old data looks like this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_deleted&quot; : {
         &quot;r13fcda88ac0-0-1&quot; : &quot;false&quot;,
     },
@@ -579,10 +567,12 @@ settings
     }
 }
 </pre></div></div>
+
 <p>The optional sub-document <tt>_prev</tt> contains a list of revision pairs, each indicating the range of commit revisions a previous document contains. In the above example there is one document with previous commits from <tt>r13fcda88ae0-0-1</tt> to <tt>r13fcda91710-0-1</tt>. The id of the previous document is derived from the upper bound of the range and the id/path of the current document. The id of the previous document for <tt>r13fcda88ae0-0-1</tt> and <tt>0:/</tt> is <tt>1:p/r13fcda88ae0-0-1</tt> and may look like this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_id&quot; : &quot;1:p/r13fcda88ae0-0-1&quot;,
     &quot;_modCount&quot; : NumberLong(1),
     &quot;_revisions&quot; : {
@@ -599,6 +589,7 @@ settings
     }
 }
 </pre></div></div>
+
 <p>Previous documents only contain immutable data, which means they only contain committed and merged <tt>_revisions</tt>. This also means the previous ranges of committed data may overlap because branch commits are not moved to previous documents until the branch is merged.</p></div>
 <div class="section">
 <h2><a name="Sweep_Revision"></a><a name="sweep-revision"></a> Sweep Revision</h2>
@@ -606,8 +597,9 @@ settings
 <p>With Oak 1.8 the concept of a sweep revision was introduced in the DocumentNodeStore. The sweep revision of a DocumentNodeStore indicates up to which revision non-branch changes are guaranteed to be committed. This allows read operations to be optimized because a lookup of the commit root document can be avoided in most cases. It also means the Revision Garbage Collector can remove previous documents that contain <tt>_revisions</tt> entries if they are all older than the sweep revision.</p>
 <p>The sweep revision is maintained per DocumentNodeStore instance on the root document. Below is the root document already presented above, amended with the sweep revision.</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
     &quot;_deleted&quot; : {
         &quot;r13fcda88ac0-0-1&quot; : &quot;false&quot;,
     },
@@ -628,7 +620,8 @@ settings
         &quot;r13fcda91720-0-1&quot; : &quot;\&quot;foo\&quot;&quot;
     }
 }
-</pre></div></div></div>
+</pre></div></div>
+</div>
 <div class="section">
 <h2><a name="node-bundling"></a> Node Bundling</h2>
 <p><tt>@since Oak 1.6</tt></p>
@@ -649,87 +642,105 @@ settings
 <h3><a name="Background_Reads"></a><a name="bg-read"></a> Background Reads</h3>
 <p>The DocumentNodeStore periodically picks up changes from other DocumentNodeStore instances by polling the root node for changes of <tt>_lastRev</tt>. This happens once every second.</p></div></div>
 <div class="section">
-<h2><a name="Pending_Topics"></a><a name="pending-topics"></a> Pending Topics</h2>
-<div class="section">
-<h3><a name="Conflict_Detection_and_Handling"></a><a name="conflict-detection-and-handling"></a> Conflict Detection and Handling</h3></div></div>
+<h2><a name="Metrics_and_Monitoring"></a><a name="metrics"></a> Metrics and Monitoring</h2>
+<p>See <a href="document/metrics.html">DocumentNodeStore and DocumentStore metrics</a>.</p></div>
 <div class="section">
 <h2><a name="Cluster_Node_Metadata"></a><a name="cluster-node-metadata"></a> Cluster Node Metadata</h2>
-<p>Cluster node metadata is stored in the <tt>clusterNodes</tt> collection. There is one entry for each cluster node that is running, and there are entries for cluster nodes that were ran. Old entries are kept so that if a cluster node is started again, it gets the same cluster node id as before (which is not strictly needed for consistency, but nice for support, if one would want to find out which change originated from which cluster node).</p>
-<p>Each running cluster node updates the lease end time of the cluster node id every ten seconds, to ensure each cluster node uses a different cluster node id.</p>
+<p>Cluster node metadata is stored in the <tt>clusterNodes</tt> collection. There is one entry for each cluster node that is running, and there may be entries for cluster nodes that were running in the past. Old entries are kept so that if a cluster node is started again, it gets the same cluster node ID as before (which is not strictly needed for consistency, but nice for support, if one would want to find out which change originated from which cluster node). Starting with Oak 1.10, acquiring a cluster node ID changed slightly. A cluster node may now also acquire an inactive cluster node ID created by another cluster node.</p>
+<p>The entries of a <tt>clusterNodes</tt> collection may look like this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">&gt; db.clusterNodes.find().pretty()
+<div>
+<div>
+<pre class="source">&gt; db.clusterNodes.find().pretty()
 
 {
-    &quot;_id&quot; : &quot;1&quot;,
-    &quot;_modCount&quot; : NumberLong(2),
-    &quot;leaseEnd&quot; : NumberLong(&quot;1390465250135&quot;),
-    &quot;instance&quot; : &quot;/Users/test/jackrabbit/oak/trunk/oak-jcr&quot;,
-    &quot;machine&quot; : &quot;mac:20c9d043f141&quot;,
-    &quot;info&quot; : &quot;...pid: 11483, uuid: 6b6e8e4f-8322-4b19-a2b2-de0c573620b9 ...&quot;
+	&quot;_id&quot; : &quot;1&quot;,
+	&quot;_modCount&quot; : NumberLong(490),
+	&quot;state&quot; : &quot;ACTIVE&quot;,
+	&quot;leaseEnd&quot; : NumberLong(&quot;1390465250135&quot;),
+	&quot;instance&quot; : &quot;/home/oak&quot;,
+	&quot;machine&quot; : &quot;mac:20c9d043f141&quot;,
+	&quot;info&quot; : &quot;...pid: 983, uuid: 6b6e8e4f-8322-4b19-a2b2-de0c573620b9 ...&quot;
 }
 {
-    &quot;_id&quot; : &quot;2&quot;,
-    &quot;_modCount&quot; : NumberLong(2),
-    &quot;leaseEnd&quot; : NumberLong(&quot;1390465252206&quot;),
-    &quot;instance&quot; : &quot;/Users/mueller/jackrabbit/oak/trunk/oak-jcr&quot;,
-    &quot;machine&quot; : &quot;mac:20c9d043f141&quot;,
-    &quot;info&quot; : &quot;...pid: 11483, uuid: 28ada13d-ec9c-4d48-aeb9-cef53aa4bb9e ...&quot;
+	&quot;_id&quot; : &quot;2&quot;,
+	&quot;_modCount&quot; : NumberLong(492),
+	&quot;state&quot; : &quot;ACTIVE&quot;,
+	&quot;leaseEnd&quot; : NumberLong(&quot;1390465255192&quot;),
+	&quot;instance&quot; : &quot;/home/oak&quot;,
+	&quot;machine&quot; : &quot;mac:30c3d053f247&quot;,
+	&quot;info&quot; : &quot;...pid: 843, uuid: 28ada13d-ec9c-4d48-aeb9-cef53aa4bb9e ...&quot;
 }
 </pre></div></div>
-<p>The <tt>_id</tt> is the cluster node id of the node, which is the last part of the revision id. The <tt>leaseEnd</tt> is updated every ten seconds by running cluster nodes. It is the number of milliseconds since 1970. The <tt>instance</tt> is the current working directory. The <tt>machine</tt> is the lowest number of the network addresses, or a random uuid if this is not available. The <tt>info</tt> contains the same info as a string, plus additionally the process id and the uuid.</p>
+
+<p>In the above example, there are two active cluster nodes running with IDs <tt>1</tt> and <tt>2</tt>. The <tt>_id</tt> corresponds to the last part of the revisions generated by a cluster node. Please note, the <tt>_id</tt> representation is base 10, while the ID part of a revision is base 16! The <tt>instance</tt> is the current working directory and the <tt>machine</tt> is the lowest number of an active network adapter&#x2019;s MAC address. If no active network adapter is available, then the value for the <tt>machine</tt> field will be a random UUID. The <tt>info</tt> field contains the same info as a string, plus additional information like the process ID.</p>
+<p>Each running cluster node updates the <tt>leaseEnd</tt> time of the cluster node ID every ten seconds, to ensure each cluster node uses a different cluster node ID. The time is the number of milliseconds since 1970 and with every update is set two minutes ahead of the current time. This lease mechanism allows other cluster nodes to identify active, inactive and crashed cluster nodes.</p>
+<p>The diagram shows the different states a cluster node entry can be in.</p>
+<p><img src="document/cluster-node-lease.png" alt="Cluster node ID state diagram" /></p>
+<div class="section">
+<h3><a name="Acquire_a_cluster_node_ID"></a><a name="acquire-a-cluster-node-id"></a> Acquire a cluster node ID</h3>
+<p>A cluster node can acquire an ID in several ways.</p>
+<p>In the simplest case there are no existing entries in the <tt>clusterNodes</tt> collection and the cluster node will create a new active entry with <tt>_id=&quot;1&quot;</tt>. The <tt>leaseEnd</tt> will already be set to a value higher than the current time. This entry is now considered active and in use. Similarly, when a second cluster node starts up, it will create a new active entry with <tt>_id=&quot;2&quot;</tt> and so on for more cluster nodes.</p>
+<p>When a cluster node is shut down, the cluster node ID is released and put into the inactive state. This is reflected in the entry with a <tt>state</tt> and <tt>leaseEnd</tt> field set to <tt>null</tt>. On startup, the cluster node will re-acquire the same entry because the <tt>machine</tt> and <tt>instance</tt> field match its environment.</p>
+<p>Immediately restarting a crashed cluster node will lead to a somewhat delayed startup, because the cluster node will find a matching and active cluster node ID. In this case, the cluster node will wait until the lease expires (up to two minutes if the process crashed right after the lease was extended) and then run the recovery process for the cluster node ID. Depending on timing, the recovery may also be started by another active cluster node. In this case, the starting cluster node would wait up to one minute for the recovery to finish. Either way, if the recovery was successful, the cluster node ID will have transitioned to the inactive state and can be acquired again as described before.</p>
+<p>When a new cluster node is started and there is an inactive entry, then the cluster node will try to acquire it, even when its environment does not match the <tt>machine</tt> and <tt>instance</tt> fields. This behaviour is new and was introduced with Oak 1.10. Previous versions ignore entries that do not match the environment and would create a new entry.</p></div>
+<div class="section">
+<h3><a name="Recovery_for_a_cluster_node_ID"></a><a name="recovery-for-a-cluster-node-id"></a> Recovery for a cluster node ID</h3>
+<p>Recovery becomes necessary when the lease on a cluster node ID entry expires. This usually happens when the process that acquired the cluster node ID crashes, but the lease may also expire if the cluster node fails to extend the lease in time. In the latter case, the cluster node is obligated to stop any further operations on the document store. The current implementation does this by blocking operations on the document store level and stopping the oak-store-document bundle when it detects an outdated lease. Other active cluster nodes or the restarted cluster node are then responsible for running recovery for the relevant cluster node ID and setting the state back to inactive.</p>
+<p>Before a cluster node can run the recovery process, the recovery lock on the cluster node ID entry must be acquired. This lock again is protected with a lease to detect a crashed cluster node that was performing recovery and left behind a recovery lock. Other cluster nodes will therefore check whether the cluster node ID identified by <tt>recoveryBy</tt> is still active and try to break the recovery lock if the recovering cluster node is considered inactive or expired.</p>
+<p>There is a special case when a starting cluster node performs the recovery for itself, that is, for the cluster node ID it wants to acquire but must first recover. In this case the lease is only updated once for the cluster node ID entry that needs recovery. This happens when the recovery lock is set on the entry. The starting cluster node then must finish the recovery within this initial lease deadline, otherwise the recovery will be considered failed and the starting cluster node will acquire a different (potentially new) ID. The failed recovery will then be performed later by a background job of one of the active cluster nodes.</p></div>
 <div class="section">
 <h3><a name="Specifying_the_Read_Preference_and_Write_Concern"></a><a name="rw-preference"></a> Specifying the Read Preference and Write Concern</h3>
 <p>See <a href="document/mongo-document-store.html#configuration">configuration</a> of a <tt>MongoDocumentStore</tt>.</p></div></div>
 <div class="section">
 <h2><a name="Caching"></a><a name="cache"></a> Caching</h2>
 <p><tt>DocumentNodeStore</tt> maintains multiple caches to provide optimum performance. By default the cached instances are kept in heap memory but some of the caches can be backed by <a href="persistent-cache.html">persistent cache</a>.</p>
-
 <ol style="list-style-type: decimal">
-  
+
 <li>
+
 <p><tt>documentCache</tt> - Document cache is used for caching the <tt>NodeDocument</tt> instances. These are in-memory representations of the persistent state. For example, in the case of Mongo it maps to the Mongo document in the <tt>nodes</tt> collection and for RDB it maps to the row in the <tt>NODES</tt> table. There is a class of <tt>NodeDocument</tt> (leaf level split documents) which, since <tt>1.3.15</tt>, are cached under <tt>prevDocCache</tt> (see below).</p>
-<p>Depending on the <tt>DocumentStore</tt> implementation different heuristics are applied for invalidating the cache entries based on changes in backend </p></li>
-  
+<p>Depending on the <tt>DocumentStore</tt> implementation different heuristics are applied for invalidating the cache entries based on changes in backend</p>
+</li>
 <li>
+
 <p><tt>prevDocCache</tt> - Previous document cache is used for caching the <tt>NodeDocument</tt> instances representing leaf level split documents. Unlike other types of <tt>NodeDocument</tt>, these are immutable and hence don&#x2019;t require invalidation. If configured, this cache can exploit persistent cache as well. Similar to other <tt>NodeDocument</tt> instances, these are also in-memory representations of the persistent state. (since <tt>1.3.15</tt>)</p>
-<p>Depending on the <tt>DocumentStore</tt> implementation different heuristics are applied for invalidating the cache entries based on changes in backend </p></li>
-  
+<p>Depending on the <tt>DocumentStore</tt> implementation different heuristics are applied for invalidating the cache entries based on changes in backend</p>
+</li>
 <li>
-<p><tt>docChildrenCache</tt> - Document Children cache is used to cache the children state for a given parent node. This is invalidated completely upon every background read. This cache was removed in 1.5.6.</p></li>
-  
+
+<p><tt>docChildrenCache</tt> - Document Children cache is used to cache the children state for a given parent node. This is invalidated completely upon every background read. This cache was removed in 1.5.6.</p>
+</li>
 <li>
-<p><tt>nodeCache</tt> - Node cache is used to cache the <tt>DocumentNodeState</tt> instances. These are <b>immutable</b> view of <tt>NodeDocument</tt> as seen at a given revision hence no consistency checks are to be performed for them</p></li>
-  
+
+<p><tt>nodeCache</tt> - Node cache is used to cache the <tt>DocumentNodeState</tt> instances. These are <b>immutable</b> views of <tt>NodeDocument</tt> as seen at a given revision, hence no consistency checks need to be performed for them.</p>
+</li>
 <li>
-<p><tt>childrenCache</tt> - Children cache is used to cache the children for a given node. These are also <b>immutable</b> and represent the state of children for a given parent at certain revision</p></li>
-  
+
+<p><tt>childrenCache</tt> - Children cache is used to cache the children for a given node. These are also <b>immutable</b> and represent the state of children for a given parent at a certain revision.</p>
+</li>
 <li>
-<p><tt>diffCache</tt> - Caches the diff for the changes done between successive revision.  For local changes done the diff is add to the cache upon commit while for  external changes the diff entries are added upon computation of diff as part  of observation call</p></li>
+
+<p><tt>diffCache</tt> - Caches the diff for the changes done between successive revisions. For local changes the diff is added to the cache upon commit, while for external changes the diff entries are added upon computation of the diff as part of an observation call.</p>
+</li>
 </ol>
-<p>All the above caches are managed on heap. For the last 3 <tt>nodeCache</tt>, <tt>childrenCache</tt> and <tt>diffCache</tt> Oak provides support for <a href="persistent-cache.html">persistent cache</a>. By enabling the persistent cache feature Oak can manage a much larger cache off heap and thus avoid freeing up heap memory for application usage.</p>
+<p>All the above caches are managed on heap. For the last three, <tt>nodeCache</tt>, <tt>childrenCache</tt> and <tt>diffCache</tt>, Oak provides support for a <a href="persistent-cache.html">persistent cache</a>. By enabling the persistent cache feature Oak can manage a much larger cache off heap and thus free up heap memory for application usage.</p>
 <div class="section">
 <h3><a name="Cache_Invalidation"></a><a name="cache-invalidation"></a> Cache Invalidation</h3>
-<p><tt>documentCache</tt> and <tt>docChildrenCache</tt> are containing mutable state which requires consistency checks to be performed to keep them in sync with the backend persisted state. Oak uses a MVCC model under which it maintains a consistent view of a given Node at a given revision. This allows using local cache instead of using a global clustered cache where changes made by any other cluster node need not be instantly reflected on all other nodes. </p>
-<p>Each cluster node periodically performs <a href="#bg-read">background reads</a> to pickup changes done by other cluster nodes. At that time it performs <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-1156">consistency check</a> to ensure that cached instance state reflect the state in the backend persisted state. Performing the check would take some time would be proportional number of entries present in the cache. </p>
+<p><tt>documentCache</tt> and <tt>docChildrenCache</tt> contain mutable state, which requires consistency checks to keep them in sync with the backend persisted state. Oak uses an MVCC model under which it maintains a consistent view of a given Node at a given revision. This allows using a local cache instead of a global clustered cache, because changes made by any other cluster node need not be instantly reflected on all other nodes.</p>
+<p>Each cluster node periodically performs <a href="#bg-read">background reads</a> to pick up changes done by other cluster nodes. At that time it performs a consistency check (<a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-1156">OAK-1156</a>) to ensure that the cached instance state reflects the backend persisted state. The time taken by the check is proportional to the number of entries present in the cache.</p>
 <p>For the repository to work properly it is important to ensure that such background reads do not consume much time, and <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-2646">work is underway</a> to improve the current approach. <i>To ensure that such background operations (which include the cache invalidation checks) perform quickly, one should not set a large size for these caches</i>.</p>
-<p>All other caches consist of immutable state and hence no cache invalidation needs to be performed for them. For that reason those caches can be backed by persistent cache and even having large number of entries in such caches would not be a matter of concern. </p></div>
+<p>All other caches consist of immutable state and hence no cache invalidation needs to be performed for them. For that reason those caches can be backed by persistent cache and even having large number of entries in such caches would not be a matter of concern.</p></div>
 <div class="section">
 <h3><a name="Cache_Configuration"></a><a name="cache-configuration"></a> Cache Configuration</h3>
 <p>In a default setup the <a href="../osgi_config.html#document-node-store">DocumentNodeStoreService</a> takes a single config for <tt>cache</tt>, which is internally distributed among the various caches above in the following way:</p>
-
 <ol style="list-style-type: decimal">
-  
+
 <li><tt>nodeCache</tt> - 35% (was 25% until 1.5.14)</li>
-  
 <li><tt>prevDocCache</tt> - 4%</li>
-  
 <li><tt>childrenCache</tt> - 15% (was 10% until 1.5.14)</li>
-  
 <li><tt>diffCache</tt> - 30% (was 4% until 1.5.14)</li>
-  
 <li><tt>documentCache</tt> - Is given the rest i.e. 16%</li>
-  
 <li><tt>docChildrenCache</tt> - 0% (removed in 1.5.6, default was 3%)</li>
 </ol>
 <p>Lately <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-2546">options are provided</a> to have a fine grained control over the distribution. See <a href="../osgi_config.html#cache-allocation">Cache Allocation</a></p>
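The default distribution listed above can be illustrated with a little arithmetic. The following is a minimal sketch, not Oak code: the function name `split_cache`, the use of MB as the unit and the integer rounding are assumptions made for illustration only.

```python
# Default split of the single `cache` setting, in percent, as listed above.
# documentCache is not listed here because it receives the remainder.
DEFAULT_PERCENTAGES = {
    "nodeCache": 35,
    "prevDocCache": 4,
    "childrenCache": 15,
    "diffCache": 30,
}


def split_cache(total_mb):
    """Return a hypothetical per-cache allocation (in MB) for a total size."""
    sizes = {
        name: total_mb * percent // 100
        for name, percent in DEFAULT_PERCENTAGES.items()
    }
    # Whatever is left after the fixed shares goes to documentCache (16%).
    sizes["documentCache"] = total_mb - sum(sizes.values())
    return sizes
```

Because <tt>documentCache</tt> is one of the caches that needs invalidation checks, keeping the total small also keeps its share small.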
@@ -740,9 +751,11 @@ settings
 <p>On startup the DocumentNodeStore checks if its version is compatible with the format version currently in use. A read-only DocumentNodeStore can read the current version as well as older versions. A read-write DocumentNodeStore on the other hand can only write to the DocumentStore when the format version matches its own version. The DocumentNodeStore maintains this format version in the <tt>settings</tt> collection accessible to all cluster nodes.</p>
 <p>Upgrading to a newer Oak version may therefore first require an update of the format version before a newer version of a DocumentNodeStore can be started on existing data. The oak-run tool contains an <tt>unlockUpgrade</tt> mode to perform this operation. Use the oak-run tool with the version matching the target upgrade version to unlock an upgrade with the following command. The example below unlocks an upgrade to 1.8 with a DocumentNodeStore on MongoDB:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">&gt; java -jar oak-run-1.8.0.jar unlockUpgrade mongodb://example.com:27017/oak
+<div>
+<div>
+<pre class="source">&gt; java -jar oak-run-1.8.0.jar unlockUpgrade mongodb://example.com:27017/oak
 </pre></div></div>
+
 <p>Please note that unlocking an upgrade is only possible when all cluster nodes are inactive, otherwise the command will refuse to change the format version.</p>
 <p>See also detailed instructions for various <a href="document/upgrade.html">upgrade</a> paths.</p></div>
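The startup compatibility rule described in this section can be sketched as follows. This is only an illustration under stated assumptions: the function `can_start` and the representation of versions as tuples are hypothetical and are not part of Oak.

```python
def can_start(own_version, store_version, read_only):
    """Decide whether a DocumentNodeStore may start on existing data.

    Versions are modelled as comparable tuples such as (1, 8); this
    representation is an assumption of the sketch.
    """
    if read_only:
        # A read-only store can read its own format version and older ones.
        return store_version <= own_version
    # A read-write store requires the format version to match its own.
    return store_version == own_version
```

In this model, a read-write 1.8 store on 1.6 data is rejected until the format version has been updated, which is the situation <tt>unlockUpgrade</tt> addresses.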
 <div class="section">
@@ -752,19 +765,17 @@ settings
 <div class="section">
 <h2><a name="Revision_Garbage_Collection"></a><a name="revision-gc"></a> Revision Garbage Collection</h2>
 <p>As described in the section <a href="#node-content-model">Node Content Model</a>, the DocumentNodeStore does not overwrite existing data but adds it to an existing document when a property is updated. Cleaning up old data, which is not needed anymore is done with a process called <tt>Revision Garbage Collection</tt>. Depending on deployment this process does not run automatically and must be triggered periodically by the application. The garbage collection process adds some pressure on the system, so the application should trigger it when it is most convenient. E.g. at night, when systems are usually not that busy. It is usually sufficient to run it once a day. There are several ways how the revision garbage collection can be triggered:</p>
-
 <ul>
-  
+
 <li>Call <tt>startRevisionGC()</tt> on the <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/api/jmx/RepositoryManagementMBean.html">RepositoryManagementMBean</a></li>
-  
 <li>Call <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/document/VersionGarbageCollector.html#gc-long-java.util.concurrent.TimeUnit-">gc()</a> on the <tt>VersionGarbageCollector</tt> obtained from the <tt>DocumentNodeStore</tt> instance</li>
-  
 <li>Use the oak-run runnable jar file with the <tt>revisions</tt> run mode (<tt>@since Oak 1.8</tt>).</li>
 </ul>
 <p>The first two options are not described in more detail, because both of them are simple method calls. The third option comes with some sub commands as described below when oak-run with the <tt>revisions</tt> run mode is invoked without parameters or options:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">revisions mongodb://host:port/database &lt;sub-command&gt; [options]
+<div>
+<div>
+<pre class="source">revisions mongodb://host:port/database &lt;sub-command&gt; [options]
 where sub-command is one of
   info     give information about the revisions state without performing
            any modifications
@@ -792,13 +803,20 @@ Option                 Description
                          seconds (default: -1)
 --verbose              print INFO messages to the console
 </pre></div></div>
+
 <p>A revision garbage collection can be invoked while the system is online and running. Using the oak-run runnable jar, a revision GC on a system using the MongoDB backend can be initiated with:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">java -jar oak-run-1.8.0.jar revisions mongodb://localhost:27017/oak collect
+<div>
+<div>
+<pre class="source">java -jar oak-run-1.8.0.jar revisions mongodb://localhost:27017/oak collect
 </pre></div></div>
+
 <p>This will collect changes identified as garbage that are older than 24 hours.</p>
-<p>Starting with Oak 1.8 the DocumentNodeStoreService can trigger Revision Garbage Collection (RGC) automatically. The default schedule depends on the type of backend. On RDB the service will not schedule a RGC, which is the same behavior as in previous Oak versions. Whereas on MongoDB the RGC runs every five seconds. The latter is also known as <tt>Continuous Revision Garbage Collection</tt>. In this mode, the RGC will not log every run but only write an INFO message every hour summarizing the GC cycles for the past hour. For more details, see also the <a href="../osgi_config.html#document-node-store">OSGi configuration</a> page. </p></div>
+<p>Starting with Oak 1.8 the DocumentNodeStoreService can trigger Revision Garbage Collection (RGC) automatically. The default schedule depends on the type of backend. On RDB the service will not schedule an RGC, which is the same behavior as in previous Oak versions, whereas on MongoDB the RGC runs every five seconds. The latter is also known as <tt>Continuous Revision Garbage Collection</tt>. In this mode, the RGC will not log every run but only write an INFO message every hour summarizing the GC cycles for the past hour. For more details, see also the <a href="../osgi_config.html#document-node-store">OSGi configuration</a> page.</p></div>
+<div class="section">
+<h2><a name="Pending_Topics"></a><a name="pending-topics"></a> Pending Topics</h2>
+<div class="section">
+<h3><a name="Conflict_Detection_and_Handling"></a><a name="conflict-detection-and-handling"></a> Conflict Detection and Handling</h3></div></div>
         </div>
       </div>
     </div>

Modified: jackrabbit/site/live/oak/docs/nodestore/overview.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/overview.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/overview.html (original)
+++ jackrabbit/site/live/oak/docs/nodestore/overview.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Node Storage</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -241,7 +241,8 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-  --><h1>Node Storage</h1>
+  -->
+<h1>Node Storage</h1>
 <p>Oak comes with two node storage flavours: <a href="segment/overview.html">Segment</a> and <a href="documentmk.html">Document</a>. Segment storage is optimised for maximal performance in standalone deployments, and document storage is optimised for maximal scalability in clustered deployments.</p>
 <div class="section">
 <h2><a name="NodeStore_API"></a>NodeStore API</h2>

Modified: jackrabbit/site/live/oak/docs/nodestore/persistent-cache.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/persistent-cache.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/persistent-cache.html (original)
+++ jackrabbit/site/live/oak/docs/nodestore/persistent-cache.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Persistent Cache</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -241,80 +241,99 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-  --><div class="section">
+  -->
+<div class="section">
 <h2><a name="Persistent_Cache"></a>Persistent Cache</h2>
 <p>The document storage optionally uses the persistent cache. The cache acts like an in-memory cache for old revisions, but in addition to keeping the most recently used nodes in memory, it also stores them to disk. That way, many reads from the storage backend (for example MongoDB) are replaced by reads from the local disk. This is especially useful if reads from the local disk are faster than reads from the storage backend. In addition to that, the persistent cache reduces the load on the storage backend.</p>
 <div class="section">
 <h3><a name="aOSGi_Configuration"></a>&#xa0;OSGi Configuration</h3>
 <p>The default OSGi configuration of the persistent cache is:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService
+<div>
+<div>
+<pre class="source">org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService
     persistentCache=&quot;cache&quot;
 </pre></div></div>
+
 <p>Oak versions up to 1.4 have the persistent cache disabled by default, which is equivalent to a configuration entry set to an empty String. Starting with Oak 1.6, the persistent cache is enabled by default and can be disabled by setting the configuration entry to <tt>&quot;-&quot;</tt>.</p></div>
 <div class="section">
 <h3><a name="Configuration_Options"></a>Configuration Options</h3>
 <p>The persistent cache configuration setting is a string with a number of comma-separated elements. The first element is the directory where the cache is stored. Example:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">&quot;cache&quot;
+<div>
+<div>
+<pre class="source">&quot;cache&quot;
 </pre></div></div>
+
 <p>In this case, the data is stored in the directory &#x201c;cache&#x201d;, relative to the <tt>repository.home</tt> directory. If no repository home directory is configured, the directory is relative to the current working directory. Oak versions prior to 1.6 always resolve to the current working directory and ignore the <tt>repository.home</tt> configuration. By default, there are at most two files (two generations) with the name &#x201c;cache-x.data&#x201d;, where x is an incrementing number (0, 1,&#x2026;). A file is at most 1 GB by default. If the file is larger, the next file is created, and if there are more than two files, the oldest one is removed. If data from the older file is accessed, it is copied to the latest file. That way, data that is not recently read will eventually be removed.</p>
 <p>The following other configuration options are available:</p>
-
 <ul>
-  
+
 <li>
-<p>Size. A file is at most 1 GB by default. To change maximum size of a file, use &#x201c;size=x&#x201d;, where x is the size in MB.</p></li>
-  
+
+<p>Size. A file is at most 1 GB by default. To change maximum size of a file, use &#x201c;size=x&#x201d;, where x is the size in MB.</p>
+</li>
 <li>
-<p>Node caching. By default, nodes at all revisions are cached. To disable this option, use &#x201c;-nodes&#x201d;.</p></li>
-  
+
+<p>Node caching. By default, nodes at all revisions are cached. To disable this option, use &#x201c;-nodes&#x201d;.</p>
+</li>
 <li>
-<p>Children caching. By default, the list of children of a node is cached. To disable this option, use &#x201c;-children&#x201d;.</p></li>
-  
+
+<p>Children caching. By default, the list of children of a node is cached. To disable this option, use &#x201c;-children&#x201d;.</p>
+</li>
 <li>
-<p>Diff caching. By default, the list of differences between two revisions is cached. To disable this option, use &#x201c;-diff&#x201d;.</p></li>
-  
+
+<p>Diff caching. By default, the list of differences between two revisions is cached. To disable this option, use &#x201c;-diff&#x201d;.</p>
+</li>
 <li>
-<p>Compaction. The cache file can be compacted and compressed (at a rate of around 100 MB per second) when it is closed. That way, the disk space is used more efficiently. To enable this option, use &#x201c;+compact&#x201d;. (Please note this feature was enabled by default in versions 1.2.1, 1.0.13, and older.)</p></li>
-  
+
+<p>Compaction. The cache file can be compacted and compressed (at a rate of around 100 MB per second) when it is closed. That way, the disk space is used more efficiently. To enable this option, use &#x201c;+compact&#x201d;. (Please note this feature was enabled by default in versions 1.2.1, 1.0.13, and older.)</p>
+</li>
 <li>
-<p>Compression. By default, the cache is compressed, saving space. To disable this option, use &#x201c;-compress&#x201d;.</p></li>
-  
+
+<p>Compression. By default, the cache is compressed, saving space. To disable this option, use &#x201c;-compress&#x201d;.</p>
+</li>
 <li>
-<p>Binary caching (removed in Oak 1.10). When using the BlobStore, binaries smaller than 1 MB are stored in the persistent cache by default. The maximum size can be changed using the setting &#x201c;binary=x&#x201d;, where x is the size in bytes. To disable the binary cache, use &#x201c;binary=0&#x201d;.</p></li>
+
+<p>Binary caching (removed in Oak 1.10). When using the BlobStore, binaries smaller than 1 MB are stored in the persistent cache by default. The maximum size can be changed using the setting &#x201c;binary=x&#x201d;, where x is the size in bytes. To disable the binary cache, use &#x201c;binary=0&#x201d;.</p>
+</li>
 </ul>
 <p>These settings can be appended to the persistent cache configuration string. An example configuration is:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">&quot;cache,size\=2048,-compact,-compress&quot;
+<div>
+<div>
+<pre class="source">&quot;cache,size\=2048,-compact,-compress&quot;
 </pre></div></div>
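To make the structure of such a configuration string concrete, here is a minimal parsing sketch. It is not Oak's parser: the function name, the returned dictionary and the treatment of the 1 GB default as 1024 MB are assumptions, and the `binary` option (removed in Oak 1.10) is deliberately left out.

```python
# Flags and their defaults as described in the option list above.
FLAGS = {
    "nodes": True,
    "children": True,
    "diff": True,
    "compact": False,
    "compress": True,
}


def parse_persistent_cache(config):
    """Parse a string such as 'cache,size\\=2048,-compact' into settings."""
    # The examples escape '=' as '\='; undo that before splitting on commas.
    elements = [e.strip() for e in config.replace("\\=", "=").split(",")]
    # The first element is the cache directory; 1024 MB stands in for 1 GB.
    settings = {"directory": elements[0], "size_mb": 1024, **FLAGS}
    for element in elements[1:]:
        if element.startswith("size="):
            settings["size_mb"] = int(element[len("size="):])
        elif element[:1] in ("+", "-") and element[1:] in FLAGS:
            # '+name' enables a flag, '-name' disables it.
            settings[element[1:]] = element[0] == "+"
        else:
            raise ValueError(f"unsupported option: {element!r}")
    return settings
```

Applied to the example string above, the sketch yields the directory `cache`, a maximum file size of 2048 MB, and compaction and compression both switched off.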
+
 <p>To disable the persistent cache entirely in Oak 1.6 and newer, use the following configuration:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService
+<div>
+<div>
+<pre class="source">org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService
     persistentCache=&quot;-&quot;
 </pre></div></div>
+
 <p>Up to Oak version 1.4, either omit the persistentCache entry or set it to an empty String to disable the persistent cache.</p></div>
 <div class="section">
 <h3><a name="Journal_cache"></a>Journal cache</h3>
 <p>Since Oak 1.6.</p>
 <p>Diff cache entries can also be stored in a separate persistent cache and configured independently if needed. This can be done in the OSGi configuration as in the following example:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService
+<div>
+<div>
+<pre class="source">org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService
     persistentCache=&quot;cache,size\=2048&quot;
     journalCache=&quot;diff-cache,size\=1024&quot;
 </pre></div></div>
+
 <p>The configuration options are the same as for the <tt>persistentCache</tt>, but options unrelated to the diff cache type are ignored. The default configuration is <tt>journalCache=&quot;diff-cache&quot;</tt> and can be disabled the same way as the regular persistent cache with a dash: <tt>journalCache=&quot;-&quot;</tt>.</p></div>
 <div class="section">
 <h3><a name="aDependencies"></a>&#xa0;Dependencies</h3>
 <p>Internally, the persistent cache uses a key-value store (basically a java.util.Map), which is persisted to disk. The current key-value store backend is the <a class="externalLink" href="http://www.h2database.com/html/mvstore.html">H2 MVStore</a>. This library is only needed if the persistent cache is configured. Version 1.4.185 or newer is needed.</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">&lt;dependency&gt;
+<div>
+<div>
+<pre class="source">&lt;dependency&gt;
     &lt;groupId&gt;com.h2database&lt;/groupId&gt;
     &lt;artifactId&gt;h2-mvstore&lt;/artifactId&gt;
     &lt;version&gt;1.4.185&lt;/version&gt;

Modified: jackrabbit/site/live/oak/docs/nodestore/segment/changes.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/segment/changes.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/segment/changes.html (original)
+++ jackrabbit/site/live/oak/docs/nodestore/segment/changes.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Changes in the data format</title>
     <link rel="stylesheet" href="../../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -240,15 +240,14 @@
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
---><h1>Changes in the data format</h1>
+-->
+<h1>Changes in the data format</h1>
 <p>This document describes the changes in the storage format introduced by the Oak Segment Tar module. The purpose of this document is not only to enumerate such changes, but also to explain the rationale behind them. Pointers to Jira issues are provided for a much more terse description of changes. Changes are presented in chronological order.</p>
 <div class="section">
 <h2><a name="Generation_in_segment_headers"></a>Generation in segment headers</h2>
-
 <ul>
-  
+
 <li>Jira issue: <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3348">OAK-3348</a></li>
-  
 <li>Since: Oak Segment Tar 0.0.2</li>
 </ul>
 <p>The GC algorithm implemented by Oak Segment Tar is based on the fundamental idea of grouping records into generations. When GC is performed, records belonging to older generations can be removed, while records belonging to newer generations have to be retained.</p>
@@ -256,24 +255,20 @@
 <p>The original specification of the data format for the segment header left some space for future extensions. In the new format the generation is saved at offsets 10 to 13 as a 4-byte integer value.</p></div>
 <div class="section">
 <h2><a name="Stable_identifiers"></a>Stable identifiers</h2>
-
 <ul>
-  
+
 <li>Jira issue: <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3348">OAK-3348</a></li>
-  
 <li>Since: Oak Segment Tar 0.0.2</li>
 </ul>
 <p>The fastest way to compare two node records is to compare their addresses. If their addresses are equal, the two node records are guaranteed to be equal. Transitively, given that records are immutable, the subtrees identified by those node records are guaranteed to be equal.</p>
 <p>The situation gets more complicated when the generation-based GC algorithm copies a node record over to a new generation to save it from being deleted. In this situation, two copies of the same node record live in two different generations, in two different segments and at two different addresses. To figure out whether such two node records are equal it is not sufficient to compare their addresses.</p>
-<p>To overcome this problem, a stable identifier has been added to every node record: when a new node record is serialized, the address it is serialized to becomes its stable identifier. The stable identifier is included in the node record and becomes part of its serialized format. When the node record is copied to a new generation and a new segment, its address will inevitably change. The stable identifier instead, being part of the node record itself, will not change. This enables fast comparison between different copies of the same node records by just comparing their stable identifiers. </p>
+<p>To overcome this problem, a stable identifier has been added to every node record: when a new node record is serialized, the address it is serialized to becomes its stable identifier. The stable identifier is included in the node record and becomes part of its serialized format. When the node record is copied to a new generation and a new segment, its address will inevitably change. The stable identifier instead, being part of the node record itself, will not change. This enables fast comparison between different copies of the same node records by just comparing their stable identifiers.</p>
 <p>The stable identifier is serialized as an 18-byte string record. This record, in turn, is referenced from the node record by adding an additional 3-byte reference field to it. In conclusion, stable identifiers add an overhead of 21 bytes to every node record in the worst case. In the best case, the 18-byte string record is shared between node records when possible, so the aforementioned overhead represents an upper limit.</p></div>
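The idea behind stable identifiers can be modelled in a few lines. This is a simplified sketch and not the Oak Segment Tar implementation: the `NodeRecord` class, the string addresses and the helper names are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NodeRecord:
    address: str    # where this copy lives (segment identifier plus offset)
    stable_id: str  # the address the record was first serialized to


def write_new(address):
    """A freshly serialized record uses its own address as stable identifier."""
    return NodeRecord(address=address, stable_id=address)


def copy_to_new_generation(record, new_address):
    """GC copies the record: the address changes, the stable identifier stays."""
    return replace(record, address=new_address)


def same_node(a, b):
    """Compare copies by stable identifier instead of by address."""
    return a.stable_id == b.stable_id


# Worst-case overhead per node record, using the sizes given above.
STABLE_ID_RECORD_BYTES = 18
REFERENCE_FIELD_BYTES = 3
WORST_CASE_OVERHEAD_BYTES = STABLE_ID_RECORD_BYTES + REFERENCE_FIELD_BYTES
```

After a copy the two records have different addresses, yet `same_node` still identifies them as the same node record.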
 <div class="section">
 <h2><a name="Binary_references_index"></a>Binary references index</h2>
-
 <ul>
-  
+
 <li>Jira issue: <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-4201">OAK-4201</a></li>
-  
 <li>Since: Oak Segment Tar 0.0.4</li>
 </ul>
 <p>The original data format in Oak Segment mandates that every segment maintains a list of references to external binaries. Every time a record references an external binary - i.e. a piece of binary data that is stored in a Blob Store - a new binary reference is added to its segment. The list of references to external binaries is inspected periodically by the Blob Store GC algorithm to determine which binaries are currently in use. The Blob Store GC algorithm removes every binary that is not reported as used by the Segment Store.</p>
@@ -281,50 +276,42 @@
 <p>To make this process faster and ease the pressure on I/O, Oak Segment Tar introduces an index of references to external binaries in every TAR file. This index aggregates the required information from every segment contained in a TAR file. When Blob Store GC is performed, instead of reading and parsing every segment, it can read and parse the index files. This optimization reduces the amount of I/O operations significantly.</p></div>
 <div class="section">
 <h2><a name="Simplified_segment_and_record_format"></a>Simplified segment and record format</h2>
-
 <ul>
-  
+
 <li>Jira issue: <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-4631">OAK-4631</a></li>
-  
 <li>Since: Oak Segment Tar 0.0.10</li>
 </ul>
 <p>The former data format limited the number of references to other segments a segment could have. This limitation caused sub-optimal segment space utilization when a record referencing data from many different segments was written. In this case records quickly exhausted the hard limit on the number of references to other segments, causing a premature flush of a non-full segment.</p>
 <p>Oak Segment Tar relaxed the limit on the number of segments to the point that it can now be considered irrelevant. This avoids the problem of suboptimal segment space utilization. Tests show that with this change in place it is possible to store the same amount of data in a smaller number of better utilized segments.</p>
-<p>The Jira issue referenced in this paragraph proposes other changes other than the one discussed here. Most of the changes proposed by the issue were subsequently reverted or never made in the code base because of their high toll on disk space. The comments on the issue and the referenced email thread provide a more detailed insight into the various trade-offs and considerations. </p></div>
+<p>The Jira issue referenced in this paragraph proposes changes other than the one discussed here. Most of the changes proposed by the issue were subsequently reverted or never made in the code base because of their high toll on disk space. The comments on the issue and the referenced email thread provide a more detailed insight into the various trade-offs and considerations.</p></div>
 <div class="section">
 <h2><a name="Storage_format_versioning"></a>Storage format versioning</h2>
-
 <ul>
-  
+
 <li>Jira issue: <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-4295">OAK-4295</a></li>
-  
 <li>Since: Oak Segment Tar 0.0.10</li>
 </ul>
 <p>To prevent the (old) Oak Segment and the (new) Oak Segment Tar from stepping on each other&#x2019;s toes, an improved versioning mechanism for the data format was introduced.</p>
-<p>First of all, the version field in the segment header has been incremented from 11 in Oak Segment to 12 in Oak Segment Tar. This prevents Oak Segment Tar from accessing segments written by older implementations and Oak Segment accessing segments written by newer implementations. </p>
+<p>First of all, the version field in the segment header has been incremented from 11 in Oak Segment to 12 in Oak Segment Tar. This prevents Oak Segment Tar from accessing segments written by older implementations and Oak Segment accessing segments written by newer implementations.</p>
 <p>This strategy has been further improved by adding a manifest file in every data folder created by Oak Segment Tar. The manifest file is supposed to be a source of metadata for the whole repository. Oak Segment Tar checks for the presence of a manifest file every time a data folder is opened. If a manifest file is there, the metadata has to be compatible with the version of the currently executing code.</p>
 <p>Repositories written by Oak Segment do not generate a manifest file while those written by Oak Segment Tar do. This difference enables a fail-fast approach: when Oak Segment opens a data folder containing a manifest, it immediately fails complaining that the data format is too new. When Oak Segment Tar opens a non-empty data folder without a manifest, it immediately fails complaining that the data format is too old.</p></div>
 <div class="section">
 <h2><a name="Logic_record_IDs"></a>Logic record IDs</h2>
-
 <ul>
-  
+
 <li>Jira issue: <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-4659">OAK-4659</a></li>
-  
 <li>Since: Oak Segment Tar 0.0.14</li>
 </ul>
 <p>In the previous implementation (Oak Segment) the position of a record in its segment is fixed. Once written, its address consists of the identifier of its segment followed by its offset within the segment. The offset is the effective position of the record in the segment.</p>
-<p>This way of addressing records implies that a record can&#x2019;t be moved within a segment without changing its address. Moving a record means changing its segment, its position or both, and results in all references to it being broken. </p>
+<p>This way of addressing records implies that a record can&#x2019;t be moved within a segment without changing its address. Moving a record means changing its segment, its position or both, and results in all references to it being broken.</p>
 <p>To gain more flexibility for storing records, a new level of indirection was introduced, replacing offsets with logic identifiers. Instead of referencing a record by a segment identifier and its offset in the segment, a segment identifier and a record number are used. The record number is a logic address for a record in the segment and is local to the segment.</p>
 <p>With this solution a record can be moved within its segment without breaking references to it. This change enables a number of different algorithms when it comes to garbage collection. For example, some records can now be removed from a segment and the segment can be shrunk by moving the remaining records next to each other. This operation would change the positions of the remaining records in the segment, but not their logic record identifiers.</p>
 <p>This change introduced a new translation table in the segment header to map record numbers to record offsets. The table occupies 9 bytes per record (4 bytes for the record number, 1 byte for the record type and 4 bytes for the record offset). Moreover, a new 4-byte integer field has been added to the segment header, containing the number of entries in the translation table.</p></div>
 <div class="section">
 <h2><a name="Root_record_types"></a>Root record types</h2>
-
 <ul>
-  
+
 <li>Jira issue: <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-2498">OAK-2498</a></li>
-  
 <li>Since: Oak Segment Tar 0.0.16</li>
 </ul>
 <p>The record number translation table mentioned in the previous paragraph contains a 1-byte field for every record. This field determines the type of the record referenced by that row of the table. The change in this paragraph is about improving the information stored in the type field of the record number translation table.</p>

Modified: jackrabbit/site/live/oak/docs/nodestore/segment/classes.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/segment/classes.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/segment/classes.html (original)
+++ jackrabbit/site/live/oak/docs/nodestore/segment/classes.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Design of Oak Segment Tar</title>
     <link rel="stylesheet" href="../../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -240,15 +240,16 @@
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
---><h1>Design of Oak Segment Tar</h1>
-<p>This section gives a high-level overview of the design of Oak Segment Tar, its most important classes, their purpose and relationship. More in-depth information is available from the Javadoc of the individual classes. </p>
+-->
+<h1>Design of Oak Segment Tar</h1>
+<p>This section gives a high-level overview of the design of Oak Segment Tar, its most important classes, their purpose and relationship. More in-depth information is available from the Javadoc of the individual classes.</p>
 <div class="section">
 <h2><a name="Overview"></a>Overview</h2>
 <p><img src="classes.png" alt="Class diagram" /></p>
-<p>The <tt>SegmentNodeStore</tt> is Oak Segment Tar&#x2019;s implementation of the <a href="../overview.html">NodeStore API</a>. It uses a <tt>Revisions</tt> instance for accessing and setting the current head state, a <tt>SegmentReader</tt> for reading records from segments, a <tt>SegmentWriter</tt> for writing records to segments and a <tt>BlobStore</tt> for reading and writing binaries. </p>
-<p>The <tt>SegmentStore</tt> serves as a persistence backend for the <tt>SegmentNodeStore</tt>. It is responsible for providing concrete implementations of <tt>Revisions</tt>, <tt>SegmentReader</tt> and <tt>BlobStore</tt> to the former. </p>
+<p>The <tt>SegmentNodeStore</tt> is Oak Segment Tar&#x2019;s implementation of the <a href="../overview.html">NodeStore API</a>. It uses a <tt>Revisions</tt> instance for accessing and setting the current head state, a <tt>SegmentReader</tt> for reading records from segments, a <tt>SegmentWriter</tt> for writing records to segments and a <tt>BlobStore</tt> for reading and writing binaries.</p>
+<p>The <tt>SegmentStore</tt> serves as a persistence backend for the <tt>SegmentNodeStore</tt>. It is responsible for providing concrete implementations of <tt>Revisions</tt>, <tt>SegmentReader</tt> and <tt>BlobStore</tt> to the former.</p>
 <p>The <tt>FileStore</tt> is the implementation of the <tt>SegmentStore</tt> that persists segments in tar files. The <tt>MemoryStore</tt> (not shown above) is an alternative implementation, which stores the segments in memory only. It is used for testing.</p>
-<p>The <tt>FileStore</tt> depends on <tt>TarFiles</tt> for the management of the TAR files on the file system. <tt>TarFiles</tt> is an aggregation of one <tt>TarWriter</tt> and zero or more <tt>TarReader</tt>. This design represents the foundation of the append-only store implemented by the <tt>FileStore</tt>, where data is appended to one <tt>TarWriter</tt> and archived in many <tt>TarReader</tt> over time. </p></div>
+<p>The <tt>FileStore</tt> depends on <tt>TarFiles</tt> for the management of the TAR files on the file system. <tt>TarFiles</tt> is an aggregation of one <tt>TarWriter</tt> and zero or more <tt>TarReader</tt>. This design represents the foundation of the append-only store implemented by the <tt>FileStore</tt>, where data is appended to one <tt>TarWriter</tt> and archived in many <tt>TarReader</tt> over time.</p></div>
         </div>
       </div>
     </div>


