jackrabbit-commits mailing list archives

From thom...@apache.org
Subject svn commit: r1858009 [1/2] - in /jackrabbit/site/live/oak/docs: ./ nodestore/document/ nodestore/segment/ query/ security/accesscontrol/ security/authorization/
Date Tue, 23 Apr 2019 13:50:48 GMT
Author: thomasm
Date: Tue Apr 23 13:50:48 2019
New Revision: 1858009

URL: http://svn.apache.org/viewvc?rev=1858009&view=rev
Log:
OAK-936: Site checkin for project Oak Documentation-1.14-SNAPSHOT

Added:
    jackrabbit/site/live/oak/docs/nodestore/segment/onrc-memoirs.html
Modified:
    jackrabbit/site/live/oak/docs/diagnostic-builds.html
    jackrabbit/site/live/oak/docs/nodestore/document/rdb-document-store.html
    jackrabbit/site/live/oak/docs/nodestore/segment/overview.html
    jackrabbit/site/live/oak/docs/query/lucene.html
    jackrabbit/site/live/oak/docs/release-schedule.html
    jackrabbit/site/live/oak/docs/security/accesscontrol/default.html
    jackrabbit/site/live/oak/docs/security/authorization/cug.html

Modified: jackrabbit/site/live/oak/docs/diagnostic-builds.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/diagnostic-builds.html?rev=1858009&r1=1858008&r2=1858009&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/diagnostic-builds.html (original)
+++ jackrabbit/site/live/oak/docs/diagnostic-builds.html Tue Apr 23 13:50:48 2019
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2019-01-08 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2019-04-23 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20190108" />
+    <meta name="Date-Revision-yyyymmdd" content="20190423" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Cutting diagnostic builds</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.6.min.css" />
@@ -142,9 +142,9 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2019-01-08<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2019-04-23<span class="divider">|</span>
 </li>
-          <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
+          <li id="projectVersion">Version: 1.14-SNAPSHOT</li>
         </ul>
       </div>
       <div class="row-fluid">
@@ -169,6 +169,7 @@
     <li><a href="nodestore/documentmk.html" title="Document NodeStore"><span class="icon-chevron-down"></span>Document NodeStore</a>
       <ul class="nav nav-list">
     <li><a href="nodestore/document/mongo-document-store.html" title="MongoDB DocumentStore"><span class="none"></span>MongoDB DocumentStore</a>  </li>
+    <li><a href="nodestore/document/rdb-document-store.html" title="RDB DocumentStore"><span class="none"></span>RDB DocumentStore</a>  </li>
     <li><a href="nodestore/document/node-bundling.html" title="Node Bundling"><span class="none"></span>Node Bundling</a>  </li>
     <li><a href="nodestore/document/secondary-store.html" title="Secondary Store"><span class="none"></span>Secondary Store</a>  </li>
     <li><a href="nodestore/persistent-cache.html" title="Persistent Cache"><span class="none"></span>Persistent Cache</a>  </li>
@@ -256,10 +257,46 @@
 <h1>Cutting diagnostic builds</h1>
 <p>Cutting a diagnostic build is the process of delivering one or more Oak bundles, say <tt>oak-core</tt>, into a specific environment in order to assess whether they actually solve the issue at hand.</p>
 <p>The aim is to eventually produce a bundle named, for example, <tt>oak-core-1.0.22-R2707077</tt>.</p>
-<p>Let&#x2019;s see it through an example.</p>
+<p>Let&#x2019;s see it through examples. We&#x2019;ll consider the case for <b>Branches</b> and <b>Trunk</b>.</p>
+<div class="section">
+<h2><a name="Trunk"></a>Trunk</h2>
+<p>We want to produce a diagnostic build of <tt>oak-core</tt> for what will become Oak <b>1.16.0</b>. This means the <tt>pom.xml</tt> currently declares <tt>&lt;version&gt;1.16-SNAPSHOT&lt;/version&gt;</tt>.</p>
+<div class="section">
+<h3><a name="What_version_shall_I_use"></a>What version shall I use?</h3>
+<p>Open the svn directory where trunk is and issue a</p>
+
+<div>
+<div>
+<pre class="source">$ svn up
+$ svn info
+</pre></div></div>
+
+<p>you will see something like</p>
+
+<div>
+<div>
+<pre class="source">Working Copy Root Path: /apache/oak-svn-1.0
+URL: https://svn.apache.org/repos/asf/jackrabbit/oak/branches/1.0
+Repository Root: https://svn.apache.org/repos/asf
+Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
+Revision: 1708581
+Node Kind: directory
+Schedule: normal
+Last Changed Author: chetanm
+Last Changed Rev: 1708547
+Last Changed Date: 2015-10-14 06:56:40 +0100 (Wed, 14 Oct 2015)
+</pre></div></div>
+
+<p>What you&#x2019;re interested in is the revision number. In our case: <tt>1708581</tt>.</p>
+<p>This means you&#x2019;ll produce a bundle with a version of <tt>1.15-R2708581</tt>.</p>
+<p><b>Note that the produced version is lower than the official release you&#x2019;re working on: 1.15 vs 1.16.0.</b></p>
+<p><b>Make sure to use &#x2018;-R&#x2019; (uppercase) rather than &#x2018;-r&#x2019; (lowercase), because the uppercase form sorts lower than &#x2018;-SNAPSHOT&#x2019;. Doing otherwise will cause trouble when you try to apply a &#x2018;-SNAPSHOT&#x2019; version on top of the internal build.</b></p>
+<p>If you&#x2019;re in doubt about how OSGi or Maven will order two versions, have a look at the <a class="externalLink" href="http://versionatorr.appspot.com/">Versionatorr App</a>. Your diagnostic build must <b>always be lower than</b> the Oak version in which your fix is going to be released.</p></div></div>
+<div class="section">
+<h2><a name="Branches"></a>Branches</h2>
 <p>We want to produce a diagnostic build of <tt>oak-core</tt> for what will become Oak <b>1.0.23</b>. This means the <tt>pom.xml</tt> currently declares <tt>&lt;version&gt;1.0.23-SNAPSHOT&lt;/version&gt;</tt>.</p>
 <div class="section">
-<h2><a name="What_version_shall_I_use"></a>What version shall I use?</h2>
+<h3><a name="What_version_shall_I_use"></a>What version shall I use?</h3>
 <p>Open the svn directory where the 1.0 branch is and issue a</p>
 
 <div>
@@ -288,10 +325,12 @@ Last Changed Date: 2015-10-14 06:56:40 +
 <p>This means you&#x2019;ll produce a bundle with a version of <tt>1.0.22-R2708581</tt>.</p>
 <p><b>Note that the produced version is lower than the official release you&#x2019;re working on: 1.0.22 vs 1.0.23.</b></p>
 <p><b>Make sure to use &#x2018;-R&#x2019; (uppercase) rather than &#x2018;-r&#x2019; (lowercase), because the uppercase form sorts lower than &#x2018;-SNAPSHOT&#x2019;. Doing otherwise will cause trouble when you try to apply a &#x2018;-SNAPSHOT&#x2019; version on top of the internal build.</b></p>
-<p>If you&#x2019;re in doubt about what versioning and how OSGi or Maven will behave have a look at the <a class="externalLink" href="http://versionatorr.appspot.com/">Versionatorr App</a>. You want your diagnostic build to be <b>always less than</b> the oak version where your fix is going to be released.</p></div>
+<p>If you&#x2019;re in doubt about how OSGi or Maven will order two versions, have a look at the <a class="externalLink" href="http://versionatorr.appspot.com/">Versionatorr App</a>. Your diagnostic build must <b>always be lower than</b> the Oak version in which your fix is going to be released.</p></div></div>
+<div class="section">
+<h2><a name="Both_Branches_and_Trunk_same_process"></a>Both Branches and Trunk (same process)</h2>
 <div class="section">
-<h2><a name="Changing_the_version_in_all_the_poms."></a>Changing the version in all the poms.</h2>
-<p>Now that you know you want to produce <tt>1.0.22-R2708581</tt> you have to change all the poms accordingly.</p>
+<h3><a name="Changing_the_version_in_all_the_poms."></a>Changing the version in all the poms.</h3>
+<p>From the examples above, you want to produce either <tt>1.0.22-R2708581</tt> or <tt>1.15-R2708581</tt>. For the sake of simplicity we&#x2019;ll detail only the <tt>1.0.22-R2708581</tt> case; for <tt>1.15-R2708581</tt> you simply have to change the version.</p>
 <p>Go into <tt>oak-parent</tt> and issue the following maven command.</p>
 
 <div>
@@ -307,7 +346,7 @@ Last Changed Date: 2015-10-14 06:56:40 +
 </pre></div></div>
 </div>
 <div class="section">
-<h2><a name="Building_the_release"></a>Building the release</h2>
+<h3><a name="Building_the_release"></a>Building the release</h3>
 <p>Now you can build the release as usual</p>
 
 <div>
@@ -317,13 +356,13 @@ Last Changed Date: 2015-10-14 06:56:40 +
 
 <p>and you&#x2019;ll have a full oak build with the version <tt>1.0.22-R2708581</tt>. Go into <tt>oak-core/target</tt> and take the produced jar.</p></div>
 <div class="section">
-<h2><a name="Re-setting_the_svn_branch"></a>Re-setting the svn branch</h2>
+<h3><a name="Re-setting_the_svn_branch"></a>Re-setting the svn branch</h3>
 <p>You don&#x2019;t want to commit the changes back to svn, so reset the branch to its original state:</p>
 
 <div>
 <div>
 <pre class="source">jackrabbit-oak$ mvn versions:revert
-</pre></div></div></div>
+</pre></div></div></div></div>
         </div>
       </div>
     </div>
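
The versioning rule shown in the diagnostic-builds.html change above can be sketched as a small shell helper. It takes the current SNAPSHOT version, steps the last numeric component down by one, and appends '-R' plus the svn revision. This is only an illustration under stated assumptions: the function name diagnostic_version is hypothetical and not part of the Oak build, the "decrement by one" rule is inferred from the two examples in the page (1.0.23-SNAPSHOT to 1.0.22, 1.16-SNAPSHOT to 1.15), and the revision 1234567 is made up.

```shell
#!/bin/sh
# Hypothetical helper (not part of Oak): derive a diagnostic build version
# from a SNAPSHOT version and an svn revision number.
diagnostic_version() {
    snapshot="$1"                  # e.g. 1.0.23-SNAPSHOT
    revision="$2"                  # e.g. 1234567 (made-up value)
    base="${snapshot%-SNAPSHOT}"   # strip the qualifier, e.g. 1.0.23
    prefix="${base%.*}"            # everything before the last dot, e.g. 1.0
    last="${base##*.}"             # last numeric component, e.g. 23
    if [ "$last" -le 0 ]; then
        # Nothing lower to step down to; the version must be chosen by hand.
        echo "cannot decrement last component of $snapshot" >&2
        return 1
    fi
    # Uppercase -R, as the page requires, so the result stays below -SNAPSHOT.
    echo "${prefix}.$((last - 1))-R${revision}"
}

diagnostic_version 1.0.23-SNAPSHOT 1234567
diagnostic_version 1.16-SNAPSHOT 1234567
```

With Subversion 1.9 or later the revision can be read directly from the working copy, for example: diagnostic_version 1.0.23-SNAPSHOT "$(svn info --show-item revision)". Treat the result as a suggestion and check it against the rule that the diagnostic build must stay lower than the release carrying the fix.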

Modified: jackrabbit/site/live/oak/docs/nodestore/document/rdb-document-store.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/document/rdb-document-store.html?rev=1858009&r1=1858008&r2=1858009&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/document/rdb-document-store.html (original)
+++ jackrabbit/site/live/oak/docs/nodestore/document/rdb-document-store.html Tue Apr 23 13:50:48 2019
@@ -144,7 +144,7 @@
         <ul class="breadcrumb">
         <li id="publishDate">Last Published: 2019-04-23<span class="divider">|</span>
 </li>
-          <li id="projectVersion">Version: 1.12-SNAPSHOT</li>
+          <li id="projectVersion">Version: 1.14-SNAPSHOT</li>
         </ul>
       </div>
       <div class="row-fluid">
@@ -284,16 +284,94 @@
 </pre></div></div>
 </div>
 <div class="section">
-<h2><a name="Database_Creation"></a>Database Creation</h2>
+<h2><a name="Database_Creation"></a><a name="database-creation"></a> Database Creation</h2>
 <p><tt>RDBDocumentStore</tt> relies on JDBC, and thus, in general, can not create database instances (that said, certain DBs such as Apache Derby or H2DB can create the database automatically when it&#x2019;s not there yet - consult the DB documentation in general and the JDBC URL syntax specifically).</p>
 <p>So in general, the administrator will have to take care of creating the database. There are only a few requirements for the database, but these are critical for the correct operation:</p>
 <ul>
 
 <li>character fields must be able to store any Unicode code point - UTF-8 encoding is recommended</li>
 <li>the collation for character fields needs to sort by Unicode code points</li>
-</ul></div>
+<li>BLOBs need to support sizes of ~16MB</li>
+</ul>
+<p>The subsections below give examples that have been found to work during the development of <tt>RDBDocumentStore</tt>.</p>
+<div class="section">
+<h3><a name="DB2"></a><a name="database-creation-db2"></a> DB2</h3>
+<p>Creating a database called <tt>OAK</tt>:</p>
+
+<div>
+<div>
+<pre class="source">create database oak USING CODESET UTF-8 TERRITORY DEFAULT COLLATE USING IDENTITY;
+</pre></div></div>
+
+<p>To verify, check the INFO level log message written by <tt>RDBDocumentStore</tt> upon startup. For example:</p>
+
+<div>
+<div>
+<pre class="source">14:47:20.332 INFO  [main] RDBDocumentStore.java:1065        RDBDocumentStore (SNAPSHOT) instantiated for database DB2/NT64 SQL11014 (11.1), using driver: IBM Data Server Driver for JDBC and SQLJ 4.19.77 (4.19), connecting to: jdbc:db2://localhost:50276/OAK, properties: {DB2ADMIN.CODEPAGE=1208, DB2ADMIN.COLLATIONSCHEMA=SYSIBM, DB2ADMIN.COLLATIONNAME=IDENTITY}, transaction isolation level: TRANSACTION_READ_COMMITTED (2), DB2ADMIN.NODES: ID VARCHAR(512), MODIFIED BIGINT, HASBINARY SMALLINT, DELETEDONCE SMALLINT, MODCOUNT BIGINT, CMODCOUNT BIGINT, DSIZE BIGINT, VERSION SMALLINT, SDTYPE SMALLINT, SDMAXREVTIME BIGINT, DATA VARCHAR(16384), BDATA BLOB(1073741824) /* {BIGINT=-5, BLOB=2004, SMALLINT=5, VARCHAR=12} */ /* index DB2ADMIN.NODES_MOD on DB2ADMIN.NODES (MODIFIED ASC) other (#0, p0), unique index DB2ADMIN.NODES_PK on DB2ADMIN.NODES (ID ASC) clustered (#0, p0), index DB2ADMIN.NODES_SDM on DB2ADMIN.NODES (SDMAXREVTIME ASC) other (#0, p0), index DB2ADMIN.NODES_SDT on DB2ADMIN.NODES (SDTYPE ASC) other (#0, p0), index DB2ADMIN.NODES_VSN on DB2ADMIN.NODES (VERSION ASC) other (#0, p0) */
+</pre></div></div>
+</div>
+<div class="section">
+<h3><a name="MySQL"></a><a name="database-creation-mysql"></a> MySQL</h3>
+<p>Creating a database called <tt>OAK</tt>:</p>
+
+<div>
+<div>
+<pre class="source">create database oak DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
+</pre></div></div>
+
+<p>Also make sure to configure the <tt>max_allowed_packet</tt> parameter for the server (mysqld) to a value greater than 4M (such as 8388608).</p>
+<p>To verify, check the INFO level log message written by <tt>RDBDocumentStore</tt> upon startup. For example:</p>
+
+<div>
+<div>
+<pre class="source">13:40:46.637 INFO  [main] RDBDocumentStore.java:1065        RDBDocumentStore (SNAPSHOT) instantiated for database MySQL 8.0.15 (8.0), using driver: MySQL Connector/J mysql-connector-java-8.0.15 (Revision: 79a4336f140499bd22dd07f02b708e163844e3d5) (8.0), connecting to: jdbc:mysql://localhost:3306/oak?serverTimezone=UTC, properties: {character_set_database=utf8mb4, character_set_client=utf8mb4, character_set_connection=utf8mb4, character_set_results=, max_allowed_packet=8388608, collation_database=utf8mb4_unicode_ci, character_set_system=utf8, collation_server=utf8mb4_0900_ai_ci, collation=utf8mb4_unicode_ci, character_set_filesystem=binary, character_set_server=utf8mb4, collation_connection=utf8mb4_0900_ai_ci}, transaction isolation level: TRANSACTION_REPEATABLE_READ (4), .nodes: ID VARBINARY(512), MODIFIED BIGINT(20), HASBINARY SMALLINT(6), DELETEDONCE SMALLINT(6), MODCOUNT BIGINT(20), CMODCOUNT BIGINT(20), DSIZE BIGINT(20), VERSION SMALLINT(6), SDTYPE SMALLINT(6), SDMAXREVTIME BIGINT(20), DATA VARCHAR(16000), BDATA LONGBLOB(2147483647) /* {BIGINT=-5, LONGBLOB=-4, SMALLINT=5, VARBINARY=-3, VARCHAR=12} */ /* unique index oak.PRIMARY on nodes (ID ASC) other (#0, p0), index oak.NODES_MOD on nodes (MODIFIED ASC) other (#0, p0), index oak.NODES_SDM on nodes (SDMAXREVTIME ASC) other (#0, p0), index oak.NODES_SDT on nodes (SDTYPE ASC) other (#0, p0), index oak.NODES_VSN on nodes (VERSION ASC) other (#0, p0) */
+</pre></div></div>
+</div>
+<div class="section">
+<h3><a name="Oracle"></a><a name="database-creation-oracle"></a> Oracle</h3>
+<p>Creating a database called <tt>OAK</tt>:</p>
+<p>(to be done)</p>
+<p>To verify, check the INFO level log message written by <tt>RDBDocumentStore</tt> upon startup. For example:</p>
+
+<div>
+<div>
+<pre class="source">13:26:37.073 INFO  [main] RDBDocumentStore.java:1067        RDBDocumentStore (SNAPSHOT) instantiated for database Oracle Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production (12.2), using driver: Oracle JDBC driver 12.2.0.1.0 (12.2), connecting to: jdbc:oracle:thin:@localhost:1521:orcl, properties: {NLS_CHARACTERSET=AL32UTF8, NLS_COMP=BINARY}, transaction isolation level: TRANSACTION_READ_COMMITTED (2), .: ID VARCHAR2(512), MODIFIED NUMBER, HASBINARY NUMBER, DELETEDONCE NUMBER, MODCOUNT NUMBER, CMODCOUNT NUMBER, DSIZE NUMBER, VERSION NUMBER, SDTYPE NUMBER, SDMAXREVTIME NUMBER, DATA VARCHAR2(4000), BDATA BLOB(-1) /* {BLOB=2004, NUMBER=2, VARCHAR2=12} */ /* index NODES_MOD on SYSTEM.NODES (MODIFIED) clustered (#0, p0), index NODES_SDM on SYSTEM.NODES (SDMAXREVTIME) clustered (#0, p0), index NODES_SDT on SYSTEM.NODES (SDTYPE) clustered (#0, p0), index NODES_VSN on SYSTEM.NODES (VERSION) clustered (#0, p0), unique index SYS_C008093 on SYSTEM.NODES (ID) clustered (#0, p0) */
+</pre></div></div>
+</div>
+<div class="section">
+<h3><a name="PostgreSQL"></a><a name="database-creation-postgresql"></a> PostgreSQL</h3>
+<p>Creating a database called <tt>OAK</tt>:</p>
+
+<div>
+<div>
+<pre class="source">CREATE DATABASE &quot;oak&quot; TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 'C' LC_CTYPE = 'C';
+</pre></div></div>
+
+<p>To verify, check the INFO level log message written by <tt>RDBDocumentStore</tt> upon startup. For example:</p>
+
+<div>
+<div>
+<pre class="source">16:26:28.172 INFO  [main] RDBDocumentStore.java:1065        RDBDocumentStore (SNAPSHOT) instantiated for database PostgreSQL 10.6 (10.6), using driver: PostgreSQL JDBC Driver 42.2.5 (42.2), connecting to: jdbc:postgresql:oak, properties: {datcollate=C, pg_encoding_to_char(encoding)=UTF8}, transaction isolation level: TRANSACTION_READ_COMMITTED (2), .nodes: id varchar(512), modified int8, hasbinary int2, deletedonce int2, modcount int8, cmodcount int8, dsize int8, version int2, sdtype int2, sdmaxrevtime int8, data varchar(16384), bdata bytea(2147483647) /* {bytea=-2, int2=5, int8=-5, varchar=12} */ /* index nodes_mod on public.nodes (modified ASC) other (#0, p1), unique index nodes_pkey on public.nodes (id ASC) other (#0, p1), index nodes_sdm on public.nodes (sdmaxrevtime ASC) other (#0, p1), index nodes_sdt on public.nodes (sdtype ASC) other (#0, p1), index nodes_vsn on public.nodes (version ASC) other (#0, p1) */
+</pre></div></div>
+</div>
+<div class="section">
+<h3><a name="SQL_Server"></a><a name="database-creation-sqlserver"></a> SQL Server</h3>
+<p>Creating a database called <tt>OAK</tt>:</p>
+
+<div>
+<div>
+<pre class="source">create database OAK;
+</pre></div></div>
+
+<p>To verify, check the INFO level log message written by <tt>RDBDocumentStore</tt> upon startup. For example:</p>
+
+<div>
+<div>
+<pre class="source">16:59:12.726 INFO  [main] RDBDocumentStore.java:1067        RDBDocumentStore (SNAPSHOT) instantiated for database Microsoft SQL Server 13.00.5081 (13.0), using driver: Microsoft JDBC Driver 7.2 for SQL Server 7.2.1.0 (7.2), connecting to: jdbc:sqlserver://localhost:1433;useBulkCopyForBatchInsert=false;cancelQueryTimeout=-1;sslProtocol=TLS;jaasConfigurationName=SQLJDBCDriver;statementPoolingCacheSize=0;serverPreparedStatementDiscardThreshold=10;enablePrepareOnFirstPreparedStatementCall=false;fips=false;socketTimeout=0;authentication=NotSpecified;authenticationScheme=nativeAuthentication;xopenStates=false;sendTimeAsDatetime=true;trustStoreType=JKS;trustServerCertificate=false;TransparentNetworkIPResolution=true;serverNameAsACE=false;sendStringParametersAsUnicode=true;selectMethod=direct;responseBuffering=adaptive;queryTimeout=-1;packetSize=8000;multiSubnetFailover=false;loginTimeout=15;lockTimeout=-1;lastUpdateCount=true;encrypt=false;disableStatementPooling=true;databaseName=OAK;columnEncryptionSetting=Disabled;applicationName=Microsoft JDBC Driver for SQL Server;applicationIntent=readwrite;, properties: {collation_name=Latin1_General_CI_AS}, transaction isolation level: TRANSACTION_READ_COMMITTED (2), .: ID varbinary(512), MODIFIED bigint, HASBINARY smallint, DELETEDONCE smallint, MODCOUNT bigint, CMODCOUNT bigint, DSIZE bigint, VERSION smallint, SDTYPE smallint, SDMAXREVTIME bigint, DATA nvarchar(4000), BDATA varbinary(2147483647) /* {bigint=-5, nvarchar=-9, smallint=5, varbinary=-3} */ /* index NODES.NODES_MOD on dbo.NODES (MODIFIED ASC) other (#0, p0), unique index NODES.NODES_PK on dbo.NODES (ID ASC) clustered (#0, p0), index NODES.NODES_SDM on dbo.NODES (SDMAXREVTIME ASC) other (#0, p0), index NODES.NODES_SDT on dbo.NODES (SDTYPE ASC) other (#0, p0), index NODES.NODES_VSN on dbo.NODES (VERSION ASC) other (#0, p0) */
+</pre></div></div>
+</div></div>
 <div class="section">
-<h2><a name="Table_Creation"></a>Table Creation</h2>
+<h2><a name="Table_Creation"></a><a name="table-creation"></a> Table Creation</h2>
 <p>The implementation will try to create all tables and indices when they are not present yet. Of course this requires that the configured database user actually has permission to do so. Example from system log:</p>
 
 <div>
@@ -303,9 +381,9 @@
 </pre></div></div>
 
 <p>If it does not, the system will not start up and provide diagnostics in the log file.</p>
-<p>Administrators who want to create tables upfront can do so. The DDL statements for the supported databases can be dumped using <a href="/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/document/rdb/RDBHelper.html">RDBHelper</a>.</p></div>
+<p>Administrators who want to create tables upfront can do so. The DDL statements for the supported databases can be dumped using <a href="/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/document/rdb/RDBHelper.html">RDBHelper</a> or, more recently, using <tt>oak-run rdbddldump</tt> (see <a href="#rdbddldump">below</a>).</p></div>
 <div class="section">
-<h2><a name="Upgrade_from_earlier_versions"></a>Upgrade from earlier versions</h2>
+<h2><a name="Upgrade_from_earlier_versions"></a><a name="upgrade"></a> Upgrade from earlier versions</h2>
 <p>As of Oak 1.8, the database layout has been slightly extended (see <a href="/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/document/rdb/RDBDocumentStore.html#apidocs.versioning">API docs for RDBDocumentStore</a> for details).</p>
 <p>Upon startup on an &#x201c;old&#x201d; database instance, <tt>RDBDocumentStore</tt> will try to upgrade the tables. Example (for <tt>NODES</tt>):</p>
 
@@ -332,6 +410,88 @@
 </pre></div></div>
 
 <p>The upgrade can then be done at a later point of time by executing the required DDL statements.</p></div>
+<div class="section">
+<h2><a name="oak-run_rdbddldump"></a><a name="rdbddldump"></a> oak-run rdbddldump</h2>
+<p><tt>@since Oak 1.8.12</tt> <tt>@since Oak 1.10.1</tt> <tt>@since Oak 1.12</tt></p>
+<p>The <tt>rdbddldump</tt> command prints out the DDL statements that Oak would use to create or update a database. It can be used to create the tables upfront, or to obtain the DDL statements needed to upgrade to a newer schema version.</p>
+<p>By default, it will print out the DDL statements for all supported databases, with a target of the latest schema version.</p>
+<p>The <tt>--db</tt> switch can be used to specify the database type (note that precise spelling is needed, otherwise the code will fall back to a generic database type).</p>
+<p>The <tt>--initial</tt> switch selects the initial database schema (and defaults to the most recent one).</p>
+<p>The <tt>--upgrade</tt> switch selects the target database schema (and defaults to the most recent one).</p>
+<p>Selecting an &#x201c;upgrade&#x201d; version higher than the &#x201c;initial&#x201d; version causes the tool to emit separate DDL statements for the initial table schema (which may already be there), followed by individual statements for the upgrade to the target schema.</p>
+<p>For instance:</p>
+
+<div>
+<div>
+<pre class="source">java -jar oak-run-*.jar rdbddldump --db DB2 --initial 0 --upgrade 2
+</pre></div></div>
+
+<p>will dump statements for DB2, initially creating schema version 0 tables, and then include DDL statements to upgrade to version 2 (the latter would be applicable if an installation needed to be upgraded from an Oak version older than 1.8 to 1.8 or newer).</p>
+
+<div>
+<div>
+<pre class="source">-- DB2
+
+  -- creating table CLUSTERNODES for schema version 0
+  create table CLUSTERNODES (ID varchar(512) not null, MODIFIED bigint, HASBINARY smallint, DELETEDONCE smallint, MODCOUNT bigint, CMODCOUNT bigint, DSIZE bigint, DATA varchar(16384), BDATA blob(1073741824))
+  create unique index CLUSTERNODES_pk on CLUSTERNODES ( ID ) cluster
+  alter table CLUSTERNODES add constraint CLUSTERNODES_pk primary key ( ID )
+  create index CLUSTERNODES_MOD on CLUSTERNODES (MODIFIED)
+  -- upgrading table CLUSTERNODES to schema version 1
+  alter table CLUSTERNODES add VERSION smallint
+  -- upgrading table CLUSTERNODES to schema version 2
+  alter table CLUSTERNODES add SDTYPE smallint
+  alter table CLUSTERNODES add SDMAXREVTIME bigint
+  create index CLUSTERNODES_VSN on CLUSTERNODES (VERSION)
+  create index CLUSTERNODES_SDT on CLUSTERNODES (SDTYPE) exclude null keys
+  create index CLUSTERNODES_SDM on CLUSTERNODES (SDMAXREVTIME) exclude null keys
+
+  -- creating table JOURNAL for schema version 0
+  create table JOURNAL (ID varchar(512) not null, MODIFIED bigint, HASBINARY smallint, DELETEDONCE smallint, MODCOUNT bigint, CMODCOUNT bigint, DSIZE bigint, DATA varchar(16384), BDATA blob(1073741824))
+  create unique index JOURNAL_pk on JOURNAL ( ID ) cluster
+  alter table JOURNAL add constraint JOURNAL_pk primary key ( ID )
+  create index JOURNAL_MOD on JOURNAL (MODIFIED)
+  -- upgrading table JOURNAL to schema version 1
+  alter table JOURNAL add VERSION smallint
+  -- upgrading table JOURNAL to schema version 2
+  alter table JOURNAL add SDTYPE smallint
+  alter table JOURNAL add SDMAXREVTIME bigint
+  create index JOURNAL_VSN on JOURNAL (VERSION)
+  create index JOURNAL_SDT on JOURNAL (SDTYPE) exclude null keys
+  create index JOURNAL_SDM on JOURNAL (SDMAXREVTIME) exclude null keys
+
+  -- creating table NODES for schema version 0
+  create table NODES (ID varchar(512) not null, MODIFIED bigint, HASBINARY smallint, DELETEDONCE smallint, MODCOUNT bigint, CMODCOUNT bigint, DSIZE bigint, DATA varchar(16384), BDATA blob(1073741824))
+  create unique index NODES_pk on NODES ( ID ) cluster
+  alter table NODES add constraint NODES_pk primary key ( ID )
+  create index NODES_MOD on NODES (MODIFIED)
+  -- upgrading table NODES to schema version 1
+  alter table NODES add VERSION smallint
+  -- upgrading table NODES to schema version 2
+  alter table NODES add SDTYPE smallint
+  alter table NODES add SDMAXREVTIME bigint
+  create index NODES_VSN on NODES (VERSION)
+  create index NODES_SDT on NODES (SDTYPE) exclude null keys
+  create index NODES_SDM on NODES (SDMAXREVTIME) exclude null keys
+
+  -- creating table SETTINGS for schema version 0
+  create table SETTINGS (ID varchar(512) not null, MODIFIED bigint, HASBINARY smallint, DELETEDONCE smallint, MODCOUNT bigint, CMODCOUNT bigint, DSIZE bigint, DATA varchar(16384), BDATA blob(1073741824))
+  create unique index SETTINGS_pk on SETTINGS ( ID ) cluster
+  alter table SETTINGS add constraint SETTINGS_pk primary key ( ID )
+  create index SETTINGS_MOD on SETTINGS (MODIFIED)
+  -- upgrading table SETTINGS to schema version 1
+  alter table SETTINGS add VERSION smallint
+  -- upgrading table SETTINGS to schema version 2
+  alter table SETTINGS add SDTYPE smallint
+  alter table SETTINGS add SDMAXREVTIME bigint
+  create index SETTINGS_VSN on SETTINGS (VERSION)
+  create index SETTINGS_SDT on SETTINGS (SDTYPE) exclude null keys
+  create index SETTINGS_SDM on SETTINGS (SDMAXREVTIME) exclude null keys
+
+   -- creating blob store tables
+  create table DATASTORE_META (ID varchar(64) not null primary key, LVL int, LASTMOD bigint)
+  create table DATASTORE_DATA (ID varchar(64) not null primary key, DATA blob(2097152))
+</pre></div></div></div>
         </div>
       </div>
     </div>
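
When the initial tables already exist, only the upgrade part of the rdbddldump output in the rdb-document-store.html change above is of interest. The following is a minimal sketch of a filter that keeps just those statements. It assumes the comment layout of the sample DB2 dump ("-- creating ..." and "-- upgrading table ..." headers, blank lines between tables), which is not a documented format, and the function name upgrade_statements is hypothetical. The sample input lines are copied from that dump.

```shell
#!/bin/sh
# Hypothetical filter (not part of oak-run): read rdbddldump output on stdin
# and print only the "-- upgrading table ..." headers and the statements
# that follow them. Relies on the comment layout seen in the sample dump.
upgrade_statements() {
    awk '
        /^[ \t]*-- upgrading table/ { in_upgrade = 1; print; next }  # start of an upgrade block
        /^[ \t]*--/                 { in_upgrade = 0; next }         # any other comment ends it
        /^[ \t]*$/                  { in_upgrade = 0; next }         # so does a blank line
        in_upgrade                  { print }                        # statement inside an upgrade block
    '
}

# Small demonstration on lines taken from the sample DB2 dump.
upgrade_statements <<'EOF'
-- DB2

  -- creating table NODES for schema version 0
  create index NODES_MOD on NODES (MODIFIED)
  -- upgrading table NODES to schema version 1
  alter table NODES add VERSION smallint
EOF
```

In practice the filter would sit at the end of the documented command, for example: java -jar oak-run-*.jar rdbddldump --db DB2 --initial 0 --upgrade 2 | upgrade_statements. The statements are printed without terminators, as in the sample dump, so review and adapt them before running them against a database.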

Added: jackrabbit/site/live/oak/docs/nodestore/segment/onrc-memoirs.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/segment/onrc-memoirs.html?rev=1858009&view=auto
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/segment/onrc-memoirs.html (added)
+++ jackrabbit/site/live/oak/docs/nodestore/segment/onrc-memoirs.html Tue Apr 23 13:50:48 2019
@@ -0,0 +1,350 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2019-04-23 
+ | Rendered using Apache Maven Fluido Skin 1.6
+-->
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="Date-Revision-yyyymmdd" content="20190423" />
+    <meta http-equiv="Content-Language" content="en" />
+    <title>Jackrabbit Oak &#x2013; Memoirs in Garbage Collection</title>
+    <link rel="stylesheet" href="../../css/apache-maven-fluido-1.6.min.css" />
+    <link rel="stylesheet" href="../../css/site.css" />
+    <link rel="stylesheet" href="../../css/print.css" media="print" />
+      <script type="text/javascript" src="../../js/apache-maven-fluido-1.6.min.js"></script>
+      </head>
+    <body class="topBarEnabled">
+                  <a href="https://github.com/apache/jackrabbit-oak">
+      <img style="position: absolute; top: 0; right: 0; border: 0; z-index: 10000;"
+        src="https://s3.amazonaws.com/github/ribbons/forkme_right_red_aa0000.png"
+        alt="Fork me on GitHub">
+    </a>
+      <div id="topbar" class="navbar navbar-fixed-top ">
+      <div class="navbar-inner">
+        <div class="container-fluid">
+        <a data-target=".nav-collapse" data-toggle="collapse" class="btn btn-navbar">
+          <span class="icon-bar"></span>
+          <span class="icon-bar"></span>
+          <span class="icon-bar"></span>
+        </a>
+<a class="brand" href="../../"  title="Oak logo"><img src="../../oak_logo.png" alt="Oak logo" />
+</a>
+            <ul class="nav">
+        <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Overview <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+            <li><a href="../../index.html" title="Jackrabbit Oak">Jackrabbit Oak</a></li>
+            <li><a href="../../license.html" title="License">License</a></li>
+            <li><a href="../../downloads.html" title="Downloads">Downloads</a></li>
+            <li><a href="../../articles.html" title="Articles">Articles</a></li>
+        </ul>
+      </li>
+        <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Concepts and Architecture <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+            <li><a href="../../architecture/overview.html" title="Overview">Overview</a></li>
+            <li><a href="../../architecture/nodestate.html" title="The Node State Model">The Node State Model</a></li>
+        </ul>
+      </li>
+        <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Main APIs <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+            <li><a href="http://www.day.com/specs/jcr/2.0/index.html" title="JCR API">JCR API</a></li>
+            <li><a href="https://jackrabbit.apache.org/jcr/jcr-api.html" title="Jackrabbit API">Jackrabbit API</a></li>
+            <li><a href="../../oak_api/overview.html" title="Oak API">Oak API</a></li>
+        </ul>
+      </li>
+        <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Features and Plugins <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+            <li class="dropdown-submenu">
+<a href="../../nodestore/overview.html" title="Node Storage">Node Storage</a>
+              <ul class="dropdown-menu">
+                  <li><a href="../../nodestore/documentmk.html" title="Document NodeStore">Document NodeStore</a></li>
+                  <li><a href="../../nodestore/segment/overview.html" title="Segment NodeStore">Segment NodeStore</a></li>
+                  <li><a href="../../nodestore/compositens.html" title="Composite NodeStore">Composite NodeStore</a></li>
+              </ul>
+            </li>
+            <li class="dropdown-submenu">
+<a href="../../plugins/blobstore.html" title="Blob Storage">Blob Storage</a>
+              <ul class="dropdown-menu">
+                  <li><a href="../../features/direct-binary-access.html" title="Direct Binary Access">Direct Binary Access</a></li>
+              </ul>
+            </li>
+            <li class="dropdown-submenu">
+<a href="../../query/query.html" title="Query">Query</a>
+              <ul class="dropdown-menu">
+                  <li><a href="../../query/query-engine.html" title="Query Engine">Query Engine</a></li>
+                  <li><a href="../../query/grammar-xpath.html" title="XPath Grammar">XPath Grammar</a></li>
+                  <li><a href="../../query/grammar-sql2.html" title="SQL-2 Grammar">SQL-2 Grammar</a></li>
+                  <li><a href="../../query/query-troubleshooting.html" title="Troubleshooting">Troubleshooting</a></li>
+                  <li><a href="../../query/indexing.html" title="Indexing">Indexing</a></li>
+                  <li><a href="../../query/lucene.html" title="Lucene Index">Lucene Index</a></li>
+                  <li><a href="../../query/property-index.html" title="Property Index">Property Index</a></li>
+                  <li><a href="../../query/solr.html" title="Solr Index">Solr Index</a></li>
+              </ul>
+            </li>
+            <li><a href="../../security/overview.html" title="Security">Security</a></li>
+            <li><a href="../../features/atomic-counter.html" title="Atomic Counter">Atomic Counter</a></li>
+            <li><a href="../../features/observation.html" title="Observation">Observation</a></li>
+        </ul>
+      </li>
+        <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Using Oak <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+            <li><a href="../../use_getting_started.html" title="Getting Started">Getting Started</a></li>
+            <li><a href="../../construct.html" title="Repository Construction">Repository Construction</a></li>
+            <li><a href="../../osgi_config.html" title="Configuring Oak">Configuring Oak</a></li>
+            <li><a href="../../command_line.html" title="Command Line Tools">Command Line Tools</a></li>
+            <li><a href="../../migration.html" title="Migration">Migration</a></li>
+            <li><a href="../../differences.html" title="Differences to Jackrabbit 2">Differences to Jackrabbit 2</a></li>
+            <li><a href="../../known_issues.html" title="Known Issues">Known Issues</a></li>
+            <li><a href="../../constraints.html" title="Constraints">Constraints</a></li>
+            <li><a href="../../dos_and_donts.html" title="Dos and Don'ts">Dos and Don'ts</a></li>
+            <li><a href="../../coldstandby/coldstandby.html" title="Cold Standby">Cold Standby</a></li>
+            <li><a href="../../FAQ.html" title="FAQ">FAQ</a></li>
+        </ul>
+      </li>
+        <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Developing Oak <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+            <li><a href="../../dev_getting_started.html" title="Getting Started">Getting Started</a></li>
+            <li><a href="../../participating.html" title="Participating">Participating</a></li>
+            <li><a href="../../developing-with-git.html" title="Developing with Git">Developing with Git</a></li>
+            <li><a href="../../diagnostic-builds.html" title="Cutting diagnostic builds">Cutting diagnostic builds</a></li>
+            <li><a href="../../branching.html" title="Branching off a new stable">Branching off a new stable</a></li>
+            <li><a href="../../attribution.html" title="Attribution">Attribution</a></li>
+            <li><a href="../../release-schedule.html" title="Release Schedule">Release Schedule</a></li>
+        </ul>
+      </li>
+        <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">Links <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+            <li><a href="http://jackrabbit.apache.org/oak" title="Apache Jackrabbit Oak">Apache Jackrabbit Oak</a></li>
+            <li><a href="http://jackrabbit.apache.org/" title="Apache Jackrabbit">Apache Jackrabbit</a></li>
+        </ul>
+      </li>
+              </ul>
+            </div>
+        </div>
+      </div>
+    </div>
+    <div class="container-fluid">
+      <div id="banner">
+        <div class="pull-left"><div id="bannerLeft"><h2>Oak Documentation</h2>
+</div>
+</div>
+        <div class="pull-right"></div>
+        <div class="clear"><hr/></div>
+      </div>
+
+      <div id="breadcrumbs">
+        <ul class="breadcrumb">
+        <li id="publishDate">Last Published: 2019-04-23<span class="divider">|</span>
+</li>
+          <li id="projectVersion">Version: 1.14-SNAPSHOT</li>
+        </ul>
+      </div>
+      <div class="row-fluid">
+        <div id="leftColumn" class="span2">
+          <div class="well sidebar-nav">
+<ul class="nav nav-list">
+          <li class="nav-header">Overview</li>
+    <li><a href="../../index.html" title="Jackrabbit Oak"><span class="none"></span>Jackrabbit Oak</a>  </li>
+    <li><a href="../../license.html" title="License"><span class="none"></span>License</a>  </li>
+    <li><a href="../../downloads.html" title="Downloads"><span class="none"></span>Downloads</a>  </li>
+    <li><a href="../../articles.html" title="Articles"><span class="none"></span>Articles</a>  </li>
+          <li class="nav-header">Concepts and Architecture</li>
+    <li><a href="../../architecture/overview.html" title="Overview"><span class="none"></span>Overview</a>  </li>
+    <li><a href="../../architecture/nodestate.html" title="The Node State Model"><span class="none"></span>The Node State Model</a>  </li>
+          <li class="nav-header">Main APIs</li>
+    <li><a href="http://www.day.com/specs/jcr/2.0/index.html" class="externalLink" title="JCR API"><span class="none"></span>JCR API</a>  </li>
+    <li><a href="https://jackrabbit.apache.org/jcr/jcr-api.html" class="externalLink" title="Jackrabbit API"><span class="none"></span>Jackrabbit API</a>  </li>
+    <li><a href="../../oak_api/overview.html" title="Oak API"><span class="none"></span>Oak API</a>  </li>
+          <li class="nav-header">Features and Plugins</li>
+    <li><a href="../../nodestore/overview.html" title="Node Storage"><span class="icon-chevron-down"></span>Node Storage</a>
+      <ul class="nav nav-list">
+    <li><a href="../../nodestore/documentmk.html" title="Document NodeStore"><span class="icon-chevron-down"></span>Document NodeStore</a>
+      <ul class="nav nav-list">
+    <li><a href="../../nodestore/document/mongo-document-store.html" title="MongoDB DocumentStore"><span class="none"></span>MongoDB DocumentStore</a>  </li>
+    <li><a href="../../nodestore/document/rdb-document-store.html" title="RDB DocumentStore"><span class="none"></span>RDB DocumentStore</a>  </li>
+    <li><a href="../../nodestore/document/node-bundling.html" title="Node Bundling"><span class="none"></span>Node Bundling</a>  </li>
+    <li><a href="../../nodestore/document/secondary-store.html" title="Secondary Store"><span class="none"></span>Secondary Store</a>  </li>
+    <li><a href="../../nodestore/persistent-cache.html" title="Persistent Cache"><span class="none"></span>Persistent Cache</a>  </li>
+    <li><a href="../../clustering.html" title="Clustering"><span class="none"></span>Clustering</a>  </li>
+      </ul>
+  </li>
+    <li><a href="../../nodestore/segment/overview.html" title="Segment NodeStore"><span class="none"></span>Segment NodeStore</a>  </li>
+    <li><a href="../../nodestore/compositens.html" title="Composite NodeStore"><span class="none"></span>Composite NodeStore</a>  </li>
+      </ul>
+  </li>
+    <li><a href="../../plugins/blobstore.html" title="Blob Storage"><span class="icon-chevron-down"></span>Blob Storage</a>
+      <ul class="nav nav-list">
+    <li><a href="../../features/direct-binary-access.html" title="Direct Binary Access"><span class="none"></span>Direct Binary Access</a>  </li>
+      </ul>
+  </li>
+    <li><a href="../../query/query.html" title="Query"><span class="icon-chevron-down"></span>Query</a>
+      <ul class="nav nav-list">
+    <li><a href="../../query/query-engine.html" title="Query Engine"><span class="none"></span>Query Engine</a>  </li>
+    <li><a href="../../query/grammar-xpath.html" title="XPath Grammar"><span class="none"></span>XPath Grammar</a>  </li>
+    <li><a href="../../query/grammar-sql2.html" title="SQL-2 Grammar"><span class="none"></span>SQL-2 Grammar</a>  </li>
+    <li><a href="../../query/query-troubleshooting.html" title="Troubleshooting"><span class="none"></span>Troubleshooting</a>  </li>
+    <li><a href="../../query/indexing.html" title="Indexing"><span class="none"></span>Indexing</a>  </li>
+    <li><a href="../../query/lucene.html" title="Lucene Index"><span class="none"></span>Lucene Index</a>  </li>
+    <li><a href="../../query/property-index.html" title="Property Index"><span class="none"></span>Property Index</a>  </li>
+    <li><a href="../../query/solr.html" title="Solr Index"><span class="none"></span>Solr Index</a>  </li>
+      </ul>
+  </li>
+    <li><a href="../../security/overview.html" title="Security"><span class="none"></span>Security</a>  </li>
+    <li><a href="../../features/atomic-counter.html" title="Atomic Counter"><span class="none"></span>Atomic Counter</a>  </li>
+    <li><a href="../../features/observation.html" title="Observation"><span class="none"></span>Observation</a>  </li>
+          <li class="nav-header">Using Oak</li>
+    <li><a href="../../use_getting_started.html" title="Getting Started"><span class="none"></span>Getting Started</a>  </li>
+    <li><a href="../../construct.html" title="Repository Construction"><span class="none"></span>Repository Construction</a>  </li>
+    <li><a href="../../osgi_config.html" title="Configuring Oak"><span class="none"></span>Configuring Oak</a>  </li>
+    <li><a href="../../command_line.html" title="Command Line Tools"><span class="none"></span>Command Line Tools</a>  </li>
+    <li><a href="../../migration.html" title="Migration"><span class="none"></span>Migration</a>  </li>
+    <li><a href="../../differences.html" title="Differences to Jackrabbit 2"><span class="none"></span>Differences to Jackrabbit 2</a>  </li>
+    <li><a href="../../known_issues.html" title="Known Issues"><span class="none"></span>Known Issues</a>  </li>
+    <li><a href="../../constraints.html" title="Constraints"><span class="none"></span>Constraints</a>  </li>
+    <li><a href="../../dos_and_donts.html" title="Dos and Don'ts"><span class="none"></span>Dos and Don'ts</a>  </li>
+    <li><a href="../../coldstandby/coldstandby.html" title="Cold Standby"><span class="none"></span>Cold Standby</a>  </li>
+    <li><a href="../../FAQ.html" title="FAQ"><span class="none"></span>FAQ</a>  </li>
+          <li class="nav-header">Developing Oak</li>
+    <li><a href="../../dev_getting_started.html" title="Getting Started"><span class="none"></span>Getting Started</a>  </li>
+    <li><a href="../../participating.html" title="Participating"><span class="none"></span>Participating</a>  </li>
+    <li><a href="../../developing-with-git.html" title="Developing with Git"><span class="none"></span>Developing with Git</a>  </li>
+    <li><a href="../../diagnostic-builds.html" title="Cutting diagnostic builds"><span class="none"></span>Cutting diagnostic builds</a>  </li>
+    <li><a href="../../branching.html" title="Branching off a new stable"><span class="none"></span>Branching off a new stable</a>  </li>
+    <li><a href="../../attribution.html" title="Attribution"><span class="none"></span>Attribution</a>  </li>
+    <li><a href="../../release-schedule.html" title="Release Schedule"><span class="none"></span>Release Schedule</a>  </li>
+          <li class="nav-header">Links</li>
+    <li><a href="http://jackrabbit.apache.org/oak" class="externalLink" title="Apache Jackrabbit Oak"><span class="none"></span>Apache Jackrabbit Oak</a>  </li>
+    <li><a href="http://jackrabbit.apache.org/" class="externalLink" title="Apache Jackrabbit"><span class="none"></span>Apache Jackrabbit</a>  </li>
+  </ul>
+          <hr />
+          <div id="poweredBy">
+                  <div class="clear"></div>
+              <div class="clear"></div>
+              <div class="clear"></div>
+              <div class="clear"></div>
+  <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"><img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" /></a>
+              </div>
+          </div>
+        </div>
+        <div id="bodyColumn"  class="span10" >
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<h1>Memoirs in Garbage Collection</h1>
+<p>This is a brief outline of the history of Online Revision Garbage Collection in Oak. By linking to further details where necessary, this historical context helps make sense of the various bits of information that are scattered across Jira issues, wikis, source code etc.</p>
+<p>Refer to <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/nodestore/segment/overview.html">Oak Segment Tar</a> on the Jackrabbit Oak Wiki for a general overview of the segment store, its design, data structures and inner workings.</p>
+<div class="section">
+<h2><a name="Background"></a>Background</h2>
+<p>Online Revision Cleanup (OnRC) refers to a technique employed by the segment store to reclaim disk space that is no longer in use. The implementation is structured in three phases:</p>
+<ul>
+
+<li>
+
+<p>Estimation: a heuristic to determine whether enough garbage has accumulated to warrant running garbage collection at all.</p>
+</li>
+<li>
+
+<p>Compaction: all records of the segment store&#x2019;s current head state are rewritten into a new, structurally equal head state. The records of the rewritten head state are compact within their segment as rewriting skips all records that are not reachable from the root node state.</p>
+</li>
+<li>
+
+<p>Cleanup: reclaimable segments are removed. Reclaimability is determined either by reachability through the segment graph or by the age of the segment depending on the version of Oak.</p>
+</li>
+</ul></div>
+<div class="section">
+<h2><a name="Oak_1.0_-_1.4"></a>Oak 1.0 - 1.4</h2>
+<p>Online Revision Garbage Collection did not work up to and including Oak 1.4 as it was not able to collect any garbage. In these versions OnRC relied upon false premises on the one hand and was further impacted by a bug <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3348">(OAK-3348)</a> on the other hand.</p>
+<p>The <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/FileStore.java#L1045">compaction</a> phase uses an instance of the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/Compactor.java#L54"><tt>Compactor</tt></a> class for <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/FileStore.java#L1061">rewriting</a> the current head state of the repository. The <tt>Compactor</tt> itself works by <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/Compactor.java#L160">comparing</a> a <tt>before</tt> state to an <tt>after</tt> state applying the differences to an <tt>onto</tt> state. In an initial
  pass the current head state of the repository is passed for the <tt>after</tt> state and an <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/memory/EmptyNodeState.java#L37">empty node</a> is passed for both <tt>before</tt> and <tt>onto</tt>. Once the initial phase completes, an attempt is made to <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/FileStore.java#L1075">set</a> the repository&#x2019;s head state to the resulting node state via an atomic compare-and-set operation. This fails when concurrent write operations to the repository have changed its head state in the meantime. In that case a retry loop is entered where these additional changes are <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/FileStore.java#L1082">compacted</a> on top of the previously compacted head state. After a configurable number of retries (default 5) a final attempt is made to <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/FileStore.java#L1099">force compact</a> any remaining changes while blocking writes to the repository for a configurable time (default 1 minute). Only if force compacting also fails is the compaction considered failed.</p>
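+<p>The retry behaviour described above can be sketched as follows. This is a simplified model under stated assumptions, not the code of <tt>FileStore</tt>: the class <tt>CompactionRetrySketch</tt>, its <tt>Head</tt> type and the <tt>concurrentWrites</tt> callback are hypothetical, the head state is reduced to a revision number, and the timeout of the force compaction step is not modelled.</p>

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of the compaction retry loop; not Oak's FileStore code. */
final class CompactionRetrySketch {

    /** Immutable stand-in for a repository head state. */
    static final class Head {
        final int revision;
        final boolean compacted;

        Head(int revision, boolean compacted) {
            this.revision = revision;
            this.compacted = compacted;
        }
    }

    enum Outcome { COMPACTED, FORCE_COMPACTED }

    /** Stand-in for the Compactor: produces a compact, structurally equal head. */
    static Head rewrite(Head head) {
        return new Head(head.revision, true);
    }

    static Outcome compact(AtomicReference<Head> head, int retries, Runnable concurrentWrites) {
        Head base = head.get();
        Head compacted = rewrite(base);
        // Attempt 0 is the initial pass, attempts 1..retries are the retries.
        for (int attempt = 0; attempt <= retries; attempt++) {
            concurrentWrites.run(); // writers may move the head while compaction was busy
            if (head.compareAndSet(base, compacted)) {
                return Outcome.COMPACTED;
            }
            // The head moved: compact the additional changes on top and try again.
            base = head.get();
            compacted = rewrite(base);
        }
        // Force compaction: writers are blocked here (modelled by not running them).
        head.set(rewrite(head.get()));
        return Outcome.FORCE_COMPACTED;
    }

    /** Runs one compaction against a writer that commits the given number of times. */
    static Outcome demo(int concurrentCommits) {
        AtomicReference<Head> head = new AtomicReference<>(new Head(0, false));
        int[] remaining = {concurrentCommits};
        Runnable writer = () -> {
            if (remaining[0] > 0) {
                remaining[0]--;
                head.set(new Head(head.get().revision + 1, false));
            }
        };
        return compact(head, 5, writer);
    }

    public static void main(String[] args) {
        System.out.println(demo(2));   // writer goes quiet, compare-and-set succeeds
        System.out.println(demo(100)); // writer never goes quiet, force compaction is needed
    }
}
```

+<p>In this model a writer that keeps committing during every attempt exhausts the retries, which is the situation the force compaction step exists for.</p>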
+<p>Once compaction has finished, either successfully or not, <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/FileStore.java#L869"><tt>cleanup</tt></a> tries to remove content that is not reachable any more. However, the record graph of a repository grows very large very quickly. To avoid traversing large record graphs, the cleanup phase operates on the segment graph induced by the record graph. That is, for any two segments <tt>s1</tt> and <tt>s2</tt>, there is an edge from <tt>s1</tt> to <tt>s2</tt> if and only if <tt>s1</tt> contains a record that references a record in <tt>s2</tt>. By construction the segment graph contains far fewer vertices than the record graph. To speed up traversal it is pre-calculated and cached in the segment headers and in a graph entry of each tar file. While the segment graph is sufficiently small for efficient traversal, it is also extremely dense. In fact, it turned out that its reflexive, transitive closure is the entire graph most of the time. The reason for this can be seen when looking at an example where a segment contains just a single reachable record and <tt>n</tt> unreachable records. In this case the single reachable record makes the segment reachable, preventing it from being reclaimed along with the <tt>n</tt> unreachable records. To make matters worse, all segments referenced from this segment will also stay in the reachable set, although the single reachable record might not have any outgoing references at all.</p>
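+<p>The over-retention described above can be reproduced with a small sketch. It is an illustration only: the class <tt>SegmentGraphSketch</tt>, the record names and the segment names are made up, and records and segments are modelled as plain strings rather than Oak data structures. Segment <tt>s1</tt> holds one live record without outgoing references and two garbage records that point into <tt>s2</tt> and <tt>s3</tt>.</p>

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of record-level versus segment-level reachability. */
final class SegmentGraphSketch {

    /** Plain graph traversal: every node reachable from the roots, roots included. */
    static Set<String> reachable(Map<String, Set<String>> edges, Set<String> roots) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>(roots);
        while (!todo.isEmpty()) {
            String node = todo.pop();
            if (seen.add(node)) {
                todo.addAll(edges.getOrDefault(node, Collections.<String>emptySet()));
            }
        }
        return seen;
    }

    /** Edge s1 -> s2 iff some record in s1 references a record in s2. */
    static Map<String, Set<String>> induce(Map<String, Set<String>> recordRefs,
                                           Map<String, String> segmentOf) {
        Map<String, Set<String>> segmentEdges = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : recordRefs.entrySet()) {
            String from = segmentOf.get(entry.getKey());
            for (String target : entry.getValue()) {
                segmentEdges.computeIfAbsent(from, k -> new TreeSet<>()).add(segmentOf.get(target));
            }
        }
        return segmentEdges;
    }

    static Map<String, String> exampleSegmentOf() {
        Map<String, String> segmentOf = new HashMap<>();
        segmentOf.put("live", "s1");
        segmentOf.put("garbage1", "s1");
        segmentOf.put("garbage2", "s1");
        segmentOf.put("x", "s2");
        segmentOf.put("y", "s3");
        return segmentOf;
    }

    static Map<String, Set<String>> exampleRecordRefs() {
        Map<String, Set<String>> refs = new HashMap<>();
        refs.put("garbage1", new TreeSet<>(Arrays.asList("x")));
        refs.put("garbage2", new TreeSet<>(Arrays.asList("y")));
        return refs; // the live record has no outgoing references
    }

    /** Segments that hold at least one record reachable in the record graph. */
    static Set<String> neededSegments() {
        Map<String, String> segmentOf = exampleSegmentOf();
        Set<String> needed = new TreeSet<>();
        for (String record : reachable(exampleRecordRefs(), Collections.singleton("live"))) {
            needed.add(segmentOf.get(record));
        }
        return needed;
    }

    /** Segments kept when cleanup traverses the induced segment graph instead. */
    static Set<String> retainedSegments() {
        Map<String, Set<String>> segmentEdges = induce(exampleRecordRefs(), exampleSegmentOf());
        return reachable(segmentEdges, Collections.singleton("s1"));
    }

    public static void main(String[] args) {
        System.out.println("needed:   " + neededSegments());   // [s1]
        System.out.println("retained: " + retainedSegments()); // [s1, s2, s3]
    }
}
```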
+<p>Traversing the segment graph starts with a set of <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/FileStore.java#L887">root segments</a>. These are all the segments that are currently referenced from the JVM (i.e. ultimately in use by some code either within Oak or within its client). From there the set of all referenced segments is determined for each <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/TarReader.java#L58"><tt>TarReader</tt></a> and each of them is <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/TarReader.java#L752">cleaned up</a> individually. That is, a new tar file containing only the still referenced segments is generated unless the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/TarReader.java#L789">space saving is not worth it</a>.</p>
+<p><a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-1828">OAK-1828</a> complicated the process for determining the segment graph as the improvements done with this issue inadvertently introduced the potential for cycles in the segment graph. The idea was to introduce a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L796">pool of <tt>SegmentBufferWriter</tt></a> instances to avoid contention on a single instance when concurrently writing to the segment store. However, depending on how writes are interleaved and scheduled to <tt>SegmentBufferWriter</tt> instances from the pool, this opened up the possibility for <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3864">cycles in the segment graph</a>, which <tt>cleanup</tt> needs to take into <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/file/FileStore.java#L900">consideration</a> when determining the segment graph.</p>
+<p>Even though online compaction rewrites all records of the current head state into a new and compact representation of the repository&#x2019;s head in a separate set of segments, the presence of open sessions referencing older revisions for a while prevents those from being reclaimed. In addition <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3348">OAK-3348</a> could cause some of the segments of the compacted revision to still reference segments from the pre-compacted revision. Together with the segment graph being very dense this prevented almost any segment from being reclaimed in Oak 1.4 and earlier.</p>
+<p>See also the annotated slides <a class="externalLink" href="https://adapt.to/2016/en/schedule/into-the-tar-pit--a-tarmk-deep-dive.html">Into the tar pit: a TarMK deep dive</a> for further illustrations on this topic.</p></div>
+<div class="section">
+<h2><a name="Oak_1.6"></a>Oak 1.6</h2>
+<p>Oak 1.6 was the first version with workable OnRC, overcoming the problems with open sessions keeping references to previous revisions and fixing <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3348">(OAK-3348)</a>. This required changes in the persistence format, forcing existing customers to <a class="externalLink" href="https://helpx.adobe.com/experience-manager/6-3/sites/deploying/using/revision-cleanup.html#OnlineRevisionCleanupFrequentlyAskedQuestions">migrate</a> their deployments.</p>
+<div class="section">
+<h3><a name="Generational_garbage_collection"></a>Generational garbage collection</h3>
+<p>Oak 1.6 changed the mechanism used to determine reclaimability of segments. Previous versions used reachability through the segment graph starting from a set of GC roots consisting of the segment containing the current head node state and all segments containing records currently referenced by the JVM (i.e. by open sessions).</p>
+<p>Oak 1.6 introduced the concept of a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Segment.java#L380">GC generation</a>. A GC generation is an integer starting at 0 and increasing with each run of OnRC. Each segment records in its <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentBufferWriter.java#L200-L203">segment header</a> the GC generation that was current at the time the segment was created. The current GC generation of the repository is just the GC generation of the segment containing the current head state. The compactor <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L845">reads</a> the current GC generation of the repository and <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L864">rewrites</a> the head state using the next GC generation number for the segments created in the process. Once the compactor has finished rewriting the current head state, the newly created, compact head state is <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L876">atomically set</a> as the new head state of the repository, implicitly and atomically increasing the GC generation of the repository at the same time.</p>
+<p>In its default configuration the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L1055">cleanup</a> phase retains all segments from the current GC generation and the previous one reclaiming all older segments. With the default daily OnRC execution this results in a minimal segment retention time of 24 hours. Sessions that are open at the point in time where OnRC runs will automatically <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-2407">refresh</a> at next access to reduce the risk for them to reference content from segments that were reclaimed.</p>
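+<p>The retention rule of the cleanup phase can be written down as a one-line predicate. The following is a minimal sketch, assuming that the number of retained generations is passed in as a parameter (2 for the default described above, that is the current generation and the previous one); the class <tt>GenerationCleanupSketch</tt> and the segment names are hypothetical and only data segments are considered.</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of generation based reclaimability of data segments. */
final class GenerationCleanupSketch {

    /** A segment is reclaimable once it is older than the retained generations. */
    static boolean reclaimable(int segmentGeneration, int currentGeneration, int retainedGenerations) {
        return segmentGeneration <= currentGeneration - retainedGenerations;
    }

    /** Returns the ids of all segments that cleanup would reclaim, in input order. */
    static List<String> cleanup(Map<String, Integer> generationOf,
                                int currentGeneration, int retainedGenerations) {
        List<String> reclaimed = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : generationOf.entrySet()) {
            if (reclaimable(entry.getValue(), currentGeneration, retainedGenerations)) {
                reclaimed.add(entry.getKey());
            }
        }
        return reclaimed;
    }

    /** Four made-up segments, one per GC generation 0..3. */
    static Map<String, Integer> example() {
        Map<String, Integer> generationOf = new LinkedHashMap<>();
        generationOf.put("seg-a", 0);
        generationOf.put("seg-b", 1);
        generationOf.put("seg-c", 2);
        generationOf.put("seg-d", 3);
        return generationOf;
    }

    public static void main(String[] args) {
        // Repository at generation 3, retaining generations 3 and 2.
        System.out.println(cleanup(example(), 3, 2)); // [seg-a, seg-b]
    }
}
```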
+<p>Since <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/nodestore/segment/records.html#Bulk_segments">bulk segments</a> do not have a segment header and thus cannot record their GC generation, the cleanup phase still uses <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/TarReader.java#L754">reachability</a> through the segment graph to determine whether a bulk segment is reclaimable. That is, a bulk segment is reclaimable if and only if it is not reachable through the segment graph of the non reclaimable data segments starting from an initial set of <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L1076">root segment ids</a>.</p></div>
+<div class="section">
+<h3><a name="Preventing_references_between_segments_with_different_GC_generations"></a>Preventing references between segments with different GC generations</h3>
+<p>The generation based garbage collection approach disallows references between segments from different GC generations, as otherwise reclaiming an older generation would render a newer one incomplete, potentially causing <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentNotFoundException.java#L27"><tt>SegmentNotFoundException</tt></a>s subsequently. Unfortunately, up to Oak 1.4, references between segments of different GC generations could be introduced by sessions that were acquired before an OnRC cycle completed. Such sessions would reference records in segments of the previous GC generation through their base state. When such a session subsequently saves, its changes are written to segments of the new GC generation, effectively creating references from this GC generation to the previous one. See <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3348">(OAK-3348)</a> for the full story.</p>
+<p>To prevent references between segments of different GC generations, such references need to be detected and the affected records rewritten into the current GC generation. That is, whenever a node state is written by a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentWriter.java#L85"><tt>SegmentWriter</tt></a> all references to existing records are <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentWriter.java#L1188">checked</a> and the records are rewritten if they do not refer to the current generation. Rewriting is potentially expensive as a base state of a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/state/NodeBuilder.java"><tt>NodeBuilder</tt></a> might cover (a previous revision of) the whole repository. It is expensive both in terms of CPU and IO cycles and in terms of extra disk space to hold the rewritten base state. However, when this situation occurs most records of that base state have likely already been rewritten by OnRC: the most recent compacted head state was rewritten by OnRC from a node state of a more recent revision than our base state, both of which are likely sharing many records. To avoid rewriting the same records multiple times the <tt>SegmentWriter</tt> employs <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentWriter.java#L128">deduplication caches</a> for <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/WriterCacheManager.java#L108">node records</a>, <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/WriterCacheManager.java#L94">string records</a> and <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/WriterCacheManager.java#L101">template records</a>.</p>
+<p>The deduplication caches are indexed by the GC generation such that records can be <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStoreBuilder.java#L441">evicted</a> from the caches by generation. Such an eviction happens for <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStoreBuilder.java#L448">older generations</a> that are not needed any more after <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L935">OnRC succeeded</a>. An eviction happens for the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segmen
 t/file/FileStoreBuilder.java#L457">generation created</a> by OnRC in the case of a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L834">failure</a>.</p>
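+<p>The idea of indexing a deduplication cache by GC generation can be illustrated with a minimal sketch. The class <tt>GenerationCaches</tt> and its methods are hypothetical names chosen for this illustration and are not part of Oak; the actual implementation lives in <tt>WriterCacheManager</tt> and differs in detail.</p>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntPredicate;

/**
 * Hypothetical sketch (not Oak's API): one cache per GC generation,
 * so that all entries of a generation can be dropped at once.
 */
final class GenerationCaches<K, V> {
    // Outer key: GC generation; inner map: the deduplication cache of that generation.
    private final Map<Integer, Map<K, V>> byGeneration = new ConcurrentHashMap<>();

    /** Remember that {@code key} was written as {@code value} in the given generation. */
    void put(int generation, K key, V value) {
        byGeneration.computeIfAbsent(generation, g -> new ConcurrentHashMap<>()).put(key, value);
    }

    /** Look up a previously written record, or {@code null} if there is none. */
    V get(int generation, K key) {
        Map<K, V> cache = byGeneration.get(generation);
        return cache == null ? null : cache.get(key);
    }

    /** Drop the caches of all generations matching the predicate. */
    void evictGenerations(IntPredicate evict) {
        byGeneration.keySet().removeIf(g -> evict.test(g));
    }
}
```

+<p>Under these assumptions, a successful compaction into generation <tt>n</tt> would call <tt>evictGenerations(g -&gt; g &lt; n)</tt>, whereas a failed one would call <tt>evictGenerations(g -&gt; g == n)</tt>.</p>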
+<p>For string records and template records the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/RecordCache.java">deduplication caches</a> are ultimately backed by a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/RecordCache.java#L171-L181">hash map</a> with an LRU eviction policy. For node records the situation is more complicated, as determining structural equality for those means traversing the potentially large subtree rooted at the record. Relying on the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/RecordId.java"><tt>RecordId</tt></a> does not work either, as equality of <tt>RecordId</tt>s is only a sufficient condition for structural equality of node records but not a necessary one. That is, when a node record is rewritten during compaction its clone will be structurally equal to the original yet have a different <tt>RecordId</tt>. To overcome this problem a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentNodeState.java#L119">stable id</a> that would not change when rewriting a node state was introduced. This stable id is used by the <tt>SegmentWriter</tt> as <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentWriter.java#L1157">cache key</a> when deduplicating node records. Finally the node deduplication cache is backed by the custom <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/
 segment/file/PriorityCache.java#L60">PriorityCache</a> implementation. This cache uses efficient rehashing into an array to resolve hash clashes while at the same time using a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/PriorityCache.java#L200-L230">cost measure</a> to avoid evicting expensive items. The <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentWriter.java#L1004">cost</a> of a node increases with its number of child nodes, which increases the chance for such nodes to stay in the cache.</p>
+<p>To put everything together, OnRC in Oak 1.6 uses a different approach for <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L983">compaction</a> than OffRC. While the latter uses the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java"><tt>Compactor</tt></a> class like in previous versions of Oak, the former passes the current head state to <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentWriter.java#L343"><tt>SegmentWriter.writeNode</tt></a> on a <tt>SegmentWriter</tt> for the next GC generation. This causes the head state to be rewritten into segments of the next GC generation, filling the deduplication caches in the process.</p></div></div>
+<div class="section">
+<h2><a name="Oak_1.8"></a>Oak 1.8</h2>
+<p>Oak 1.8 introduced two main improvements on top of the GC generation based garbage collection approach from Oak 1.6: sequential rebasing of checkpoints on top of each other and tail compaction, a lighter variant of compaction roughly comparable to the JVM&#x2019;s young generation garbage collection.</p>
+<div class="section">
+<h3><a name="Sequential_checkpoint_rebasing"></a>Sequential checkpoint rebasing</h3>
+<p>The segment store implements checkpoints as links to (previous) root node states from a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/scheduler/LockBasedScheduler.java#L383">child node</a> under the <tt>checkpoints</tt> node of the super root. In Oak 1.6 compaction was not concerned with checkpoints but rather treated them as regular nodes solely relying on the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/DefaultSegmentWriter.java#L967">node deduplication cache</a> to prevent them from being exploded. This approach did not scale well and could lead to <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-6984">high IO rates</a>.</p>
+<p>Oak 1.8 improved this aspect by considering checkpoints during compaction instead of relying on the node deduplication cache. This is done by using a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java#L56"><tt>Compactor</tt></a> to sequentially rebase <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/CheckpointCompactor.java#L119">checkpoints and the repository root</a> on <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/CheckpointCompactor.java#L160">top of each other</a> and subsequently reassembling them into the right <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/m
 ain/java/org/apache/jackrabbit/oak/segment/CheckpointCompactor.java#L128-L138">super-root structure</a>. As an additional optimisation already compacted checkpoints are <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/CheckpointCompactor.java#L249">cached</a> to prevent them from being passed to the compactor again in the common case where there are no changes between checkpoints.</p></div>
+<div class="section">
+<h3><a name="Tail_compaction"></a>Tail compaction</h3>
+<p>Full compaction of the whole repository is the most effective way to remove any accumulated garbage. To achieve this, full compaction rewrites all content regardless of whether there have been updates causing fragmentation or not. As a result this approach is very resource intensive (mainly with respect to IO) and can take a lot of time to complete. Oak 1.8 introduced another compaction mode termed <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3349">tail compaction</a>. In contrast to full compaction, tail compaction does not compact the whole repository but only those revisions that have been updated since the last compaction.</p>
+<p>Tail compaction is implemented by compacting the current head state of the repository on top of the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L730">previously compacted head state</a>. The range of revisions starting from the previously compacted head state up to the current head state covers all updates since the last compaction, effectively making tail compaction cover only those revisions that have been updated since the last compaction. Tail compaction can also be regarded as a generalisation of full compaction. With the latter the current head state is <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L723">compacted on top of an empty node state</a>. With the former the current head state is <a class="externa
 lLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L730">compacted on top of a previously compacted head state</a>.</p>
+<p>Since tail compaction relies on the previously compacted head state the subsequent cleanup phase needs to take this into consideration. That is, cleanup needs to be aware of what type of compaction created a certain segment to determine its reclaimability. This required generalising the GC generation from a simple integer into a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/tar/GCGeneration.java#L52"><tt>GCGeneration</tt> class</a> that captures the concept of full and tail compaction. Instances of this class represent the garbage collection generation related information of a segment. It consists of the segment&#x2019;s generation, its full generation and its compaction flag. The segment&#x2019;s generation records the number of garbage collection cycles a segment went through and is incremented with every garbage collection regardless of its type. The segment&
 #x2019;s full generation records the number of full garbage collection cycles a segment went through. It is only incremented on full garbage collection cycles. The segment&#x2019;s compaction flag is set for those segments that have been created by a compaction operation. It is never set for segments created by normal write operations. Segments written by normal repository writes will inherit the generation and full generation of the segment written by the previous compaction process with the compacted flag cleared.</p>
+<p>The information recorded in this way makes it possible to determine the reclaimability status of a segment by just looking at the <tt>GCGeneration</tt> instances of that segment and of the segment containing the repository head. Let <tt>S</tt> be a segment, <tt>H</tt> be the segment containing the current repository head and <tt>n</tt> be the number of retained generations:</p>
+<ul>
+
+<li><tt>S</tt> is old if and only if <tt>H.generation - S.generation &gt;= n</tt></li>
+<li><tt>S</tt> is in the same compaction tail as <tt>H</tt> if and only if <tt>S.isCompacted &amp;&amp; S.fullGeneration == H.fullGeneration</tt></li>
+<li><tt>S</tt> is reclaimable if and only if <tt>S</tt> is old and <tt>S</tt> is not in the same compaction tail as <tt>H</tt>.</li>
+</ul>
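+<p>The three rules above can be sketched in a few lines of Java. The class <tt>ReclaimSketch</tt> and its nested <tt>Gen</tt> type are hypothetical stand-ins introduced for this illustration only; they mirror the three pieces of information described above but are not Oak&#x2019;s <tt>GCGeneration</tt> or <tt>Reclaimers</tt> classes.</p>

```java
/** Hypothetical sketch of the reclaimability rules; not Oak's implementation. */
final class ReclaimSketch {

    /** Stand-in for the generation information attached to a segment. */
    static final class Gen {
        final int generation;      // incremented on every garbage collection cycle
        final int fullGeneration;  // incremented on full garbage collection cycles only
        final boolean isCompacted; // set only for segments written by compaction

        Gen(int generation, int fullGeneration, boolean isCompacted) {
            this.generation = generation;
            this.fullGeneration = fullGeneration;
            this.isCompacted = isCompacted;
        }
    }

    /** S is old iff H.generation - S.generation >= n. */
    static boolean isOld(Gen s, Gen h, int n) {
        return h.generation - s.generation >= n;
    }

    /** S is in the same compaction tail as H iff it is compacted and shares H's full generation. */
    static boolean inSameCompactionTail(Gen s, Gen h) {
        return s.isCompacted && s.fullGeneration == h.fullGeneration;
    }

    /** S is reclaimable iff it is old and not in the same compaction tail as H. */
    static boolean isReclaimable(Gen s, Gen h, int n) {
        return isOld(s, h, n) && !inSameCompactionTail(s, h);
    }
}
```

+<p>Note how an old segment written by compaction stays alive as long as it shares the full generation of the head, which is what keeps the base of a tail compaction from being reclaimed.</p>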
+<p>This logic is captured in the respective implementations of the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/Reclaimers.java#L61">reclaim predicate</a> in  the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/Reclaimers.java#L33"><tt>Reclaimers</tt> class</a>.</p>
+<blockquote>
+
+<p><i>Note:</i> Oak 1.8.0 had a <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-7132">bug</a> in the implementation of the reclamation mechanism described. The bug was fixed with Oak 1.8.1, which is the version of Oak this section is referring to.</p>
+</blockquote></div>
+<div class="section">
+<h3><a name="The_Compactor_strikes_back"></a>The Compactor strikes back</h3>
+<p>Oak 1.6 removed the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/1.4/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/Compactor.java#L54"><tt>Compactor</tt> class</a> in favour of directly rewriting node states with the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.6.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentWriter.java#L85"><tt>SegmentWriter</tt></a> solely relying on deduplication caches for deduplicating records. To implement sequential checkpoint rebasing and tail compaction Oak 1.8 reintroduced a new implementation of a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java#L56"><tt>Compactor</tt> class</a>. This implementation has to deal with two additional requirements compared to the previous implementation: tracking and assigning stable ids and being able to cope with a large number of direct child nodes of a node. This is done by tracking changes with a <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java#L163"><tt>MemoryNodeBuilder</tt></a> instead of using a <tt>NodeBuilder</tt> acquired through calling <tt>NodeState.builder</tt>. The new <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentWriter.java#L130"><tt>SegmentWriter.write</tt></a> method with an extra argument for the stable id is then used to <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java#L175">write</a> compacted node states including their stable id. In addition the number of updates to the <tt>MemoryNodeBuilder</tt> is <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java#L154-L160">tracked</a> and an <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java#L156">intermediate node is written</a> to avoid keeping too many updates in memory once an <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java#L62">update limit</a> is exceeded. Further updates are tracked in a fresh <tt>MemoryNodeBuilder</tt> instance that uses this <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.8.1/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Compactor.java#L157">intermediate node as its base</a>.</p></div></div>
+<div class="section">
+<h2><a name="Oak_1.10"></a>Oak 1.10</h2>
+<p>With Oak 1.6 and Oak 1.8 it was observed that running compaction first increases the repository size until cleanup runs and subsequently removes the generation that has become reclaimable. Oak 1.10 improved this aspect by running cleanup <i>before</i> compaction, thus levelling out the bump in repository size caused by the compaction phase.</p>
+<p>The effort included a few refactorings making garbage collection more modular:</p>
+<ul>
+
+<li><a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-7377">OAK-7377</a> generalised the <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.10.0/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/GarbageCollector.java#L46">garbage collector</a> allowing multiple implementations.</li>
+<li><a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-7440">OAK-7440</a>, <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-7434">OAK-7434</a> and <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-7436">OAK-7436</a> factored estimation, compaction and cleanup into independent components.</li>
+<li><a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-7445">OAK-7445</a> introduced the new cleanup before compaction garbage collection strategy.</li>
+<li><a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-7550">OAK-7550</a> eventually set the cleanup before compaction strategy as the new default for Oak 1.10.</li>
+</ul></div>
+        </div>
+      </div>
+    </div>
+    <hr/>
+    <footer>
+      <div class="container-fluid">
+        <div class="row-fluid">
+            <p>Copyright &copy;2012&#x2013;2019
+<a href="https://www.apache.org/">The Apache Software Foundation</a>.
+All rights reserved.</p>
+        </div>
+                          <div id="ohloh" class="pull-right">
+      <script type="text/javascript" src="https://www.ohloh.net/p/jackrabbit-oak/widgets/project_thin_badge.js"></script>
+    </div>
+        </div>
+    </footer>
+    </body>
+</html>
\ No newline at end of file

Modified: jackrabbit/site/live/oak/docs/nodestore/segment/overview.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/nodestore/segment/overview.html?rev=1858009&r1=1858008&r2=1858009&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/nodestore/segment/overview.html (original)
+++ jackrabbit/site/live/oak/docs/nodestore/segment/overview.html Tue Apr 23 13:50:48 2019
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2019-01-08 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2019-04-23 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20190108" />
+    <meta name="Date-Revision-yyyymmdd" content="20190423" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Oak Segment Tar</title>
     <link rel="stylesheet" href="../../css/apache-maven-fluido-1.6.min.css" />
@@ -142,9 +142,9 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2019-01-08<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2019-04-23<span class="divider">|</span>
 </li>
-          <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
+          <li id="projectVersion">Version: 1.14-SNAPSHOT</li>
         </ul>
       </div>
       <div class="row-fluid">
@@ -169,6 +169,7 @@
     <li><a href="../../nodestore/documentmk.html" title="Document NodeStore"><span class="icon-chevron-down"></span>Document NodeStore</a>
       <ul class="nav nav-list">
     <li><a href="../../nodestore/document/mongo-document-store.html" title="MongoDB DocumentStore"><span class="none"></span>MongoDB DocumentStore</a>  </li>
+    <li><a href="../../nodestore/document/rdb-document-store.html" title="RDB DocumentStore"><span class="none"></span>RDB DocumentStore</a>  </li>
     <li><a href="../../nodestore/document/node-bundling.html" title="Node Bundling"><span class="none"></span>Node Bundling</a>  </li>
     <li><a href="../../nodestore/document/secondary-store.html" title="Secondary Store"><span class="none"></span>Secondary Store</a>  </li>
     <li><a href="../../nodestore/persistent-cache.html" title="Persistent Cache"><span class="none"></span>Persistent Cache</a>  </li>
@@ -308,7 +309,7 @@
 <p>See <a href="classes.html">Design of Oak Segment Tar</a> for a high level design overview of Oak Segment Tar.</p></div>
 <div class="section">
 <h2><a name="Garbage_Collection"></a><a name="garbage-collection"></a> Garbage Collection</h2>
-<p>Garbage Collection is the set of processes and techniques employed by Oak Segment Tar to eliminate unused persisted data, thus limiting the memory and disk footprint of the system. Most of the operations on repository data generate a certain amount of garbage. This garbage is a byproduct of the repository operations and consists of leftover data that is not usable by the user. If left unchecked, this garbage would just pile up, consume disk space and pollute in-memory data structures. To avoid this, Oak Segment Tar defines garbage collection procedures to eliminate unnecessary data.</p>
+<p>Garbage Collection is the set of processes and techniques employed by Oak Segment Tar to eliminate unused persisted data, thus limiting the memory and disk footprint of the system. Most of the operations on repository data generate a certain amount of garbage. This garbage is a byproduct of the repository operations and consists of leftover data that is not usable by the user. If left unchecked, this garbage would just pile up, consume disk space and pollute in-memory data structures. To avoid this, Oak Segment Tar defines garbage collection procedures to eliminate unnecessary data. The implementation of garbage collection in Oak evolved heavily between Oak 1.0 and Oak 1.8. See <a href="onrc-memoirs.html">Memoirs in Garbage Collection</a> for an historical account.</p>
 <div class="section">
 <h3><a name="Generational_Garbage_Collection"></a><a name="generational-garbage-collection"></a> Generational Garbage Collection</h3>
 <p>The process implemented by Oak Segment Tar to eliminate unnecessary data is a generational garbage collection algorithm. The idea behind this algorithm is that the system assigns a generation to every piece of data generated by the user. A generation is just a number that is monotonically increasing.</p>


