jackrabbit-commits mailing list archives

From mreut...@apache.org
Subject svn commit: r1835390 [14/23] - in /jackrabbit/site/live/oak/docs: ./ architecture/ coldstandby/ features/ nodestore/ nodestore/document/ nodestore/segment/ oak-mongo-js/ oak_api/ plugins/ query/ security/ security/accesscontrol/ security/authentication...
Date Mon, 09 Jul 2018 08:53:19 GMT
Modified: jackrabbit/site/live/oak/docs/query/query-engine.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/query/query-engine.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/query/query-engine.html (original)
+++ jackrabbit/site/live/oak/docs/query/query-engine.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; The Query Engine</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -245,72 +245,52 @@
 <!--
 TOC created with some help:
 grep "^#.*$" src/site/markdown/query/query-engine.md | sed 's/#/    /g' | sed 's/\(#*\) \(.*\)/\1 * [\2](#\2)/g'
---><div class="section">
+-->
+<div class="section">
 <h2><a name="The_Query_Engine"></a>The Query Engine</h2>
-
 <ul>
-  
+
 <li><a href="#Overview">Overview</a>
-  
 <ul>
-    
+
 <li><a href="#Query_Processing">Query Processing</a>
-    
 <ul>
-      
+
 <li><a href="#Cost_Calculation">Cost Calculation</a></li>
-    </ul></li>
-    
+</ul>
+</li>
 <li><a href="#Query_Options">Query Options</a>
-    
 <ul>
-      
+
 <li><a href="#Query_Option_Traversal">Query Option Traversal</a></li>
-      
 <li><a href="#Query_Option_Index_Tag">Query Option Index Tag</a></li>
-    </ul></li>
-    
+</ul>
+</li>
 <li><a href="#Compatibility">Compatibility</a>
-    
 <ul>
-      
+
 <li><a href="#Result_Size">Result Size</a></li>
-      
 <li><a href="#Quoting">Quoting</a></li>
-      
 <li><a href="#Equality_for_Path_Constraints">Equality for Path Constraints</a></li>
-    </ul></li>
-    
+</ul>
+</li>
 <li><a href="#Slow_Queries_and_Read_Limits">Slow Queries and Read Limits</a></li>
-    
 <li><a href="#Full-Text_Queries">Full-Text Queries</a></li>
-    
 <li><a href="#Excerpts_and_Highlighting">Excerpts and Highlighting</a></li>
-    
 <li><a href="#Native_Queries">Native Queries</a></li>
-    
 <li><a href="#Similarity_Queries">Similarity Queries</a></li>
-    
 <li><a href="#Spellchecking">Spellchecking</a></li>
-    
 <li><a href="#Suggestions">Suggestions</a></li>
-    
 <li><a href="#Facets">Facets</a></li>
-    
 <li><a href="#XPath_to_SQL-2_Transformation">XPath to SQL-2 Transformation</a></li>
-    
 <li><a href="#The_Node_Type_Index">The Node Type Index</a></li>
-    
 <li><a href="#Temporarily_Disabling_an_Index">Temporarily Disabling an Index</a></li>
-    
 <li><a href="#The_Deprecated_Ordered_Index">The Deprecated Ordered Index</a></li>
-    
 <li><a href="#Index_Storage_and_Manual_Inspection">Index Storage and Manual Inspection</a></li>
-    
 <li><a href="#SQL-2_Optimisation">SQL-2 Optimisation</a></li>
-    
 <li><a href="#Additional_XPath_and_SQL-2_Features">Additional XPath and SQL-2 Features</a></li>
-  </ul></li>
+</ul>
+</li>
 </ul></div>
 <div class="section">
 <h2><a name="Overview"></a>Overview</h2>
@@ -320,26 +300,24 @@ grep "^#.*$" src/site/markdown/query/que
 <h3><a name="Query_Processing"></a>Query Processing</h3>
 <p>Internally, the query engine uses a cost-based query optimizer that asks all the available query indexes for the estimated cost to process the query. It then uses the index with the lowest cost.</p>
 <p>By default, the following indexes are available:</p>
-
 <ul>
-  
+
 <li>A property index for each indexed property.</li>
-  
 <li>A full-text index which is based on Apache Lucene / Solr.</li>
-  
-<li>A node type index (which is based on an property index for the properties  jcr:primaryType and jcr:mixins).</li>
-  
+<li>A node type index (which is based on a property index for the properties jcr:primaryType and jcr:mixins).</li>
 <li>A traversal index that iterates over a subtree.</li>
 </ul>
 <p>If no index can efficiently process the filter condition, the nodes in the repository are traversed at the given subtree.</p>
 <p>Usually, data is read from the index and repository while traversing over the query result. There are exceptions however, where all data is read in memory when the query is executed. The most common case is when using an <tt>order by</tt> clause and the index cannot provide a sorted result. There are other cases where paths of the results read so far are kept in memory, in order to not return duplicate results. This is the case when using <tt>or</tt> conditions such that two indexes are used (internally a <tt>union</tt> query is executed).</p>
 <p>If you enable debug logging for the module <tt>org.apache.jackrabbit.oak.query</tt>, you may see this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">cost for nodeType is 1638354.0
+<div>
+<div>
+<pre class="source">cost for nodeType is 1638354.0
 cost for property is 2.0
 cost for traverse is 3451100.0
 </pre></div></div>
+
 <p>This means the cost for the nodetype index <i>would</i> be about 1638354.0, the cost for the property index <i>would</i> be about 2, and the cost for traversal <i>would</i> be about 3451100.0. An index that can&#x2019;t deal with a certain condition will return the cost &#x201c;Infinity&#x201d;. It doesn&#x2019;t say traversal is actually used, it just lists the expected costs. The query engine will then pick the index with the lowest expected cost, which is (in the case above) &#x201c;property&#x201d;.</p>
 <p>The expected cost for traversal is, with Oak 1.0.x, really just a guess looking at the length of the path. With Oak 1.2 and newer, the &#x201c;counter&#x201d; index is used (see mainly OAK-1907). There is a known issue with this: if you add and remove a lot of nodes in a loop, you could end up with a too-low cost; see OAK-4065.</p>
 <div class="section">
@@ -354,112 +332,125 @@ cost for traverse is 3451100.0
 <h4><a name="Query_Option_Traversal"></a>Query Option Traversal</h4>
 <p>By default, queries without an index log an info-level message as follows (see OAK-4888, since Oak 1.6):</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">Traversal query (query without index): {statement}; consider creating an index
+<div>
+<div>
+<pre class="source">Traversal query (query without index): {statement}; consider creating an index
 </pre></div></div>
+
 <p>This message is only logged if no index is available, and if the query potentially traverses many nodes. No message is logged if an index is available, but traversing is cheap.</p>
 <p>By setting the JMX configuration <tt>QueryEngineSettings.failTraversal</tt> to true, queries without index throw an exception instead of just logging a message.</p>
-<p>In the query itself, the syntax <tt>option(traversal {ok|fail|warn})</tt> is supported (at the very end of the statement, after <tt>order by</tt>). This is to override the default setting, for queries that traverse a well known number of nodes (for example 10 or 20 nodes). This is supported for both XPath and SQL-2, as follows: </p>
+<p>In the query itself, the syntax <tt>option(traversal {ok|fail|warn})</tt> is supported (at the very end of the statement, after <tt>order by</tt>). This is to override the default setting, for queries that traverse a well known number of nodes (for example 10 or 20 nodes). This is supported for both XPath and SQL-2, as follows:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/oak:index/*[@type='lucene'] option(traversal ok)
+<div>
+<div>
+<pre class="source">/jcr:root/oak:index/*[@type='lucene'] option(traversal ok)
 
 select * from [nt:base] 
 where ischildnode('/oak:index') 
 order by name()
 option(traversal ok)
-</pre></div></div></div>
+</pre></div></div>
+</div>
 <div class="section">
 <h4><a name="Query_Option_Index_Tag"></a>Query Option Index Tag</h4>
 <p><tt>@since Oak 1.7.4 (OAK-937)</tt></p>
 <p>By default, queries will use the index with the lowest expected cost (as in relational databases). But in rare cases, it is necessary to specify which index(es) should be considered for a query:</p>
-
 <ul>
-  
-<li>If there are multiple Lucene fulltext indexes with different aggregation rules,  and the index data overlaps.  In the query, you want to specify which aggregation rule to use.</li>
-  
-<li>To temporarily work around limitations of the index implementation,  for example incorrect cost estimation.</li>
+
+<li>If there are multiple Lucene fulltext indexes with different aggregation rules, and the index data overlaps. In the query, you want to specify which aggregation rule to use.</li>
+<li>To temporarily work around limitations of the index implementation, for example incorrect cost estimation.</li>
 </ul>
 <p>Using index tags should be the exception, and should only be used temporarily. To use index tags, add <tt>tags</tt> (a multi-valued String property) to the index(es) of choice, for example:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/oak:index/lucene/tags = [x, y]
+<div>
+<div>
+<pre class="source">/oak:index/lucene/tags = [x, y]
 </pre></div></div>
+
 <p>Note each index can have multiple tags, and the same tag can be used in multiple indexes. The syntax to limit a query to a certain tag is: <tt>&lt;query&gt; option(index tag &lt;tagName&gt;)</tt>:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content//element(*, nt:file)[jcr:contains(., 'test')] option(index tag x)
+<div>
+<div>
+<pre class="source">/jcr:root/content//element(*, nt:file)[jcr:contains(., 'test')] option(index tag x)
 
 select * from [nt:file] 
 where ischildnode('/content')
 and contains(*, 'test')
 option(index tag [x])
 </pre></div></div>
+
 <p>The query will only consider the indexes that contain the specified tag (that is, possibly multiple indexes). Each query supports one tag only. The tag name may only contain the characters <tt>a-z, A-Z, 0-9, _</tt>.</p>
 <p>Limitations:</p>
-
 <ul>
-  
+
 <li>This is currently supported in indexes of type <tt>lucene</tt> compatVersion 2, and type <tt>property</tt>.</li>
-  
-<li>For indexes of type <tt>lucene</tt>, when adding adding or changing the property <tt>tags</tt>,  you need to also set the property <tt>refresh</tt> to <tt>true</tt> (Boolean),  so that the change is applied. No indexing is required.</li>
-  
-<li>Solr indexes, and the reference index, don&#x2019;t support tags yet.  That means they still might return a low cost, even if the tag does not match.</li>
-  
-<li>The nodetype index only partially supports this feature: if a tag is specified in the query, then the nodetype index  is not used. However, tags in the nodetype index itself are ignored currently.</li>
-  
-<li>There is currently no way to disable traversal that way.  So if the expected cost of traversal is very low, the query will traverse.  Note that traversal is never used for fulltext queries.</li>
+<li>For indexes of type <tt>lucene</tt>, when adding or changing the property <tt>tags</tt>, you need to also set the property <tt>refresh</tt> to <tt>true</tt> (Boolean), so that the change is applied. No indexing is required.</li>
+<li>Solr indexes, and the reference index, don&#x2019;t support tags yet. That means they still might return a low cost, even if the tag does not match.</li>
+<li>The nodetype index only partially supports this feature: if a tag is specified in the query, then the nodetype index is not used. However, tags in the nodetype index itself are ignored currently.</li>
+<li>There is currently no way to disable traversal that way. So if the expected cost of traversal is very low, the query will traverse. Note that traversal is never used for fulltext queries.</li>
 </ul></div></div>
 <div class="section">
 <h3><a name="Compatibility"></a>Compatibility</h3>
 <div class="section">
 <h4><a name="Result_Size"></a>Result Size</h4>
 <p>For NodeIterator.getSize(), some versions of Jackrabbit 2.x returned the estimated (raw) Lucene result set size, including nodes that are not accessible.</p>
-<p>By default, Oak does not do this; it either returns the correct result size, or -1. Oak 1.2.x and newer supports a compatibility flag so that it works in a similar way as Jackrabbit 2.x, by returning an estimate (see OAK-2926). Specially, only query restrictions that are part of the used index are considered when calculating the size. Additionally, ACLs are not applied to the results, so nodes which are not visible to the current session will still be included in the count returned. As such, the count returned can be higher than the actual number of results and the accurate count can only be determined by iterating through the results.</p>
+<p>By default, Oak does not do this; it either returns the correct result size, or -1. Oak 1.2.x and newer supports a compatibility flag so that it works in a similar way as Jackrabbit 2.x, by returning an estimate (see OAK-2926). Specifically, only query restrictions that are part of the used index are considered when calculating the size. Additionally, ACLs are not applied to the results, so nodes which are not visible to the current session will still be included in the count returned. As such, the count returned can be higher than the actual number of results and the accurate count can only be determined by iterating through the results.</p>
 <p>This only works with the Lucene <tt>compatVersion=2</tt> right now, so even if enabled, getSize may still return -1 if the index used does not support the feature. Example code to show how this works (where <tt>test</tt> is a common word in the index):</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">String query = &quot;//element(*, cq:Page)[jcr:contains(., 'test')]&quot;;
+<div>
+<div>
+<pre class="source">String qs = &quot;//element(*, cq:Page)[jcr:contains(., 'test')]&quot;;
 Query query = queryManager.createQuery(qs, &quot;xpath&quot;);
 QueryResult result = query.execute();
 long size = result.getRows().getSize();
 </pre></div></div>
+
 <p>This is best configured via OSGi configuration (since Oak 1.6.x), or as described in OAK-2977, since Oak 1.3.x: When using Apache Sling, add the following line to the file <tt>conf/sling.properties</tt>, and then restart the application:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">oak.query.fastResultSize=true
-</pre></div></div></div>
+<div>
+<div>
+<pre class="source">oak.query.fastResultSize=true
+</pre></div></div>
+</div>
 <div class="section">
 <h4><a name="Quoting"></a>Quoting</h4>
 <p><a class="externalLink" href="https://wiki.apache.org/jackrabbit/EncodingAndEscaping">Special characters in queries need to be escaped.</a></p>
 <p>However, compared to Jackrabbit 2.x, the query parser is now generally more strict about invalid syntax. The following query used to work in Jackrabbit 2.x, but not in Oak, because multiple ways to quote the path are used at the same time:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">SELECT * FROM [nt:base] AS s 
+<div>
+<div>
+<pre class="source">SELECT * FROM [nt:base] AS s 
 WHERE ISDESCENDANTNODE(s, [&quot;/libs/sling/config&quot;])
 </pre></div></div>
+
 <p>Instead, the query now needs to be:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">SELECT * FROM [nt:base] AS s 
+<div>
+<div>
+<pre class="source">SELECT * FROM [nt:base] AS s 
 WHERE ISDESCENDANTNODE(s, [/libs/sling/config])
-</pre></div></div></div>
+</pre></div></div>
+</div>
 <div class="section">
 <h4><a name="Equality_for_Path_Constraints"></a>Equality for Path Constraints</h4>
 <p>In Jackrabbit 2.x, the following condition was interpreted as a LIKE condition:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">SELECT * FROM nt:base WHERE jcr:path = '/abc/%'
+<div>
+<div>
+<pre class="source">SELECT * FROM nt:base WHERE jcr:path = '/abc/%'
 </pre></div></div>
+
 <p>Therefore, the query behaved exactly the same as if LIKE was used. In Oak, this is no longer the case, and such queries search for an exact path match.</p></div></div>
 <div class="section">
 <h3><a name="Slow_Queries_and_Read_Limits"></a>Slow Queries and Read Limits</h3>
 <p>Slow queries are logged as follows:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">*WARN* Traversed 10000 nodes with filter Filter(query=select ...)
+<div>
+<div>
+<pre class="source">*WARN* Traversed 10000 nodes with filter Filter(query=select ...)
 consider creating an index or changing the query
 </pre></div></div>
+
 <p>If this is the case, an index might need to be created, or the condition of the query might need to be changed to take advantage of an existing index.</p>
 <p>Queries that traverse many nodes, or that read many nodes in memory, can be cancelled. The limits can be set at runtime (also while a slow query is running) using JMX, domain &#x201c;org.apache.jackrabbit.oak&#x201d;, type &#x201c;QueryEngineSettings&#x201d;, attribute names &#x201c;LimitInMemory&#x201d; and &#x201c;LimitReads&#x201d;. These settings are not persisted, so after the next restart, the default values (unlimited) are used. As a workaround, these limits can be changed using the system properties &#x201c;oak.queryLimitInMemory&#x201d; and &#x201c;oak.queryLimitReads&#x201d;. Queries that exceed one of the limits are cancelled with an UnsupportedOperationException saying that &#x201c;The query read more than x nodes&#x2026; To avoid running out of memory, processing was stopped.&#x201d;</p>
 <p>&#x201c;LimitReads&#x201d; applies to the number of nodes read by a query. It applies whether or not an index is used. As an example, if a query has just two conditions, as in <tt>a=1 and b=2</tt>, and if there is an index on <tt>a</tt>, then all nodes with <tt>a=1</tt> need to be read while traversing the result. If more nodes are read than the set limit, then an exception is thrown. If the query also has a path condition (for example descendants of <tt>/home</tt>), and if the index supports path conditions (which is the case for all property indexes, and also for Lucene indexes if <tt>evaluatePathRestrictions</tt> is set), then only nodes in the given subtree are read.</p>
@@ -470,8 +461,9 @@ consider creating an index or changing t
 <p>By default (that is, using a Lucene index with <tt>compatVersion</tt> 2), Jackrabbit Oak uses the <a class="externalLink" href="https://lucene.apache.org/core/4_7_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html">Apache Lucene grammar for fulltext search</a>. <a class="externalLink" href="https://wiki.apache.org/jackrabbit/EncodingAndEscaping">See also how to escape queries.</a></p>
 <p>For older Lucene indexes (<tt>compatVersion</tt> 1), the following syntax is supported within <tt>contains</tt> queries. This is a subset of the Apache Lucene syntax:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">FullTextSearch ::= Or
+<div>
+<div>
+<pre class="source">FullTextSearch ::= Or
 Or ::= And { ' OR ' And }* 
 And ::= Term { ' ' Term }*
 Term ::= ['-'] { SimpleTerm | PhraseTerm } [ '^' Boost ]
@@ -479,13 +471,16 @@ SimpleTerm ::= Word
 PhraseTerm ::= '&quot;' Word { ' ' Word }* '&quot;'
 Boost ::= &lt;number&gt;
 </pre></div></div>
+
 <p>Please note that <tt>OR</tt> needs to be written in uppercase. Characters within words can be escaped using a backslash.</p>
 <p>Examples:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">jcr:contains(., 'jelly sandwich^4')
+<div>
+<div>
+<pre class="source">jcr:contains(., 'jelly sandwich^4')
 jcr:contains(@jcr:title, 'find this')
 </pre></div></div>
+
 <p>In the first example, the word &#x201c;sandwich&#x201d; has four times the weight of the word &#x201c;jelly.&#x201d; For details of boosting, see the Apache Lucene documentation about Score Boosting.</p>
 <p>For compatibility with Jackrabbit 2.x, single quoted phrase queries are currently supported. That means the query <tt>contains(., &quot;word ''hello world'' word&quot;)</tt> is supported. New applications should not rely on this feature.</p></div>
 <div class="section">
@@ -493,36 +488,45 @@ jcr:contains(@jcr:title, 'find this')
 <p>The Lucene index can be configured to provide excerpts and highlighting. See <a href="lucene.html#Property_Definitions">useInExcerpt</a> for details on how to configure excerpt generation.</p>
 <p>For queries to use those excerpts, the query needs to use the Lucene index where this is configured. The query also needs to contain the &#x201c;excerpt&#x201d; property, as follows:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content//*[jcr:contains(., 'test')]/(rep:excerpt(.))
+<div>
+<div>
+<pre class="source">/jcr:root/content//*[jcr:contains(., 'test')]/(rep:excerpt(.))
 </pre></div></div>
+
 <p>The excerpt is then read using the JCR API call <tt>row.getValue(&quot;rep:excerpt(.)&quot;)</tt>.</p>
 <p>Since Oak version 1.10 (OAK-7151), optionally a property name can be specified in the query:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content//*[jcr:contains(., 'test')]/(rep:excerpt(@jcr:title) | rep:excerpt(.))
+<div>
+<div>
+<pre class="source">/jcr:root/content//*[jcr:contains(., 'test')]/(rep:excerpt(@jcr:title) | rep:excerpt(.))
 </pre></div></div>
+
 <p>The excerpt for the title is then read using <tt>row.getValue(&quot;rep:excerpt(@jcr:title)&quot;)</tt>, and the excerpt for the node using (as before) <tt>row.getValue(&quot;rep:excerpt(.)&quot;)</tt>.</p>
 <div class="section">
 <h4><a name="SimpleExcerptProvider"></a>SimpleExcerptProvider</h4>
 <p>The SimpleExcerptProvider is a fallback mechanism for excerpts and highlighting. This mechanism has limitations, and should only be used if really needed. The SimpleExcerptProvider is independent of the index configuration. Highlighting is limited, for example stopwords are ignored. Highlighting is case insensitive since Oak versions 1.2.30, 1.4.22, 1.6.12, 1.8.3, and 1.10 (OAK-7437).</p>
 <p>The SimpleExcerptProvider is used when reading an excerpt if the query doesn&#x2019;t contain an excerpt property, as in:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content//*[jcr:contains(., 'test')]
+<div>
+<div>
+<pre class="source">/jcr:root/content//*[jcr:contains(., 'test')]
 </pre></div></div>
+
 <p>The SimpleExcerptProvider is also used if an excerpt is requested for a property that is not specified in the query. For example, when using <tt>row.getValue(&quot;rep:excerpt(@title)&quot;)</tt>, but the query does not contain this property as an excerpt property, as in:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content//*[jcr:contains(., 'test')]/(rep:excerpt(.))
+<div>
+<div>
+<pre class="source">/jcr:root/content//*[jcr:contains(., 'test')]/(rep:excerpt(.))
 </pre></div></div>
+
 <p>The SimpleExcerptProvider is also used for queries that don&#x2019;t use a Lucene index, or if the query uses a Lucene index, but excerpts are not configured there.</p></div></div>
 <div class="section">
 <h3><a name="Native_Queries"></a>Native Queries</h3>
 <p>To take advantage of features that are available in full-text index implementations such as Apache Lucene and Apache Lucene Solr, so called <tt>native</tt> constraints are supported. Such constraints are passed directly to the full-text index. This is supported for both XPath and SQL-2. For XPath queries, the name of the function is <tt>rep:native</tt>, and for SQL-2, it is <tt>native</tt>. The first parameter is the index type (currently supported are <tt>solr</tt> and <tt>lucene</tt>). The second parameter is the native search query expression. For SQL-2, the selector name (if needed) is the first parameter, just before the language. Examples:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">//*[rep:native('solr', 'name:(Hello OR World)')]
+<div>
+<div>
+<pre class="source">//*[rep:native('solr', 'name:(Hello OR World)')]
 
 select [jcr:path] from [nt:base] 
 where native('solr', 'name:(Hello OR World)')
@@ -530,35 +534,43 @@ where native('solr', 'name:(Hello OR Wor
 select [jcr:path] from [nt:base] as a 
 where native(a, 'solr', 'name:(Hello OR World)')
 </pre></div></div>
+
 <p>This also allows using the Solr <a class="externalLink" href="http://wiki.apache.org/solr/MoreLikeThis">MoreLikeThis</a> feature. An example query is:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">select [jcr:path] from [nt:base] 
+<div>
+<div>
+<pre class="source">select [jcr:path] from [nt:base] 
 where native('solr', 'mlt?q=id:UTF8TEST&amp;mlt.fl=manu,cat&amp;mlt.mindf=1&amp;mlt.mintf=1')
 </pre></div></div>
+
 <p>If no full-text implementation is available, those queries will fail.</p></div>
 <div class="section">
 <h3><a name="Similarity_Queries"></a>Similarity Queries</h3>
 <p>Oak supports similarity queries when using the Lucene or Solr indexes. For example, the following query will return nodes that have content similar to the node /test/a:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">//element(*, nt:base)[rep:similar(., '/test/a')]
+<div>
+<div>
+<pre class="source">//element(*, nt:base)[rep:similar(., '/test/a')]
 </pre></div></div>
+
 <p>Compared to Jackrabbit 2.x, support for rep:similar has the following limitations: Full-text aggregation is not currently supported.</p></div>
 <div class="section">
 <h3><a name="Spellchecking"></a>Spellchecking</h3>
 <p><tt>@since Oak 1.1.17, 1.0.13</tt></p>
 <p>Oak supports spellcheck queries when using the Lucene or Solr indexes. Unlike most queries, spellcheck queries won&#x2019;t return a JCR <tt>Node</tt> as the outcome of such queries will be text terms that come from content as written into JCR <tt>properties</tt>. For example, the following query will return spellchecks for the (wrongly spelled) term <tt>helo</tt>:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root[rep:spellcheck('helo')]/(rep:spellcheck())
+<div>
+<div>
+<pre class="source">/jcr:root[rep:spellcheck('helo')]/(rep:spellcheck())
 </pre></div></div>
+
 <p>The result of such a query will be a JCR <tt>Row</tt> which will contain the corrected terms, as spellchecked by the used underlying index, in a special property named <tt>rep:spellcheck()</tt>.</p>
 <p>Clients wanting to obtain spellchecks could use the following JCR code:</p>
 <p><tt>@until Oak 1.3.10, 1.2.13</tt> spellchecks are returned flat.</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">QueryManager qm = ...;
+<div>
+<div>
+<pre class="source">QueryManager qm = ...;
 String xpath = &quot;/jcr:root[rep:spellcheck('helo')]/(rep:spellcheck())&quot;;
 QueryResult result = qm.createQuery(xpath, Query.XPATH).execute();
 RowIterator it = result.getRows();
@@ -567,15 +579,19 @@ if (it.hasNext()) {
     spellchecks = row.getValue(&quot;rep:spellcheck()&quot;).getString();
 }
 </pre></div></div>
+
 <p>The <tt>spellchecks</tt> String would have the following pattern <tt>\[[\w|\W]+(\,\s[\w|\W]+)*\]</tt>, e.g.:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">[hello, hold]
+<div>
+<div>
+<pre class="source">[hello, hold]
 </pre></div></div>
+
 <p><tt>@since Oak 1.3.11, 1.2.14</tt> each spellcheck would be returned per row.</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">QueryManager qm = ...;
+<div>
+<div>
+<pre class="source">QueryManager qm = ...;
 String xpath = &quot;/jcr:root[rep:spellcheck('helo')]/(rep:spellcheck())&quot;;
 QueryResult result = qm.createQuery(xpath, Query.XPATH).execute();
 RowIterator it = result.getRows();
@@ -584,6 +600,7 @@ while (it.hasNext()) {
     spellchecks.add(row.getValue(&quot;rep:spellcheck()&quot;).getString());        
 }
 </pre></div></div>
+
 <p>Spellcheck queries require that either Lucene or Solr is configured to provide the spellcheck feature; see <a href="lucene.html#Spellchecking">Enable spellchecking in Lucene</a> and <a href="solr.html#Spellchecking">Enable spellchecking in Solr</a>.</p>
 <p>Note that spellcheck terms come already filtered according to calling user privileges, so that users could see spellcheck corrections only coming from indexed content they are allowed to read.</p></div>
 <div class="section">
@@ -591,15 +608,18 @@ while (it.hasNext()) {
 <p><tt>@since Oak 1.1.17, 1.0.15</tt></p>
 <p>Oak supports search suggestions when using the Lucene or Solr indexes. Unlike most queries, suggest queries won&#x2019;t return a JCR <tt>Node</tt> as the outcome of such queries will be text terms that come from content as written into JCR <tt>properties</tt>. For example, the following query will return search suggestions for the (e.g. user entered) term <tt>in</tt>:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root[rep:suggest('in ')]/(rep:suggest())
+<div>
+<div>
+<pre class="source">/jcr:root[rep:suggest('in ')]/(rep:suggest())
 </pre></div></div>
+
 <p>The result of such a query will be a JCR <tt>Row</tt> which will contain the suggested terms, together with their score, as suggested and scored by the used underlying index, in a special property named <tt>rep:suggest()</tt>.</p>
 <p>Clients wanting to obtain suggestions could use the following JCR code:</p>
 <p><tt>@until Oak 1.3.10, 1.2.13</tt> suggestions are returned flat.</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">QueryManager qm = ...;
+<div>
+<div>
+<pre class="source">QueryManager qm = ...;
 String xpath = &quot;/jcr:root[rep:suggest('in ')]/(rep:suggest())&quot;;
 QueryResult result = qm.createQuery(xpath, Query.XPATH).execute();
 RowIterator it = result.getRows();
@@ -608,16 +628,20 @@ if (it.hasNext()) {
     suggestions = row.getValue(&quot;rep:suggest()&quot;).getString();
 }
 </pre></div></div>
+
 <p>The <tt>suggestions</tt> String would have the following pattern <tt>\[\{(term\=)[\w|\W]+(\,weight\=)\d+\}(\,\{(term\=)[\w|\W]+(\,weight\=)\d+\})*\]</tt>, e.g.:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">[{term=in 2015 a red fox is still a fox,weight=1.5}, {term=in 2015 my fox is red, 
+<div>
+<div>
+<pre class="source">[{term=in 2015 a red fox is still a fox,weight=1.5}, {term=in 2015 my fox is red, 
 like mike's fox and john's fox,weight=0.7}]
 </pre></div></div>
+
 <p><tt>@since Oak 1.3.11, 1.2.14</tt> each suggestion would be returned per row.</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">QueryManager qm = ...;
+<div>
+<div>
+<pre class="source">QueryManager qm = ...;
 String xpath = &quot;/jcr:root[rep:suggest('in ')]/(rep:suggest())&quot;;
 QueryResult result = qm.createQuery(xpath, Query.XPATH).execute();
 RowIterator it = result.getRows();
@@ -626,13 +650,15 @@ while (it.hasNext()) {
     suggestions.add(row.getValue(&quot;rep:suggest()&quot;).getString());        
 }
 </pre></div></div>
+
 <p>Suggestions are only available if Lucene or Solr is configured to provide them; see <a href="lucene.html#Suggestions">Enable suggestions in Lucene</a> and <a href="solr.html#Suggestions">Enable suggestions in Solr</a>. Note that suggested terms are already filtered according to the calling user&#x2019;s privileges, so users only see suggested terms that come from indexed content they are allowed to read.</p></div>
 <div class="section">
 <h3><a name="Facets"></a>Facets</h3>
 <p><tt>@since Oak 1.3.14</tt> Oak has support for <a class="externalLink" href="https://en.wikipedia.org/wiki/Faceted_search">facets</a>. Once enabled (see details for <a href="lucene.html#Facets">Lucene</a> and/or <a href="solr.html#Suggestions">Solr</a> indexes) facets can be retrieved on properties (backed by a proper field in Lucene / Solr) using the following snippet:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">String sql2 = &quot;select [jcr:path], [rep:facet(tags)] from [nt:base] &quot; +
+<div>
+<div>
+<pre class="source">String sql2 = &quot;select [jcr:path], [rep:facet(tags)] from [nt:base] &quot; +
                 &quot;where contains([jcr:title], 'oak')&quot;;
 Query q = qm.createQuery(sql2, Query.JCR_SQL2);
 QueryResult result = q.execute();
@@ -645,14 +671,16 @@ for (FacetResult.Facet facet : facets) {
     ...
 }
 </pre></div></div>
+
 <p>Nodes/Rows can still be retrieved from within the QueryResult object the usual way.</p></div>
 <div class="section">
 <h3><a name="XPath_to_SQL-2_Transformation"></a>XPath to SQL-2 Transformation</h3>
-<p>To support the XPath query language, such queries are internally converted to SQL-2. </p>
+<p>To support the XPath query language, such queries are internally converted to SQL-2.</p>
 <p>Every conversion is logged in <tt>debug</tt> level under the <tt>org.apache.jackrabbit.oak.query.QueryEngineImpl</tt> logger:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">org.apache.jackrabbit.oak.query.QueryEngineImpl Parsing xpath statement: 
+<div>
+<div>
+<pre class="source">org.apache.jackrabbit.oak.query.QueryEngineImpl Parsing xpath statement: 
     //element(*)[@sling:resourceType = 'slingevent:Lock']
 org.apache.jackrabbit.oak.query.QueryEngineImpl XPath &gt; SQL2: 
     select [jcr:path], [jcr:score], * from [nt:base] as a 
@@ -660,6 +688,7 @@ org.apache.jackrabbit.oak.query.QueryEng
     /* xpath: //element(*)[@sling:resourceType = 'slingevent:Lock' 
     and @lock.created &lt; xs:dateTime('2013-09-02T15:44:05.920+02:00')] */
 </pre></div></div>
+
 <p><i>Each transformed SQL-2 query contains the original XPath query as a comment.</i></p>
 <p>When converting from XPath to SQL-2, <tt>or</tt> conditions are automatically converted to <tt>union</tt> queries, so that indexes can be used for conditions of the form <tt>a = 'x' or b = 'y'</tt>.</p></div>
 <div class="section">
@@ -680,26 +709,32 @@ org.apache.jackrabbit.oak.query.QueryEng
 <div class="section">
 <h3><a name="SQL-2_Optimisation"></a>SQL-2 Optimisation</h3>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">@since 1.3.9 with -Doak.query.sql2optimisation
+<div>
+<div>
+<pre class="source">@since 1.3.9 with -Doak.query.sql2optimisation
 </pre></div></div>
+
 <p>Enabled by default in 1.3.11, it performs a round of optimisation on the <tt>Query</tt> object obtained after parsing a SQL-2 statement. For example, it will attempt to convert OR conditions into UNION (<a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-1617">OAK-1617</a>).</p>
 <p>To disable it, provide <tt>-Doak.query.sql2optimisation=false</tt> at start-up.</p></div>
 <div class="section">
 <h3><a name="Additional_XPath_and_SQL-2_Features"></a>Additional XPath and SQL-2 Features</h3>
 <p>The Oak implementation supports some features that are not part of the JCR specification:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">@since 1.5.12
+<div>
+<div>
+<pre class="source">@since 1.5.12
 </pre></div></div>
+
 <p>Union for XPath and SQL-2 queries. Examples:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/(content|lib)/*
+<div>
+<div>
+<pre class="source">/jcr:root/(content|lib)/*
 (/jcr:root/content//*[@a] | /jcr:root/lib//*[@b]) order by @c
 select * from [nt:base] as a where issamenode(a, '/content') 
 union select * from [nt:base] as a where issamenode(a, '/lib')
 </pre></div></div>
+
 <p>XPath functions &#x201c;fn:string-length&#x201d; and &#x201c;fn:local-name&#x201d;.</p></div></div>
         </div>
       </div>

Modified: jackrabbit/site/live/oak/docs/query/query-troubleshooting.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/query/query-troubleshooting.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/query/query-troubleshooting.html (original)
+++ jackrabbit/site/live/oak/docs/query/query-troubleshooting.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Query Troubleshooting</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -241,83 +241,98 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-  --><div class="section">
+  -->
+<div class="section">
 <h2><a name="Query_Troubleshooting"></a>Query Troubleshooting</h2>
 <div class="section">
 <h3><a name="Slow_Queries"></a>Slow Queries</h3>
 <p>The first step in query troubleshooting is often to detect that a query is slow, or traverses many nodes. Queries that traverse many nodes are logged as follows:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">*WARN* org.apache.jackrabbit.oak.plugins.index.Cursors$TraversingCursor 
+<div>
+<div>
+<pre class="source">*WARN* org.apache.jackrabbit.oak.plugins.index.Cursors$TraversingCursor 
     Traversed 22000 nodes with filter Filter(query=
     select * from [nt:base] where isdescendantnode('/etc') and lower([jcr:title]) like '%coat%');
     consider creating an index or changing the query
 </pre></div></div>
+
 <p>To get good performance, queries should not traverse more than about 1000 nodes (especially queries that are run often).</p>
 <div class="section">
 <h4><a name="Potentially_Slow_Queries"></a>Potentially Slow Queries</h4>
 <p>In addition to avoiding queries that traverse many nodes, it makes sense to avoid queries that don&#x2019;t use an index. Such queries might be fast (and only traverse few nodes) with a small repository, but with a large repository they are typically slow as well. Therefore, it makes sense to detect such queries as soon as possible (in a developer environment), even before the code that runs those queries is tested with a larger repository. Oak will detect such queries and log them as follows (with log level INFO for Oak 1.6.x, and WARN for Oak 1.8.x):</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">*INFO* org.apache.jackrabbit.oak.query.QueryImpl Traversal query (query without index): 
+<div>
+<div>
+<pre class="source">*INFO* org.apache.jackrabbit.oak.query.QueryImpl Traversal query (query without index): 
     select * from [nt:base] where isdescendantnode('/etc') and lower([jcr:title]) like '%coat%'; 
     consider creating an index
-</pre></div></div></div>
+</pre></div></div>
+</div>
 <div class="section">
 <h4><a name="Query_Plan"></a>Query Plan</h4>
 <p>To understand why the query is slow, the first step is commonly to get the query execution plan. To do this, the query can be executed using <tt>explain select ...</tt>. For the above case, the plan is:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">[nt:base] as [nt:base] /* traverse &quot;/etc//*&quot; 
+<div>
+<div>
+<pre class="source">[nt:base] as [nt:base] /* traverse &quot;/etc//*&quot; 
 where (isdescendantnode([nt:base], [/etc])) and (lower([nt:base].[jcr:title]) like '%coat%') */
 </pre></div></div>
+
 <p>That means, all nodes below <tt>/etc</tt> are traversed.</p></div>
 <div class="section">
 <h4><a name="Making_the_Query_More_Specific"></a>Making the Query More Specific</h4>
 <p>In order to make the query faster, try to add more constraints, or make constraints tighter. This will usually require some knowledge about the expected results. For example, if the path restriction is more specific, then fewer nodes need to be read. This is also true if an index is used. Also, if possible, use a more specific node type. To understand if a nodetype or mixin is indexed, consult the nodetype index at <tt>/oak:index/nodetype</tt>, property <tt>declaringNodeTypes</tt>. But even if this is not the case, the nodetype should be as specific as possible. Assuming the query is changed to this:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">select * from [acme:Product] 
+<div>
+<div>
+<pre class="source">select * from [acme:Product] 
 where isdescendantnode('/etc/commerce') 
 and lower([jcr:title]) like '%coat%'
 and [commerceType] = 'product'
 </pre></div></div>
+
 <p>The only <i>relevant</i> change was to improve the path restriction. But in this case, it already was enough to make the traversal warning go away.</p></div>
 <div class="section">
 <h4><a name="Queries_Without_Index"></a>Queries Without Index</h4>
 <p>After changing the query, there is still a message in the log file that complains the query doesn&#x2019;t use an index, as described above:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">*INFO* org.apache.jackrabbit.oak.query.QueryImpl 
+<div>
+<div>
+<pre class="source">*INFO* org.apache.jackrabbit.oak.query.QueryImpl 
     Traversal query (query without index): 
     select * from [acme:Product] where isdescendantnode('/etc/commerce') 
     and lower([jcr:title]) like '%coat%'
     and [commerceType] = 'product'; consider creating an index
 </pre></div></div>
+
 <p>The query plan didn&#x2019;t change, so nodes are still traversed. In this case, there are relatively few nodes because it&#x2019;s an almost empty development repository, so no traversal warning is logged. But in production, there might be many more nodes under <tt>/etc/commerce</tt>, so it makes sense to continue optimizing.</p></div>
 <div class="section">
 <h4><a name="Where_Traversal_is_OK"></a>Where Traversal is OK</h4>
 <p>If it is known from the data model that a query will never traverse many nodes, then no index is needed. This is a corner case, and only applies to queries that traverse a fixed number of (for example) configuration nodes, or if the number of descendant nodes is guaranteed to be very low by using a certain nodetype that only allows for a fixed number of child nodes. If this is the case, then the query can be changed to say traversal is fine. To mark such queries, append <tt>option(traversal ok)</tt> to the query. This feature should only be used for those rare corner cases.</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">select * from [nt:base] 
+<div>
+<div>
+<pre class="source">select * from [nt:base] 
 where isdescendantnode('/etc/commerce') 
 and lower([jcr:title]) like '%coat%'
 and [commerceType] = 'product'
 option(traversal ok)
-</pre></div></div></div>
+</pre></div></div>
+</div>
 <div class="section">
 <h4><a name="aEstimating_Node_Counts"></a>&#xa0;Estimating Node Counts</h4>
 <p>To find out how many nodes are below a certain path, you can use the JMX bean <tt>NodeCounter</tt>, which can estimate the node count. For example, running <tt>getEstimatedChildNodeCounts</tt> with <tt>p1=/</tt> and <tt>p2=2</tt> might give you:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/: 2522208,
+<div>
+<div>
+<pre class="source">/: 2522208,
 ...
 /etc: 1521504,
 /etc/commerce: 29216,
 /etc/images: 1231232,
 ...
 </pre></div></div>
+
 <p>So in this case, there are still many nodes below <tt>/etc/commerce</tt> in the production repository. Also note that the number of nodes can grow over time.</p></div>
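 <p>Assuming the estimates are available as plain text in the format shown above (one <tt>path: count</tt> entry per line), a small helper can list the paths whose estimated size exceeds a chosen threshold. This is only a sketch: <tt>NodeCountReport</tt> and <tt>pathsAbove</tt> are hypothetical names, and the text format is taken from the sample output rather than from the <tt>NodeCounter</tt> API:</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Oak: scans "path: count," lines.
final class NodeCountReport {
    /** Returns the paths whose estimated node count exceeds the threshold. */
    static List<String> pathsAbove(String estimates, long threshold) {
        List<String> paths = new ArrayList<>();
        for (String line : estimates.split("\\R")) {
            String entry = line.trim();
            if (entry.endsWith(",")) {
                entry = entry.substring(0, entry.length() - 1);
            }
            int sep = entry.lastIndexOf(": ");
            if (sep < 0) {
                continue; // skips "..." and blank lines
            }
            long count;
            try {
                count = Long.parseLong(entry.substring(sep + 2).trim());
            } catch (NumberFormatException e) {
                continue; // not a "path: count" entry
            }
            if (count > threshold) {
                paths.add(entry.substring(0, sep));
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        String sample = "/: 2522208,\n...\n/etc: 1521504,\n"
                + "/etc/commerce: 29216,\n/etc/images: 1231232,\n...";
        // Paths with more than 100000 estimated nodes.
        System.out.println(pathsAbove(sample, 100_000L));
    }
}
```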
 <div class="section">
 <h4><a name="aPrevent_Running_Traversal_Queries"></a>&#xa0;Prevent Running Traversal Queries</h4>
@@ -325,37 +340,31 @@ option(traversal ok)
 <div class="section">
 <h4><a name="Using_a_Different_or_New_Index"></a>Using a Different or New Index</h4>
 <p>There are multiple options:</p>
-
 <ul>
-  
-<li>Consider creating an index for <tt>jcr:title</tt>. But for <tt>like '%..%'</tt> conditions,  this is not of much help, because all nodes with that property will need to be read.  Also, using <tt>lower</tt> will make the index less effective.  So, this only makes sense if there are very few nodes with this property  expected to be in the system.</li>
-  
-<li>If there are very few nodes with that nodetype,  consider adding <tt>acme:Product</tt> to the nodetype index. This requires reindexing.  The query could then use the nodetype index, and within this nodetype,  just traverse below <tt>/etc/commerce</tt>.  The <tt>NodeCounter</tt> can also help understand how many <tt>acme:Product</tt>  nodes are in the repository, if this nodetype is indexed.  To find out, run <tt>getEstimatedChildNodeCounts</tt> with  <tt>p1=/oak:index/nodetype</tt> and <tt>p2=2</tt>.</li>
-  
-<li>If the query needs to return added nodes immediately (synchronously; that is without delay),  consider creating a <a href="./property-index.html">property index</a>.  Note that Lucene indexes are asynchronous, and new nodes may not  appear in the result for a few seconds.</li>
-  
-<li>To ensure there is only one node matching the result in the repository,  consider creating a unique <a href="./property-index.html">property index</a>.</li>
-  
-<li>Consider using a fulltext index, that is: change the query from using  <tt>lower([jcr:title]) like '%...%'</tt> to using <tt>contains([jcr:title], '...')</tt>.  Possibly combine this with adding the property  <tt>commerceType</tt> to the fulltext index.</li>
+
+<li>Consider creating an index for <tt>jcr:title</tt>. But for <tt>like '%..%'</tt> conditions, this is not of much help, because all nodes with that property will need to be read. Also, using <tt>lower</tt> will make the index less effective. So, this only makes sense if there are very few nodes with this property expected to be in the system.</li>
+<li>If there are very few nodes with that nodetype, consider adding <tt>acme:Product</tt> to the nodetype index. This requires reindexing. The query could then use the nodetype index, and within this nodetype, just traverse below <tt>/etc/commerce</tt>. The <tt>NodeCounter</tt> can also help understand how many <tt>acme:Product</tt> nodes are in the repository, if this nodetype is indexed. To find out, run <tt>getEstimatedChildNodeCounts</tt> with <tt>p1=/oak:index/nodetype</tt> and <tt>p2=2</tt>.</li>
+<li>If the query needs to return added nodes immediately (synchronously; that is without delay), consider creating a <a href="./property-index.html">property index</a>. Note that Lucene indexes are asynchronous, and new nodes may not appear in the result for a few seconds.</li>
+<li>To ensure there is only one node matching the result in the repository, consider creating a unique <a href="./property-index.html">property index</a>.</li>
+<li>Consider using a fulltext index, that is: change the query from using <tt>lower([jcr:title]) like '%...%'</tt> to using <tt>contains([jcr:title], '...')</tt>. Possibly combine this with adding the property <tt>commerceType</tt> to the fulltext index.</li>
 </ul>
 <p>The last plan is possibly the best solution for this case.</p></div>
 <div class="section">
 <h4><a name="Index_Definition_Generator"></a>Index Definition Generator</h4>
 <p>In case you need to modify or create a Lucene property index, you can use the <a class="externalLink" href="http://oakutils.appspot.com/generate/index">Oak Index Definition Generator</a> tool.</p>
 <p>As the tool doesn&#x2019;t know your index configuration, it will always suggest creating a new index; it might be better to extend an existing index. However, note that:</p>
-
 <ul>
-  
+
 <li>Changing an existing index requires reindexing that index.</li>
-  
-<li>If an out-of-the-box index is modified, you will need to merge those modifications  when migrating to newer software.  It is best to add documentation to the index definition to simplify merging,  for example in the form of &#x201c;info&#x201d; properties.</li>
+<li>If an out-of-the-box index is modified, you will need to merge those modifications when migrating to newer software. It is best to add documentation to the index definition to simplify merging, for example in the form of &#x201c;info&#x201d; properties.</li>
 </ul></div>
 <div class="section">
 <h4><a name="Verification"></a>Verification</h4>
 <p>After changing the query, and possibly the index, run <tt>explain select</tt> again and verify that the right plan is used. For this case, the query and its plan might be:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">select * from [acme:Product] 
+<div>
+<div>
+<pre class="source">select * from [acme:Product] 
 where isdescendantnode('/etc/commerce') 
 and contains([jcr:title], 'Coat')
 and [commerceType] = 'product'
@@ -363,13 +372,15 @@ and [commerceType] = 'product'
 [nt:unstructured] as [acme:Product] /* lucene:lucene(/oak:index/lucene) 
 full:jcr:title:coat ft:(jcr:title:&quot;Coat&quot;)
 </pre></div></div>
+
 <p>So in this case, only the fulltext restriction of the query was used by the index, but this might already be sufficient. If it is not, then the fulltext index might be changed to also index <tt>commerceType</tt>, or possibly to use <tt>evaluatePathRestrictions</tt>.</p></div>
 <div class="section">
 <h4><a name="Queries_With_Many_OR_or_UNION_Conditions"></a>Queries With Many OR or UNION Conditions</h4>
 <p>Queries that contain many &#x201c;or&#x201d; conditions, or with many &#x201c;union&#x201d; subqueries, can be slow as they have to read a lot of data. Example query:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content/(a|b|c|d|e)//element(*, cq:Page)[
+<div>
+<div>
+<pre class="source">/jcr:root/content/(a|b|c|d|e)//element(*, cq:Page)[
 jcr:contains(@jcr:title, 'some text') 
 or jcr:contains(jcr:content/@keywords, 'some text')
 or jcr:contains(jcr:content/@cq:tags, 'some text')
@@ -377,36 +388,46 @@ or jcr:contains(jcr:content/@team, 'some
 or jcr:contains(jcr:content/@topics, 'some text')
 or jcr:contains(jcr:content/@jcr:description, 'some text')]
 </pre></div></div>
+
 <p>This query will be internally converted into 5 subqueries, due to the &#x201c;union&#x201d; clause (a|b|c|d|e). Then, each of the 5 subqueries will run 6 subqueries: one for each jcr:contains condition. So, the index will be contacted 30 times.</p>
 <p>To avoid this overhead, the index could be changed (or a new index created) to do aggregation on the required properties (here: jcr:title, jcr:content/keywords,&#x2026;). This will simplify the query to:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content/(a|b|c|d|e)//element(*, cq:Page)[jcr:contains(., 'some text')]
+<div>
+<div>
+<pre class="source">/jcr:root/content/(a|b|c|d|e)//element(*, cq:Page)[jcr:contains(., 'some text')]
 </pre></div></div>
+
 <p>This should resolve most problems. To further speed up the query by avoiding running 5 subqueries, it might be better to use a less specific path constraint, and instead filter results in a different way, such as:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content//element(*, cq:Page)[jcr:contains(., 'some text') and @category='x']
-</pre></div></div></div>
+<div>
+<div>
+<pre class="source">/jcr:root/content//element(*, cq:Page)[jcr:contains(., 'some text') and @category='x']
+</pre></div></div>
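 <p>The lookup counts discussed in this section follow from simple multiplication. The sketch below assumes, as described above, one index lookup per combination of union branch and <tt>jcr:contains</tt> condition; <tt>IndexLookupEstimate</tt> is a hypothetical name used only for illustration:</p>

```java
// Hypothetical illustration of the arithmetic, not an Oak API.
final class IndexLookupEstimate {
    /** One lookup per union branch and per "or" condition within it. */
    static int lookups(int unionBranches, int orConditions) {
        return unionBranches * orConditions;
    }

    public static void main(String[] args) {
        // (a|b|c|d|e) with six jcr:contains conditions
        System.out.println(lookups(5, 6)); // prints 30
        // same paths, one aggregated jcr:contains(., ...) condition
        System.out.println(lookups(5, 1)); // prints 5
        // single path /jcr:root/content, one aggregated condition
        System.out.println(lookups(1, 1)); // prints 1
    }
}
```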
+</div>
 <div class="section">
 <h4><a name="Ordering_by_Score_Combined_With_OR__UNION_Conditions"></a>Ordering by Score Combined With OR / UNION Conditions</h4>
 <p>Queries that expect results to be sorted by score (&#x201c;order by @jcr:score descending&#x201d;), and use &#x201c;union&#x201d; or &#x201c;or&#x201d; conditions, may not return the result in the expected order, depending on the index(es) used. Example:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content/products/(indoor|outdoor)//*[jcr:contains(., 'test')] 
+<div>
+<div>
+<pre class="source">/jcr:root/content/products/(indoor|outdoor)//*[jcr:contains(., 'test')] 
 order by @jcr:score descending
 </pre></div></div>
+
 <p>Here, the query is converted to a &#x201c;union&#x201d;, and the result of both subqueries is combined. If the score for each subquery is not comparable (which is often the case for Lucene indexes), then the order of the results may not match the expected order. Instead of using path restrictions as above, it is most likely better to use an additional condition in the query, and index that:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/jcr:root/content/products//*[jcr:contains(., 'test') and 
+<div>
+<div>
+<pre class="source">/jcr:root/content/products//*[jcr:contains(., 'test') and 
 (@productTag='indoor' or @productTag='outdoor')] 
 order by @jcr:score descending
 </pre></div></div>
+
 <p>If this is not possible, then try to avoid using &#x201c;union&#x201d;, and use an &#x201c;or&#x201d; condition as follows. This will only work for SQL-2 queries however:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">select * from [nt:base] as a where contains(*, 'test') and issamenode(a, '/content') and 
+<div>
+<div>
+<pre class="source">select * from [nt:base] as a where contains(*, 'test') and issamenode(a, '/content') and 
 ([jcr:path] like '/content/x800/%' or [jcr:path] like '/content/y900/%') 
 order by [jcr:score] desc
 </pre></div></div></div></div></div>

Modified: jackrabbit/site/live/oak/docs/query/query.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/query/query.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/query/query.html (original)
+++ jackrabbit/site/live/oak/docs/query/query.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Oak Query</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -241,49 +241,38 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-  --><div class="section">
+  -->
+<div class="section">
 <h2><a name="Oak_Query"></a>Oak Query</h2>
 <p>Oak does not index as much content by default as does Jackrabbit 2. You need to create custom indexes when necessary, much like in traditional RDBMSs. If there is no index for a specific query, then the repository will be traversed. That is, the query will still work but probably be very slow.</p>
-
 <ul>
-  
+
 <li><a href="./query-engine.html">The Query Engine</a></li>
-  
 <li><a href="./grammar-xpath.html">XPath Grammar</a></li>
-  
 <li><a href="./query-sql2.html">SQL-2 Grammar</a></li>
-  
 <li><a href="./query-troubleshooting.html">Troubleshooting</a></li>
-  
 <li><a href="./flags.html">Flags</a></li>
 </ul>
 <div class="section">
 <h3><a name="Indexes"></a>Indexes</h3>
 <p>There are 3 main types of indexes available in Oak. For other types (e.g. nodetype), please refer to the <a href="./query-engine.html">query engine</a> documentation page.</p>
-
 <ul>
-  
+
 <li><a href="./lucene.html">Lucene</a></li>
-  
 <li><a href="./solr.html">Solr</a></li>
-  
 <li><a href="./property-index.html">Property</a></li>
 </ul>
 <p>For more details on how indexing works (for all index types):</p>
-
 <ul>
-  
+
 <li><a href="indexing.html">Indexing</a></li>
-  
 <li><a href="indexing.html#Reindexing">Reindexing</a></li>
 </ul></div>
 <div class="section">
 <h3><a name="Customisations"></a>Customisations</h3>
-
 <ul>
-  
+
 <li><a href="./ootb-index-change.html">Change Out-Of-The-Box Index Definitions</a></li>
-  
 <li><a href="./search-mt.html">Machine Translation for Search</a></li>
 </ul></div></div>
         </div>

Modified: jackrabbit/site/live/oak/docs/query/search-mt.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/query/search-mt.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/query/search-mt.html (original)
+++ jackrabbit/site/live/oak/docs/query/search-mt.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Machine Translation for Search</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -240,24 +240,23 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-  --><div class="section">
+  -->
+<div class="section">
 <h2><a name="Machine_Translation_for_Search"></a>Machine Translation for Search</h2>
-
 <ul>
-  
+
 <li><a href="#qtmtl">Query time MT for Lucene indexes</a>
-  
 <ul>
-    
+
 <li><a href="#joshua">Apache Joshua</a>
-    
 <ul>
-      
+
 <li><a href="#languagepacks">Language Packs</a></li>
-    </ul></li>
-    
+</ul>
+</li>
 <li><a href="#setup">Setup</a></li>
-  </ul></li>
+</ul>
+</li>
 </ul>
 <p>Oak supports CLIR (Cross Language Information Retrieval) by using <i>Machine Translation</i> to decorate search queries. Such an extension is provided within the <i>oak-search-mt</i> bundle.</p>
 <div class="section">
@@ -276,13 +275,10 @@
 <div class="section">
 <h4><a name="Setup"></a><a name="setup"></a> Setup</h4>
 <p>Multiple <i>MTFulltextQueryTermsProvider</i> can be configured (for different language pairs) by using <i>MTFulltextQueryTermsProviderFactory</i> OSGi configuration factory. In order to instantiate a <i>MTFulltextQueryTermsProviderFactory</i> the following properties need to be configured:</p>
-
 <ul>
-  
+
 <li><i>path.to.config</i> -&gt; the path to the <i>joshua.config</i> configuration file (e.g. of a downloaded language pack)</li>
-  
 <li><i>node.types</i> -&gt; the list of node types for which query time MT expansion should be done</li>
-  
 <li><i>min.score</i> -&gt; the minimum score (between 0 and 1) for a translated sentence / token to be used while expanding the query (this is used to filter out low quality translations)</li>
 </ul></div></div></div>
         </div>

Modified: jackrabbit/site/live/oak/docs/query/solr.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/query/solr.html?rev=1835390&r1=1835389&r2=1835390&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/query/solr.html (original)
+++ jackrabbit/site/live/oak/docs/query/solr.html Mon Jul  9 08:53:17 2018
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia Site Renderer 1.7.4 at 2018-05-24 
+ | Generated by Apache Maven Doxia Site Renderer 1.8.1 at 2018-07-09 
  | Rendered using Apache Maven Fluido Skin 1.6
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180524" />
+    <meta name="Date-Revision-yyyymmdd" content="20180709" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak &#x2013; Solr Index</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.6.min.css" />
@@ -136,7 +136,7 @@
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-        <li id="publishDate">Last Published: 2018-05-24<span class="divider">|</span>
+        <li id="publishDate">Last Published: 2018-07-09<span class="divider">|</span>
 </li>
           <li id="projectVersion">Version: 1.10-SNAPSHOT</li>
         </ul>
@@ -241,35 +241,35 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-  --><div class="section">
+  -->
+<div class="section">
 <h2><a name="Solr_Index"></a>Solr Index</h2>
 <p>The Solr index is mainly meant for full-text search (the &#x2018;contains&#x2019; type of queries):</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">//*[jcr:contains(., 'text')]
+<div>
+<div>
+<pre class="source">//*[jcr:contains(., 'text')]
 </pre></div></div>
+
 <p>but is also able to search by path and property restrictions. Primary type restriction support is also provided, but it&#x2019;s not recommended, as it&#x2019;s usually much better to use the <a href="query.html#The_Node_Type_Index">node type index</a> for such queries.</p>
 <p>Even if it&#x2019;s not just a full-text index, it&#x2019;s recommended to use it asynchronously (see <tt>Oak#withAsyncIndexing</tt>) because, in most production scenarios, it&#x2019;ll be a &#x2018;remote&#x2019; index and therefore network latency / errors would have less impact on the repository performance.</p>
 <p>The index definition node for a Solr-based index:</p>
-
 <ul>
-  
+
 <li>must be of type <tt>oak:QueryIndexDefinition</tt></li>
-  
 <li>must have the <tt>type</tt> property set to <b><tt>solr</tt></b></li>
-  
 <li>must contain the <tt>async</tt> property set to the value <tt>async</tt>, this is what sends the index update process to a background thread.</li>
 </ul>
 <p><i>Optionally</i> one can add</p>
-
 <ul>
-  
+
 <li>the <tt>reindex</tt> flag which when set to <tt>true</tt>, triggers a full content re-index.</li>
 </ul>
 <p>Example:</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">{
+<div>
+<div>
+<pre class="source">{
   NodeBuilder index = root.child(&quot;oak:index&quot;);
   index.child(&quot;solr&quot;)
     .setProperty(&quot;jcr:primaryType&quot;, &quot;oak:QueryIndexDefinition&quot;, Type.NAME)
@@ -278,15 +278,14 @@
     .setProperty(&quot;reindex&quot;, true);
 }
 </pre></div></div>
+
 <p>The Oak Solr index creates one document in the Solr index for each node in the repository, each of such documents has usually at least a field for each property associated with the related node. Indexing of properties can be done by name: e.g. property &#x2018;jcr:title&#x2019; of a node is written into a field &#x2018;jcr:title&#x2019; of the corresponding Solr document in the index, or by type: e.g. properties &#x2018;jcr:data&#x2019; and &#x2018;binary_content&#x2019; of type <i>binary</i> are written into a field &#x2018;binary_data&#x2019; that&#x2019;s responsible for the indexing of all fields having that type and thus properly configured for hosting such type of data.</p>
 <div class="section">
 <h3><a name="Configuring_the_Solr_index"></a>Configuring the Solr index</h3>
-<p>Besides the index definition parameters mentioned above, a number of additional parameters can be defined in  Oak Solr index configuration. Such a configuration is composed by:</p>
-
+<p>Besides the index definition parameters mentioned above, a number of additional parameters can be defined in Oak Solr index configuration. Such a configuration is composed by:</p>
 <ul>
-  
+
 <li>the search / indexing configuration (see <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/index/solr/configuration/OakSolrConfiguration.html">OakSolrConfiguration</a>)</li>
-  
 <li>the Solr server configuration (see <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/index/solr/configuration/SolrServerConfiguration.html">SolrServerConfiguration</a>)</li>
 </ul></div>
 <div class="section">
@@ -317,20 +316,20 @@
 <h4><a name="Descendant_path_field"></a>Descendant path field</h4>
 <p>The name of the field to be used for searching for nodes descendants of a certain node.</p>
 <p>Default is &#x2018;path_des&#x2019;.</p>
-<p>E.g. The Solr query to find all the descendant nodes of /a/b would be &#x2018;path_des:\/a\/b&#x2019;.</p></div>
+<p>E.g. The Solr query to find all the descendant nodes of /a/b would be &#x2018;path_des:/a/b&#x2019;.</p></div>
 <div class="section">
 <h4><a name="Children_path_field"></a>Children path field</h4>
 <p>The name of the field to be used for searching for child nodes of a certain node.</p>
 <p>Default is &#x2018;path_child&#x2019;.</p>
-<p>E.g. The Solr query to find all the child nodes of /a/b would be &#x2018;path_child:\/a\/b&#x2019;.</p></div>
+<p>E.g. The Solr query to find all the child nodes of /a/b would be &#x2018;path_child:/a/b&#x2019;.</p></div>
 <div class="section">
 <h4><a name="Parent_path_field"></a>Parent path field</h4>
 <p>The name of the field to be used for searching for parent node of a certain node.</p>
 <p>Default is &#x2018;path_anc&#x2019;.</p>
-<p>E.g. The Solr query to find the parent node of /a/b would be &#x2018;path_anc:\/a\/b&#x2019;.</p></div>
+<p>E.g. The Solr query to find the parent node of /a/b would be &#x2018;path_anc:/a/b&#x2019;.</p></div>
 <div class="section">
 <h4><a name="Property_restriction_fields"></a>Property restriction fields</h4>
-<p>The (optional) mapping of property names into Solr fields, so that a mapping jcr:title=foo is defined each node having  the property jcr:title will have its correspondant Solr document having a property foo indexed with the value of the  jcr:title property.</p>
+<p>The (optional) mapping of property names into Solr fields, so that a mapping jcr:title=foo is defined each node having the property jcr:title will have its correspondant Solr document having a property foo indexed with the value of the jcr:title property.</p>
 <p>Default is no mapping, therefore the default mechanism of mapping property names to field names is performed.</p></div>
 <div class="section">
 <h4><a name="Used_properties"></a>Used properties</h4>
@@ -340,7 +339,7 @@
 <div class="section">
 <h4><a name="aIgnored_properties"></a>&#xa0;Ignored properties</h4>
 <p>A blacklist of properties to be ignored while indexing and searching by the Solr index.</p>
-<p>Such a blacklist makes sense (it will be taken into account by the Solr index) only if the <a href="#Used_properties">Used properties</a>  option doesn&#x2019;t have any value.</p>
+<p>Such a blacklist makes sense (it will be taken into account by the Solr index) only if the <a href="#Used_properties">Used properties</a> option doesn&#x2019;t have any value.</p>
 <p>Default is the following array: <i>(&#x201c;rep:members&#x201d;, &#x201c;rep:authorizableId&#x201d;, &#x201c;jcr:uuid&#x201d;, &#x201c;rep:principalName&#x201d;, &#x201c;rep:password&#x201d;}</i>.</p></div>
 <div class="section">
 <h4><a name="Commit_policy"></a>Commit policy</h4>
@@ -353,11 +352,11 @@
 <div class="section">
 <h4><a name="Rows"></a>Rows</h4>
 <p>The number of documents per &#x2018;page&#x2019; to be fetched for each query.</p>
-<p>Default is _Integer.MAX<i>VALUE</i> (was <i>50</i> in Oak 1.0).</p></div>
+<p>Default is <i>Integer.MAX_VALUE</i> (was <i>50</i> in Oak 1.0).</p></div>
 <div class="section">
 <h4><a name="Collapse_jcr:content_nodes"></a>Collapse <i>jcr:content</i> nodes</h4>
 <p><tt>@since 1.3.4, 1.2.4, 1.0.18</tt></p>
-<p>Whether the <a class="externalLink" href="https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results">Collapsing query parser</a> should be used when searching in order to collapse nodes that are descendants of &#x2018;jcr:content&#x2019; nodes into the &#x2018;jcr:content&#x2019; node only. E.g. if normal query results would include &#x2018;/a/jcr:content&#x2019; and &#x2018;/a/jcr:content/b/&#x2019;, with this option enabled only &#x2018;/a/jcr:content&#x2019; would be returned by Solr using the Collapsing query parser. This feature requires an additional field to be indexed, therefore if this is turned on, reindexing should be triggered in order to make it work properly. </p></div>
+<p>Whether the <a class="externalLink" href="https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results">Collapsing query parser</a> should be used when searching in order to collapse nodes that are descendants of &#x2018;jcr:content&#x2019; nodes into the &#x2018;jcr:content&#x2019; node only. E.g. if normal query results would include &#x2018;/a/jcr:content&#x2019; and &#x2018;/a/jcr:content/b/&#x2019;, with this option enabled only &#x2018;/a/jcr:content&#x2019; would be returned by Solr using the Collapsing query parser. This feature requires an additional field to be indexed, therefore if this is turned on, reindexing should be triggered in order to make it work properly.</p></div>
 <div class="section">
 <h4><a name="Collapsed_path_field"></a>Collapsed path field</h4>
 <p><tt>@since 1.3.4, 1.2.4, 1.0.18</tt></p>
@@ -367,17 +366,13 @@
 <p>TBD</p></div></div>
 <div class="section">
 <h4><a name="Setting_up_the_Solr_server"></a>Setting up the Solr server</h4>
-<p>For the Solr index to work Oak needs to be able to communicate with a Solr instance / cluster. Apache Solr supports multiple deployment architectures: </p>
-
+<p>For the Solr index to work Oak needs to be able to communicate with a Solr instance / cluster. Apache Solr supports multiple deployment architectures:</p>
 <ul>
-  
+
 <li>embedded Solr instance running in the same JVM the client runs into</li>
-  
 <li>single remote instance</li>
-  
 <li>master / slave architecture, eventually with multiple shards and replicas</li>
-  
-<li>SolrCloud cluster, with Zookeeper instance(s) to control a dynamic, resilient set of Solr servers for high  availability and fault tolerance</li>
+<li>SolrCloud cluster, with Zookeeper instance(s) to control a dynamic, resilient set of Solr servers for high availability and fault tolerance</li>
 </ul>
 <p>The Oak Solr index can be configured to either use an &#x2018;embedded Solr server&#x2019; or a &#x2018;remote Solr server&#x2019; (being able to connect to a single remote instance or to a SolrCloud cluster via Zookeeper).</p>
 <div class="section">
@@ -385,23 +380,20 @@
 <p>Depending on the use case, different Solr server deployments are recommended.</p>
 <div class="section">
 <h6><a name="Embedded_Solr_server"></a>Embedded Solr server</h6>
-<p>The embedded Solr server is recommended for developing and testing the Solr index for an Oak repository. With that an in-memory Solr instance is started in the same JVM of the Oak repository, without HTTP bindings (for security purposes as it&#x2019;d allow HTTP access to repository data independently of ACLs). Configuring an embedded Solr server mainly consists of providing the path to a standard <a class="externalLink" href="https://wiki.apache.org/solr/SolrTerminology">Solr home dir</a> (<i>solr.home.path</i> Oak property) to be used to start Solr; this path can be either relative or absolute, if such a path would not exist then the default configuration provided with <i>oak-solr-core</i> artifact would be put in the given path. To start an embedded Solr server with a custom configuration (e.g. different schema.xml / solrconfig.xml than the default  ones) the (modified) Solr home files would have to be put in a dedicated directory, according to Solr home structure, so  that the solr.home.path property can be pointed to that directory.</p></div>
+<p>The embedded Solr server is recommended for developing and testing the Solr index for an Oak repository. With that an in-memory Solr instance is started in the same JVM of the Oak repository, without HTTP bindings (for security purposes as it&#x2019;d allow HTTP access to repository data independently of ACLs). Configuring an embedded Solr server mainly consists of providing the path to a standard <a class="externalLink" href="https://wiki.apache.org/solr/SolrTerminology">Solr home dir</a> (<i>solr.home.path</i> Oak property) to be used to start Solr; this path can be either relative or absolute, if such a path would not exist then the default configuration provided with <i>oak-solr-core</i> artifact would be put in the given path. To start an embedded Solr server with a custom configuration (e.g. different schema.xml / solrconfig.xml than the default ones) the (modified) Solr home files would have to be put in a dedicated directory, according to Solr home structure, so that the solr.home.path property can be pointed to that directory.</p></div>
 <div class="section">
 <h6><a name="Single_remote_Solr_server"></a>Single remote Solr server</h6>
-<p>A single (remote) Solr instance is the simplest possible setup for using the Oak Solr index in a production environment. Oak will communicate to such a Solr server through Solr&#x2019;s HTTP APIs (via <a class="externalLink" href="http://wiki.apache.org/solr/Solrj">SolrJ</a> client). Configuring a single remote Solr instance consists of providing the URL to connect to in order to reach the <a class="externalLink" href="https://wiki.apache.org/solr/SolrTerminology">Solr core</a> that will host the Solr index for the Oak repository via the <i>solr.http.url</i>  property which will have to contain such a URL (e.g. _<a class="externalLink" href="http://10.10.1.101:8983/solr/oak_)">http://10.10.1.101:8983/solr/oak_)</a>. All the configuration and tuning of Solr, other than what&#x2019;s described on this page, will have to be performed on the Solr side; <a class="externalLink" href="http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-solr-core/src/main/resources/solr/">sample Solr configuration</a> files (schema.xml, solrconfig.xml, etc.) to start with can be found in <i>oak-solr-core</i> artifact.</p></div>
+<p>A single (remote) Solr instance is the simplest possible setup for using the Oak Solr index in a production environment. Oak will communicate to such a Solr server through Solr&#x2019;s HTTP APIs (via <a class="externalLink" href="http://wiki.apache.org/solr/Solrj">SolrJ</a> client). Configuring a single remote Solr instance consists of providing the URL to connect to in order to reach the [Solr core] (<a class="externalLink" href="https://wiki.apache.org/solr/SolrTerminology">https://wiki.apache.org/solr/SolrTerminology</a>) that will host the Solr index for the Oak repository via the <i>solr.http.url</i> property which will have to contain such a URL (e.g. <i><a class="externalLink" href="http://10.10.1.101:8983/solr/oak">http://10.10.1.101:8983/solr/oak</a></i>). All the configuration and tuning of Solr, other than what&#x2019;s described on this page, will have to be performed on the Solr side; <a class="externalLink" href="http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-solr-core/src/main/resources/solr/">sample Solr configuration</a> files (schema.xml, solrconfig.xml, etc.) to start with can be found in <i>oak-solr-core</i> artifact.</p></div>
 <div class="section">
 <h6><a name="SolrCloud_cluster"></a>SolrCloud cluster</h6>
-<p>A <a class="externalLink" href="https://cwiki.apache.org/confluence/display/solr/SolrCloud">SolrCloud</a> cluster is the recommended setup for an Oak Solr index in production as it provides a scalable and fault tolerant architecture. In order to configure a SolrCloud cluster the host of the Zookeeper instance / ensemble managing the Solr servers has to be provided in the <i>solr.zk.host</i> property (e.g. <i>10.1.1.108:9983</i>) since the SolrJ client for SolrCloud communicates directly with Zookeeper. The <a class="externalLink" href="https://wiki.apache.org/solr/SolrTerminology">Solr collection</a> to be used within Oak is named <i>oak</i>, having a replication  factor of 2 and using 2 shards; this means in the default setup the SolrCloud cluster would have to be composed by at  least 4 Solr servers as the index will be split into 2 shards and each shard will have 2 replicas. SolrCloud also allows the hot deploy of configuration files to be used for a certain collection so while setting up the  collection to be used for Oak with the needed files before starting the cluster, configuration files can also be uploaded  from a local directory, this is controlled by the <i>solr.conf.dir</i> property of the &#x2018;Oak Solr remote server configuration&#x2019;. For a detailed description of how SolrCloud works see the <a class="externalLink" href="https://cwiki.apache.org/confluence/display/solr/SolrCloud">Solr reference guide</a>.</p></div></div>
+<p>A <a class="externalLink" href="https://cwiki.apache.org/confluence/display/solr/SolrCloud">SolrCloud</a> cluster is the recommended setup for an Oak Solr index in production as it provides a scalable and fault tolerant architecture. In order to configure a SolrCloud cluster the host of the Zookeeper instance / ensemble managing the Solr servers has to be provided in the <i>solr.zk.host</i> property (e.g. <i>10.1.1.108:9983</i>) since the SolrJ client for SolrCloud communicates directly with Zookeeper. The <a class="externalLink" href="https://wiki.apache.org/solr/SolrTerminology">Solr collection</a> to be used within Oak is named <i>oak</i>, having a replication factor of 2 and using 2 shards; this means in the default setup the SolrCloud cluster would have to be composed by at least 4 Solr servers as the index will be split into 2 shards and each shard will have 2 replicas. SolrCloud also allows the hot deploy of configuration files to be used for a certain collection so while setting up the collection to be used for Oak with the needed files before starting the cluster, configuration files can also be uploaded from a local directory, this is controlled by the <i>solr.conf.dir</i> property of the &#x2018;Oak Solr remote server configuration&#x2019;. For a detailed description of how SolrCloud works see the <a class="externalLink" href="https://cwiki.apache.org/confluence/display/solr/SolrCloud">Solr reference guide</a>.</p></div></div>
 <div class="section">
 <h5><a name="OSGi_environment"></a>OSGi environment</h5>
 <p>Create an index definition for the Solr index, as described <a href="#Solr_index">above</a>. Once the query index definition node has been created, access OSGi ConfigurationAdmin via e.g. Apache Felix WebConsole:</p>
-
 <ol style="list-style-type: decimal">
-  
+
 <li>find the &#x2018;Oak Solr indexing / search configuration&#x2019; item and eventually change configuration properties as needed</li>
-  
-<li>find either the &#x2018;Oak Solr embedded server configuration&#x2019; or &#x2018;Oak Solr remote server configuration&#x2019; items depending  on the chosen Solr architecture and eventually change configuration properties as needed</li>
-  
+<li>find either the &#x2018;Oak Solr embedded server configuration&#x2019; or &#x2018;Oak Solr remote server configuration&#x2019; items depending on the chosen Solr architecture and eventually change configuration properties as needed</li>
 <li>find the &#x2018;Oak Solr server provider&#x2019; item and select the chosen provider (&#x2018;remote&#x2019; or &#x2018;embedded&#x2019;)</li>
 </ol></div></div>
 <div class="section">
@@ -409,7 +401,7 @@
 <div class="section">
 <h5><a name="Aggregation"></a>Aggregation</h5>
 <p><tt>@since Oak 1.1.4, 1.0.13</tt></p>
-<p>Solr index supports query time aggregation, that can be enabled in OSGi by setting <tt>SolrQueryIndexProviderService</tt> service property <tt>query.aggregation</tt> to true. </p></div>
+<p>Solr index supports query time aggregation, that can be enabled in OSGi by setting <tt>SolrQueryIndexProviderService</tt> service property <tt>query.aggregation</tt> to true.</p></div>
 <div class="section">
 <h5><a name="Suggestions"></a>Suggestions</h5>
 <p><tt>@since Oak 1.1.17, 1.0.15</tt></p>
@@ -425,21 +417,26 @@
 <p><tt>@since Oak 1.3.14</tt></p>
 <p>In order to enable proper usage of facets in Solr index the following dynamic field needs to be added to the <i>schema.xml</i></p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">    &lt;dynamicField name=&quot;*_facet&quot; type=&quot;string&quot; indexed=&quot;false&quot; stored=&quot;false&quot; docValues=&quot;true&quot; multiValued=&quot;true&quot;/&gt;
+<div>
+<div>
+<pre class="source">    &lt;dynamicField name=&quot;*_facet&quot; type=&quot;string&quot; indexed=&quot;false&quot; stored=&quot;false&quot; docValues=&quot;true&quot; multiValued=&quot;true&quot;/&gt;
 </pre></div></div>
+
 <p>with dedicated <i>copyFields</i> for specific properties.</p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">    &lt;copyField source=&quot;jcr:primaryType&quot; dest=&quot;jcr:primaryType_facet&quot;/&gt; &lt;!-- facet on jcr:primaryType field/property --&gt;
-</pre></div></div></div>
+<div>
+<div>
+<pre class="source">    &lt;copyField source=&quot;jcr:primaryType&quot; dest=&quot;jcr:primaryType_facet&quot;/&gt; &lt;!-- facet on jcr:primaryType field/property --&gt;
+</pre></div></div>
+</div>
 <div class="section">
 <h4><a name="Persisted_configuration"></a>Persisted configuration</h4>
 <p><tt>@since Oak 1.4.0</tt></p>
-<p>It&#x2019;s possible to create (multiple) Solr indexes via persisted configuration. A persisted Oak Solr index is created whenever an index definition with <i>type = solr</i> has a child node named <i>server</i> and such a child node has the <i>solrServerType</i> property set (to either <i>embedded</i> or <i>remote</i>). If no such child node exists, an Oak Solr index will be only created upon explicit registration of a [SolrServerProvider]  e.g. via OSGi. All the <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/index/solr/configuration/OakSolrConfiguration.html">OakSolrConfiguration</a>  and <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/index/solr/configuration/SolrServerConfiguration.html">SolrServerConfiguration</a>  properties are exposed and configurable, see also <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/trunk/oak-solr-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/solr/configuration/nodestate/OakSolrNodeStateConfiguration.java#L245">OakSolrNodeStateConfiguration#Properties</a>  and <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/trunk/oak-solr-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/solr/configuration/nodestate/NodeStateSolrServerConfigurationProvider.java#L94">NodeStateSolrServerConfigurationProvider#Properties</a></p>
+<p>It&#x2019;s possible to create (multiple) Solr indexes via persisted configuration. A persisted Oak Solr index is created whenever an index definition with <i>type = solr</i> has a child node named <i>server</i> and such a child node has the <i>solrServerType</i> property set (to either <i>embedded</i> or <i>remote</i>). If no such child node exists, an Oak Solr index will be only created upon explicit registration of a [SolrServerProvider] e.g. via OSGi. All the <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/index/solr/configuration/OakSolrConfiguration.html">OakSolrConfiguration</a> and <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/index/solr/configuration/SolrServerConfiguration.html">SolrServerConfiguration</a> properties are exposed and configurable, see also <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/trunk/oak-solr-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/solr/configuration/nodestate/OakSolrNodeStateConfiguration.java#L245">OakSolrNodeStateConfiguration#Properties</a> and <a class="externalLink" href="https://github.com/apache/jackrabbit-oak/blob/trunk/oak-solr-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/solr/configuration/nodestate/NodeStateSolrServerConfigurationProvider.java#L94">NodeStateSolrServerConfigurationProvider#Properties</a></p>
 
-<div class="source">
-<div class="source"><pre class="prettyprint">/oak:index/solrRemote
+<div>
+<div>
+<pre class="source">/oak:index/solrRemote
   - jcr:primaryType = &quot;oak:QueryIndexDefinition&quot;
   - type = &quot;solr&quot;
   - async = &quot;async&quot;
@@ -448,23 +445,20 @@
     - solrServerType = &quot;remote&quot;
     - httpUrl = &quot;http://localhost:8983/solr/oak&quot;
 </pre></div></div>
+
 <p>If such configurations exists in the repository the <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/index/solr/configuration/nodestate/NodeStateSolrServersObserver.html">NodeStateSolrServersObserver</a> should be registered too (e.g. via <a class="externalLink" href="http://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/plugins/index/solr/osgi/NodeStateSolrServersObserverService.html">NodeStateSolrServersObserverService</a> OSGi service).</p></div>
 <div class="section">
 <h4><a name="Notes"></a>Notes</h4>
 <p>As of Oak version 1.0.0:</p>
-
 <ul>
-  
+
 <li>Solr index doesn&#x2019;t support search using relative properties, see <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-1835">OAK-1835</a>.</li>
-  
 <li>Lucene can only be used for full-text queries, Solr can be used for full-text search <i>and</i> for JCR queries involving path, property and primary type restrictions.</li>
 </ul>
 <p>As of Oak version 1.2.0:</p>
-
 <ul>
-  
+
 <li>Solr index doesn&#x2019;t support index time aggregation, but only query time aggregation</li>
-  
 <li>Lucene and Solr can be both used for full text, property and path restrictions</li>
 </ul></div></div></div>
         </div>
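
Note on the path field hunks in solr.html: the removed lines show the example queries with escaped slashes ('path_des:\/a\/b'), the added lines show them unescaped ('path_des:/a/b'). The following is a minimal Java sketch, not Oak code, of how such query strings could be assembled with the slashes escaped. It assumes the default field names listed on the page and assumes the query is sent through Solr's standard query parser, where '/' is normally escaped; the class and method names (PathQueries, escapeSlashes, query) are hypothetical.

```java
public class PathQueries {
    // Default field names as documented on the page.
    static final String DESCENDANTS = "path_des";
    static final String CHILDREN = "path_child";
    static final String ANCESTORS = "path_anc";

    // Prefix every '/' with a backslash (assumption: the standard query
    // parser would otherwise treat '/' as a regular expression delimiter).
    static String escapeSlashes(String path) {
        return path.replace("/", "\\/");
    }

    // Build a "field:value" query string for a path restriction.
    static String query(String field, String path) {
        return field + ":" + escapeSlashes(path);
    }

    public static void main(String[] args) {
        System.out.println(query(DESCENDANTS, "/a/b"));
        System.out.println(query(CHILDREN, "/a/b"));
        System.out.println(query(ANCESTORS, "/a/b"));
    }
}
```

When the query is built with SolrJ, its own escaping utilities would be the more natural choice than a hand-written helper like this one.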


