lucene-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129
Date Mon, 04 Feb 2019 05:02:40 GMT
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r253346676
 
 

 ##########
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##########
 @@ -0,0 +1,309 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document and comments
as child documents -- or products as parent documents and sizes, colors, or other variations
as child documents.
+The parent with all children is referred to as a "block", which explains some of the nomenclature
of related features.
+At query time, the <<other-parsers.adoc#block-join-query-parsers,Block Join Query Parsers>>
can search these relationships,
+ and the `<<transforming-result-documents.adoc#child-childdoctransformerfactory,[child]>>`
Document Transformer can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually yields much
faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be computed.
+However, nested documents are less flexible than query time joins because they impose rules that
some applications may not be able to accept.
+
+.Note
+[NOTE]
+====
+A big limitation is that the whole block of parent-children documents must be updated or
deleted together, not separately.
+In other words, even if a single child document or the parent document is changed, the whole
block of parent-child documents must be indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic query failures
or incorrect results.
+====
+
+Nested documents may be indexed via either the XML or JSON data syntax, and are also supported
by <<using-solrj.adoc#using-solrj,SolrJ>> with javabin.
+
+=== Schema Notes
+
+ * The schema must include indexed field `\_root_`. The value of that field is populated
automatically and is the same for all documents in the block, regardless of the inheritance
depth. The id of the top document in every nested hierarchy is populated in this field.
+ * `\_nest_path_` can be configured to store the path of the document in the hierarchy
+ * `\_nest_parent_` can be configured to store the `id` of the parent in the previous level
+ * Nested documents are very much documents in their own right even if certain nested documents
hold different information from the parent.
+   Therefore:
+ ** the schema must be able to represent the fields of any document
+ ** it may be infeasible to use `required`
+ ** even child documents need a unique `id`
+
+
+=== Rudimentary Root-only schemas
+ * These schemas do not contain any nesting-related fields other than `\_root_`. +
+   In this mode, relationship types (field names) between parents and their children are not
saved. +
+   In this case, the <<nested-documents.adoc#child-doc-transformer,[child]>> transformer
returns all children under the `\_childDocuments_` field.
+ * The schema must include an indexed, non-stored field `\_root_`. The value of that field
is populated automatically and is the same for all documents in the block, regardless of the
inheritance depth.
 
 Review comment:
   This bullet could be removed; it repeats the `\_root_` bullet already given under Schema Notes.
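
   For readers following the quoted Schema Notes, here is a minimal Python sketch of how `_root_` and `_nest_parent_` relate to a nested block. It is an illustration only, not Solr's implementation: the `annotate_block` helper and the sample ids are hypothetical, and it does not model `_nest_path_` or the order in which Solr writes the documents of a block.

```python
# Hypothetical helper (not part of Solr): flattens one nested block and
# annotates each document the way the Schema Notes describe the fields.
def annotate_block(doc, root_id=None, parent_id=None):
    # The top document's id becomes _root_ for every document in the block.
    if root_id is None:
        root_id = doc["id"]

    # Copy the document's own fields, leaving the nested children out.
    flat = {k: v for k, v in doc.items() if k != "_childDocuments_"}
    flat["_root_"] = root_id
    if parent_id is not None:
        # id of the parent in the previous level
        flat["_nest_parent_"] = parent_id

    out = [flat]
    for child in doc.get("_childDocuments_", []):
        out.extend(annotate_block(child, root_id, doc["id"]))
    return out


# Sample block: a blog post with comments, one of which has a reply.
post = {
    "id": "post-1",
    "title": "Nested documents",
    "_childDocuments_": [
        {"id": "comment-1", "_childDocuments_": [{"id": "reply-1"}]},
        {"id": "comment-2"},
    ],
}

docs = annotate_block(post)
for d in docs:
    print(d["id"], d["_root_"], d.get("_nest_parent_"))
```

   Every document, regardless of depth, carries the same `_root_` value, which is the point both bullets make.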

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services


