jackrabbit-oak-issues mailing list archives

From "Nitin Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-8166) Index definition with orderable property definitions with and without functions breaks index
Date Wed, 03 Apr 2019 11:07:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808612#comment-16808612
] 

Nitin Gupta commented on OAK-8166:
----------------------------------

The problem is as follows.

The index definition contains 2 property definitions with the same property name (one uses
a function, the other does not). The code here [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/IndexDefinition.java#L1179#L1198]
adds both property definitions to the propAggregate list, which is then used here [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/Aggregate.java#L196#L202]
to create the matchers list, which in turn is used here [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/Aggregate.java#L171#L180].
Note that the property name is the same in both matchers, and that is what causes the
problem.

 

Finally, the loop over the matchers list ends up calling LuceneDocumentMaker#indexTypeOrderedFields.
This [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneDocumentMaker.java#L288#L291]
is specifically where the duplicate field is added.
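
To illustrate the failure mode, here is a minimal standalone sketch. It does not use the Oak or Lucene classes: PropDef, orderedFieldNames and writeDocument are hypothetical stand-ins, and the assumption that both definitions resolve to the same property name is taken from the analysis above. The ":dv" prefix and the exception text follow the stack trace in the issue description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateOrderedFieldDemo {

    /** Hypothetical stand-in for a property definition (not Oak's PropertyDefinition). */
    static final class PropDef {
        final String name;
        final String function; // null when the definition has no function
        final boolean ordered;

        PropDef(String name, String function, boolean ordered) {
            this.name = name;
            this.function = function;
            this.ordered = ordered;
        }
    }

    /** Each ordered definition contributes one doc values field named ":dv" + property name. */
    static List<String> orderedFieldNames(List<PropDef> defs) {
        List<String> names = new ArrayList<>();
        for (PropDef def : defs) {
            if (def.ordered) {
                names.add(":dv" + def.name);
            }
        }
        return names;
    }

    /** Mimics the one-value-per-field rule by rejecting a repeated field name. */
    static void writeDocument(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fieldNames) {
            if (!seen.add(name)) {
                throw new IllegalArgumentException("DocValuesField \"" + name
                        + "\" appears more than once in this document");
            }
        }
    }

    public static void main(String[] args) {
        String title = "jcr:content/metadata/dc:title";
        List<PropDef> defs = new ArrayList<>();
        defs.add(new PropDef(title, null, true));
        // Assumption: the function-based definition ends up with the same property name.
        defs.add(new PropDef(title, "fn:lower-case(jcr:content/metadata/@dc:title)", true));
        try {
            writeDocument(orderedFieldNames(defs));
            System.out.println("indexed");
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model, marking either definition as not ordered leaves a single ":dv" field, which matches the workaround in the steps to reproduce.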

 

I have 2 solutions in mind:
 # Fix the code in IndexDefinition so that property definitions using functions are not
added to aggregateList. This might not be ideal, since at this point the 2 property
definitions are different objects (only their names, unfortunately, are the same).
 # Fix [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneDocumentMaker.java#L288#L291]
to check whether the field is already present in the doc and, if so, not add it again. The
only thing to consider here is that we do not support multi-value doc fields in Lucene; if
we did, I suppose this issue would not have surfaced?
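
The second approach can be sketched roughly as below. This is only an illustration under assumptions and not the content of the attached patch: Field and addIfAbsent are hypothetical names, and the document is modelled as a plain list rather than a Lucene Document. With the real Lucene API, a similar check could presumably rely on Document#getField(String), which returns null when no field with that name exists.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipDuplicateFieldSketch {

    /** Hypothetical minimal field: a name and a value (not a Lucene class). */
    static final class Field {
        final String name;
        final String value;

        Field(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Adds the field only when the document has no field with the same name yet. */
    static boolean addIfAbsent(List<Field> doc, Field candidate) {
        for (Field existing : doc) {
            if (existing.name.equals(candidate.name)) {
                return false; // first value wins; the duplicate is dropped
            }
        }
        doc.add(candidate);
        return true;
    }

    public static void main(String[] args) {
        List<Field> doc = new ArrayList<>();
        String name = ":dvjcr:content/metadata/dc:title";
        System.out.println(addIfAbsent(doc, new Field(name, "My Title")));
        System.out.println(addIfAbsent(doc, new Field(name, "my title")));
        System.out.println(doc.size());
    }
}
```

One consequence of this guard is that the value kept for sorting depends on the order in which the matchers run, so in this sketch the second value (for example the lower-cased one) is silently dropped.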

 

I have attached a patch for the second approach - [^OAK-8166_1.patch]

 

[~tmueller] , [~catholicon]

> Index definition with orderable property definitions with and without functions breaks index
> --------------------------------------------------------------------------------------------
>
>                 Key: OAK-8166
>                 URL: https://issues.apache.org/jira/browse/OAK-8166
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing
>    Affects Versions: 1.8.12
>            Reporter: Tom Blackford
>            Priority: Major
>         Attachments: OAK-8166_1.patch
>
>
> If an index definition contains the same orderable property with and without functions, it will fail to index any node which contains that property. The failure will be logged as [1].
> Steps to reproduce:
> * Configure index with the two property definitions shown at [2].
> * Refresh the index definition
> * Modify a node that falls under the definition - it will fail with the exception shown at [1]
> * Modify the 'non-function' index definition to not be orderable (orderable=false)
> * Refresh the index definition
> * Modify the same node - note there is no exception.
> Thanks to [~catholicon] for assistance identifying root cause.
> [1]
> {code}
> 25.03.2019 15:39:04.135 *WARN* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor
Failed to index the node [/content/dam/Unknown-2.png]
> java.lang.IllegalArgumentException: DocValuesField ":dvjcr:content/metadata/dc:title"
appears more than once in this document (only one value is allowed per field)
> 	at org.apache.lucene.index.SortedDocValuesWriter.addValue(SortedDocValuesWriter.java:62)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.DocValuesProcessor.addSortedField(DocValuesProcessor.java:125)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:59) [org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:236)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:455)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1534) [org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507) [org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter.updateDocument(DefaultIndexWriter.java:86)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.addOrUpdate(LuceneIndexEditor.java:258)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:140)
[org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.jackrabbit.oak.spi.commit.CompositeEditor.leave(CompositeEditor.java:74)
[org.apache.jackrabbit.oak-store-spi:1.8.9]
> {code}
> [2] 
> {code}
> "dcTitle": {
>     "jcr:primaryType": "nt:unstructured",
>     "nodeScopeIndex": "true",
>     "useInSuggest": "true",
>     "ordered": "true",
>     "propertyIndex": "true",
>     "useInSpellcheck": "true",
>     "name": "jcr:content/metadata/dc:title",
>     "boost": "2.0"
>     },
>   "dcTitleLowercase": {
>     "jcr:primaryType": "nt:unstructured",
>     "ordered": "true",
>     "propertyIndex": "true",
>     "function": "fn:lower-case(jcr:content/metadata/@dc:title)"
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
