jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-7122) Implement script to compare lucene indexes logically
Date Fri, 05 Jan 2018 10:19:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312866#comment-16312866 ]

Chetan Mehrotra commented on OAK-7122:

Implemented the script at [1]. Currently it builds up the structure in memory. If this proves
to be problematic for large indexes, we can look into building the structure on the file system.

[1] https://github.com/chetanmeh/oak-console-scripts/tree/master/src/main/groovy/lucene

> Implement script to compare lucene indexes logically
> ----------------------------------------------------
>                 Key: OAK-7122
>                 URL: https://issues.apache.org/jira/browse/OAK-7122
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: run
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8
> With Document Traversal based indexing we have implemented a newer indexing logic. To
> validate that the index produced by it is the same as the one produced by the existing
> indexing flow, we need to implement a script which enables comparing the index content logically.
> This was recently discussed on the Lucene mailing list [1], and the suggestion there was that
> it can be done by un-inverting the index. To enable that, we need to implement a script which can:

> # Open a Lucene index
> # Map the Lucene Document to path of node
> # For each document determine what all fields are associated with it (stored and non
> # Dump this content to a file sorted by path, with the field names on each line sorted by name
> Such dumps can then be generated for the old and the new index and compared via a simple text diff
> [1] http://lucene.markmail.org/thread/wt22gk6aufs4uz55
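
The dump-and-diff step described in the issue can be illustrated with a minimal sketch. This is not the Groovy script linked above; it assumes the index has already been un-inverted into a plain mapping from node path to field names, and the function names, the `path|field,field` line format, and the sample paths are all hypothetical.

```python
import difflib


def dump_index(docs):
    """Render {path: field names} as text lines sorted by path.

    `docs` is an assumed in-memory stand-in for an un-inverted index;
    field names on each line are de-duplicated and sorted by name.
    """
    lines = []
    for path in sorted(docs):
        fields = ",".join(sorted(set(docs[path])))
        lines.append(f"{path}|{fields}")
    return lines


def diff_dumps(old_docs, new_docs):
    """Return unified-diff lines between two dumps (empty if logically equal)."""
    return list(
        difflib.unified_diff(
            dump_index(old_docs),
            dump_index(new_docs),
            fromfile="old-index",
            tofile="new-index",
            lineterm="",
        )
    )


# Hypothetical sample data: same documents, but one field is missing
# from /content/b in the new index.
old_index = {
    "/content/b": ["jcr:title", ":path"],
    "/content/a": [":path", "jcr:title"],
}
new_index = {
    "/content/a": ["jcr:title", ":path"],
    "/content/b": [":path"],
}

if __name__ == "__main__":
    for line in diff_dumps(old_index, new_index):
        print(line)
```

Because both paths and field names are sorted before writing, two indexes with the same logical content produce identical dumps regardless of document order inside Lucene, so any diff output points at a real difference.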

This message was sent by Atlassian JIRA
