lucene-java-user mailing list archives

From Ian Lea
Subject Re: Split mutable logical document into two Lucene documents
Date Thu, 08 Dec 2011 17:57:38 GMT
It is conceivable that nested documents might help.  I don't know
anything about that so might be way off target.


On Wed, Dec 7, 2011 at 8:46 PM, Brandon Mintern wrote:
> We have a document tagging system where documents are composed of two
> types of data:
> Rarely changed (hereafter: "immutable") data - document text and
> metadata that we upload and almost never change. The text can be
> hundreds of pages.
> User-created (hereafter: "mutable") data - document properties that
> are set by users of our system. In total a document's properties are
> generally several dozen bytes at most. Even viewing a document changes
> the data (e.g. the document's "viewed" property).
> At present, all data is part of a single Lucene document. The problem
> is that when any piece of mutable data is updated (this happens
> relatively frequently), we have to reindex the entire document. We'd
> like to have two separate indexed Lucene documents per logical
> document, one containing the immutable data and the other containing
> the much smaller and more transient mutable data. When the mutable
> data changes, we can delete that document's mutable Lucene document
> and index a new one very quickly.
> There are two major difficulties when actually performing a search, though:
> 1. We are providing complex queries that retrieve a logical document
> based on information in either of its Lucene documents. It seems
> non-trivial to fetch a logical document in a BooleanQuery with
> Occur.MUST clauses referring to fields in both of the Lucene
> documents.
> 2. We need to sort results (logical document IDs) based on fields in
> either of a logical document's two Lucene documents.
> Has anyone done anything like this before? Is there functionality I'm
> overlooking that could make this easier?
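For what it's worth, here is a rough sketch of the application-side join that difficulties 1 and 2 imply. It uses plain Java collections, not any Lucene API. It assumes both Lucene documents carry a shared logical ID, and that each index has already been searched separately, yielding one set of logical IDs per side. The names (LogicalJoinSketch, joinAndSort, uploadedAt) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LogicalJoinSketch {

    // Emulates a BooleanQuery with MUST clauses on both halves of a logical
    // document: keep only the logical IDs that matched on BOTH sides, then
    // sort by a key that may come from either the immutable or mutable half.
    // IDs missing from sortKeyById sort last; ties are left in arbitrary order.
    public static List<String> joinAndSort(Set<String> immutableHits,
                                           Set<String> mutableHits,
                                           Map<String, Long> sortKeyById) {
        List<String> ids = new ArrayList<>(immutableHits);
        ids.retainAll(mutableHits); // intersection == MUST on both sides
        ids.sort(Comparator.comparing(
                (String id) -> sortKeyById.getOrDefault(id, Long.MAX_VALUE)));
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical hits from the immutable index (e.g. a text match).
        Set<String> immutableHits = Set.of("doc1", "doc2", "doc3");
        // Hypothetical hits from the mutable index (e.g. viewed == false).
        Set<String> mutableHits = Set.of("doc2", "doc3", "doc4");
        // Hypothetical sort field; here it lives on the immutable side.
        Map<String, Long> uploadedAt = Map.of("doc1", 30L, "doc2", 20L, "doc3", 10L);

        System.out.println(joinAndSort(immutableHits, mutableHits, uploadedAt));
        // prints [doc3, doc2]
    }
}
```

The obvious cost is that both hit sets have to be fully collected before intersecting, so this only stays cheap while at least one side of the query is selective.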
