lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Antony Bowesman <...@teamware.com>
Subject Re: Deleting a single TermPosition <doc, frequency, position> for a Document
Date Tue, 08 Jan 2008 08:00:07 GMT
Otis Gospodnetic wrote:
> Is your user field stored?  If so, you cold find the target Document, get the
> user field value, modify it, and re-add it to the Document (or something
> close to this -- I am doing this with one of the indices on simpy.com and
> it's working well).

No, it's not stored.  I'm not sure I understand how you 'modify it' as it's not 
possible to modify an existing Document or do you mean you fetch all the stored 
fields from the existing Document, delete the existing Document then add it back 
with the modified field?

I have ~20 fields per Document and most are not stored

Antony


> 
> Otis
> 
> -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> ----- Original Message ---- From: Antony Bowesman <adb@teamware.com> To:
> java-user@lucene.apache.org Sent: Tuesday, January 8, 2008 12:47:05 AM 
> Subject: Deleting a single TermPosition <doc, frequency, position> for a
> Document
> 
> I'd like to 'update' a single Document in a Lucene index.  In practice, this
>  'update' is actually just a removal of a single TermPosition for a given
> Term for a given doc Id.
> 
> I don't think this is currently possible, but would it be easy to change
> Lucene to support this type of usage?
> 
> The reason for this is to optimise my index usage.  I'm using Lucene to index
>  arbitrary data sets, however, in some data sets, each Document is indexed
> once for each user who has an interest in the document.  For example, with 
> mail data, a mail item (with a single recipient) is stored as two Documents,
> once with the 'user' field set to the sender's user Id and again with the
> user field set to the recipents's user Id.  Searches just filter mail for a
> given user by the user field.
> 
> When one of those users deletes the mail, the Document with the 'user' field
> is simply deleted.  One of the original reasons for doing this was to enable
>  horizontal partitioning of the index.  This works nicely, but of course the
>  index is bigger than necessary and the number of terms positions is at least
>  double what is necessary.
> 
> I had thought to originally indexed the data once, with the user field set to
>  the sender and recipient user Id, but when the sender or recipient deletes
> the mail from their mailbox, searching becomes more complicated as the index
> does not reflect the external database state unless the mail is reindexed.
> 
> Is this something other's have wanted or are there other solutions to this
> problem?
> 
> Thanks Antony
> 
> 
> 
> 
> --------------------------------------------------------------------- To
> unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org For additional
> commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 
> 
> 
> --------------------------------------------------------------------- To
> unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org For additional
> commands, e-mail: java-user-help@lucene.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message