lucene-solr-user mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: Partial updates?
Date Fri, 28 Aug 2009 18:42:36 GMT
Don,

I started work on this a while back and plan to resume soon.
Basically, one would be able to update individual fields in a
parallel index without reindexing the entire document. I've
seen other use cases for this as well, such as caching.
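
To make the parallel-index idea concrete, here is a rough sketch in plain Python. It is not Lucene or Solr code, and every name in it (ToyIndex, ParallelIndex, update_tags, main_writes) is made up for illustration. It assumes the large, rarely changing body text lives in one index and the small, frequently changing fields live in a second index keyed by the same document id, so updating a field rewrites only the second index.

```python
# Toy sketch of the "parallel index" idea; not Lucene or Solr code.
# All class and method names here are hypothetical.
from collections import defaultdict


class ToyIndex:
    """Minimal in-memory inverted index mapping term -> set of doc ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.doc_terms = {}

    def put(self, doc_id, text):
        # Drop any previous entry for doc_id, then index the new text.
        for term in self.doc_terms.get(doc_id, ()):
            self.postings[term].discard(doc_id)
        terms = set(text.lower().split())
        self.doc_terms[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def search(self, term):
        return set(self.postings.get(term.lower(), set()))


class ParallelIndex:
    """Two indexes that share doc ids: one for the body, one for small fields."""

    def __init__(self):
        self.main = ToyIndex()   # large, rarely changing content
        self.side = ToyIndex()   # small, frequently updated fields
        self.main_writes = 0     # counts how often the body was indexed

    def add_document(self, doc_id, body, tags=""):
        self.main.put(doc_id, body)
        self.main_writes += 1
        self.side.put(doc_id, tags)

    def update_tags(self, doc_id, tags):
        # Only the side index is rewritten; the body is not re-indexed.
        self.side.put(doc_id, tags)

    def search(self, body_term, tag_term):
        # A document matches when both indexes return its id.
        return self.main.search(body_term) & self.side.search(tag_term)
```

For example, after add_document(1, "quarterly report draft", "unread") followed by update_tags(1, "read archived"), a search for body term "report" and tag "archived" returns document 1, and the body has still been indexed only once.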

-J

On Fri, Aug 28, 2009 at 8:49 AM, Don Werve <donw@madwombat.com> wrote:
> Short version:
>
> Is there a way to either do partial updates to documents (update/add one or
> two fields only), or to search across multiple documents grouped by a
> (non-unique) key stored in a field?
>
> Long version:
>
> I've run into an issue with the way I'm indexing documents for a new
> product, and figure that somebody else has run into the same problem.  In a
> nutshell, we're building a system that deals with a lot of incoming and
> outgoing text documents (email, Word docs, short comments, etc.), grouped
> together by some common factor (basically, email threads), and want to do
> full-text search across those threads.
>
> We've settled on Solr, of course. :)
>
> Right now, I'm adding each new incoming/outgoing message as a new document,
> and can search just fine, unless I want to look for multiple terms that span
> documents.  So, "foo" is in the first document, "bar" is in the second, and
> although they both have a 'thread_id' field identifying them as belonging to
> the same group, searching for "+foo +bar" doesn't yield results (which is
> not surprising).
>
> Now, I can modify the code to store one document for each group of messages
> without too much work.  But as I understand it, this means that for every
> new message coming in, I need to hand an aggregate of all previous messages
> to the indexer, because Solr will re-create the document (which indexes the
> entire group of messages) when I do update/add.  Since there can be some
> fairly large files sitting in there (50-100M in some cases), I'd rather not
> have to shove that down Solr's pipe every time something changes.
>
> So, first question, is what I think I know about update/add correct?
>
> Second, if so, is there a way that I can update single-valued fields and
> append new multivalued fields, without having to re-index the whole
> document?
>
> Third, am I just totally wrong about the way I'm trying to do this, and is
> there a better way?
>
> Thanks-in-advance!
>
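
The cross-document case in the quoted message ("foo" in one message, "bar" in another, same thread_id) can be illustrated with a small Python sketch. This is a client-side workaround that nobody in this thread proposes, shown only to make the problem concrete: look up each term separately, collect the thread ids of the hits, and intersect them. The function name and the list-of-dicts message format are assumptions, and the inner loop stands in for what would be one search request per term.

```python
# Hypothetical client-side sketch; the message layout is an assumption.
def threads_matching_all(messages, terms):
    """Return thread ids in which every term occurs in at least one message."""
    matching = None
    for term in terms:
        # Stand-in for one per-term query returning the thread ids of its hits.
        hits = {
            m["thread_id"]
            for m in messages
            if term.lower() in m["text"].lower().split()
        }
        matching = hits if matching is None else matching & hits
    return matching or set()
```

With one message per document, a thread matches as soon as each term appears somewhere in it, even when no single message contains all of the terms.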
