lucene-java-user mailing list archives

From Li Li <>
Subject Re: How to manage resource out of index?
Date Wed, 07 Jul 2010 06:30:05 GMT
thank you.

2010/7/7 Rebecca Watson <>:
> hi li,
> i looked at doing something similar - where we only index the text
> but retrieve search results / highlight from files -- we ended up giving
> up because of the amount of customisation required in solr -- mainly
> because we wanted the distributed search functionality in solr (which
> meant making sure the original file ended up on the same file system,
> i.e. machine, too!).
> we ended up just storing the main text field too even though there was a
> bit of text -- in the end solr/lucene can handle the index size fine and
> disk space is cheaper than man-hours to customise solr/lucene to work
> in this way!
> that was our conclusion anyway and it works fine -- we also have
> separate index / search server(s) so we don't care about merge time
> either -- and as i said above - we use the distributed search so don't tend
> to need to merge very large indexes anyway.
> (when your system grows / you go into production you'll probably split
> the indexes too, to use solr's distributed search func. for the sake of
> query speed).
> hope that helps,
> bec :)
> On 7 July 2010 14:07, Li Li <> wrote:
>> I used to store the full text in the Lucene index. But I found merging
>> very slow, because merging 2 segments copies their fdt files into a
>> new one. So I want to only index the full text, not store it. But when
>> searching I need the full text for features such as highlighting and
>> viewing the full text. I can store the full text as <url, full text>
>> pairs in a database and load it into memory. Then when I search in
>> Lucene (or Solr), I retrieve the URL of the doc first, then use the URL
>> to get the full text. But when they are stored separately, they are hard
>> to manage. They may not be consistent with each other. Does Lucene or
>> Solr provide any method to ease this problem? Or does anyone have
>> experience with this problem?
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
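
For reference, the <url, full text> scheme from the original question, and the consistency worry it raises, can be sketched in plain Java. This is only an illustration under assumptions: it uses neither Lucene nor Solr, the two maps stand in for the database and for small stored fields kept in the index, and every name in it (ExternalTextStore, add, updateDatabaseOnly, fetchForHit) is hypothetical. The idea is to keep a checksum of the text next to the URL at index time, so a stale or missing database row can be detected at search time.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: full text lives outside the index, keyed by URL.
class ExternalTextStore {
    // Stand-in for the database table of <url, full text> pairs.
    private final Map<String, String> textByUrl = new HashMap<>();
    // Stand-in for small stored fields in the index: url -> checksum of the indexed text.
    private final Map<String, String> checksumByUrl = new HashMap<>();

    // Hex-encoded SHA-256 of the text, used to detect index/database drift.
    public static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // "Index" a document: record its checksum and write the text to the external store.
    public void add(String url, String fullText) {
        textByUrl.put(url, fullText);
        checksumByUrl.put(url, sha256Hex(fullText));
    }

    // Simulates the failure mode: the database changes but the index is not rebuilt.
    public void updateDatabaseOnly(String url, String newText) {
        textByUrl.put(url, newText);
    }

    // Given the URL of a search hit, return the text only if it still matches the index.
    public Optional<String> fetchForHit(String url) {
        String expected = checksumByUrl.get(url);
        String text = textByUrl.get(url);
        if (expected == null || text == null) {
            return Optional.empty();
        }
        return expected.equals(sha256Hex(text)) ? Optional.of(text) : Optional.empty();
    }

    public static void main(String[] args) {
        ExternalTextStore store = new ExternalTextStore();
        store.add("http://example.com/a", "some full text");
        System.out.println(store.fetchForHit("http://example.com/a").isPresent());
        store.updateDatabaseOnly("http://example.com/a", "edited text");
        System.out.println(store.fetchForHit("http://example.com/a").isPresent());
    }
}
```

A mismatch only tells you the two stores have drifted apart; what to do next (reindex the document, or serve the text without highlighting) is a separate decision.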
