lucene-dev mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Indexing slower in trunk
Date Thu, 16 Jun 2011 16:05:20 GMT
OK, after more tests I'm pretty sure that the personal machine
I'm testing on is just resource-constrained, which explains the
results I mentioned before. After all, I'm running my Solr
instance, the indexing program, etc. all on one MacBook
with 1 CPU and 2 cores. The indexing program is also parsing the
XML, so it competes with Solr for those two cores.

On a proper setup, where the indexing machine is separate
from the machine(s) feeding the index process, I suspect this would
be a different story. Hmmmm, I may try that sometime too....

Best
Erick

On Tue, Jun 14, 2011 at 9:25 AM, Uwe Schindler <uwe@thetaphi.de> wrote:
> For simply removing deletes, there is also IW.expungeDeletes(), which is
> less intensive! Not sure if Solr supports this too, but as far as I know
> there is an issue open.
>
> Also please note: As soon as a segment is selected for merging (the merge
> policy may also do this depending on the number of deletes in a segment), the
> merge will reclaim the resources held by its deleted documents - that's what
> merging does. So expunging deletes once per week is a good idea if your index
> consists of very old and large segments that are rarely merged anymore and
> lots of documents are deleted from them.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerickson@gmail.com]
>> Sent: Tuesday, June 14, 2011 3:19 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Indexing slower in trunk
>>
>> Optimization used to have a very noticeable impact on search speed prior to
>> some index format changes from quite a while ago.
>>
>> At this point the effect is much less noticeable, but the thing optimize does
>> do is reclaim resources from deleted documents. If you have lots of
>> deletions, it's a good idea to periodically optimize, but in that case it's
>> often done pretty infrequently (once a day/week/month) rather than as part of
>> any ongoing indexing process.
>>
>> Best
>> Erick
>>
>> 2011/6/14 Yury Kats <yurykats@yahoo.com>:
>> > On 6/14/2011 4:28 AM, Uwe Schindler wrote:
>> >> indexing and optimizing was only a
>> >> good idea pre Lucene-2.9, now it's mostly obsolete)
>> >
>> > Could you please elaborate on this? Is optimizing obsolete in general
>> > or after indexing new documents? Is it obsolete after deletions? And
>> > what is "mostly"?
>> >
>> > Thanks!
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: dev-help@lucene.apache.org
