nutch-dev mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: IndexSorter optimizer
Date Thu, 22 Dec 2005 07:07:26 GMT
Jeff Bowden wrote:

> Andrzej Bialecki wrote:
>
>> Hi,
>>
>> I'm happy to report that further tests performed on a larger index 
>> seem to show that the overall impact of the IndexSorter is definitely 
>> positive: performance improvements are significant, and the overall 
>> quality of results seems at least comparable, if not actually better.
>
>
>
> This is very interesting.  What's the computational complexity and 
> disk I/O for index sorting as compared to other operations on an index 
> (e.g. adding/deleting N documents and running optimize)?


Comparable to optimize(). All index data needs to be read and copied, so 
the whole process is I/O bound.
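
A toy sketch of why the cost is comparable to optimize(), under stated assumptions: the index is modelled as an in-memory list of records, the sort key is a per-document score, and the names `Record` and `sort_index` are hypothetical, not part of Nutch or Lucene. Computing the new order is cheap next to the copy; every record's stored data still has to be read and written once.

```python
from typing import List, Tuple

# Hypothetical record layout: (doc_id, score, stored_data).
Record = Tuple[int, float, bytes]


def sort_index(records: List[Record]) -> Tuple[List[Record], int]:
    """Return a copy of the records ordered by descending score,
    renumbered from 0, together with the number of data bytes copied."""
    # Deciding the new order only needs the scores.
    order = sorted(range(len(records)), key=lambda i: records[i][1], reverse=True)

    out: List[Record] = []
    copied = 0
    # Every record is read and rewritten exactly once, whatever the order,
    # so the copy scales with the total size of the index.
    for new_id, old_pos in enumerate(order):
        _, score, data = records[old_pos]
        out.append((new_id, score, data))
        copied += len(data)
    return out, copied


index = [(0, 0.2, b"aaaa"), (1, 0.9, b"bb"), (2, 0.5, b"cccccc")]
sorted_index, bytes_copied = sort_index(index)
```

In this model `bytes_copied` always equals the total size of the stored data, which is the sense in which the process is I/O bound.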

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


