lucene-dev mailing list archives

From Andrzej Bialecki
Subject Re: Spliting the Lucene
Date Fri, 08 Dec 2006 09:01:54 GMT
howard chen wrote:
> Hi,
> A friend from Hadoop told me someone in the list has code for spliting
> the Lucene index, can anyone point me to the right place?

You are probably referring to the emails we exchanged with Dennis Kubes. In 
that case no index splitting was involved; rather, the body of 
documents to be indexed was split into parts, and each part was indexed 
separately to form many smaller indexes.
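
To illustrate the idea of splitting the input rather than the index, here 
is a minimal sketch in plain Java. It is not the code from that exchange: 
the class name CorpusPartitioner, the partition() method, and the 
round-robin assignment are all assumptions made for illustration, and the 
actual indexing of each part is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the documents to be indexed into parts
// BEFORE indexing, so each part can be built into its own small index.
class CorpusPartitioner {

    // Distribute docs round-robin over numParts buckets.
    static <T> List<List<T>> partition(List<T> docs, int numParts) {
        if (numParts <= 0) {
            throw new IllegalArgumentException("numParts must be positive");
        }
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < numParts; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < docs.size(); i++) {
            // Document i goes to bucket i mod numParts.
            parts.get(i % numParts).add(docs.get(i));
        }
        return parts;
    }
}
```

Each returned part would then be handed to its own indexer, giving one 
small index per part.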

A true index splitter doesn't exist (yet), but it shouldn't be too 
difficult to implement, just tedious - around 3 days of work ... Some 
people on this list have also contemplated a semi-splitter (which doesn't 
exist yet either) that splits the index only on segment boundaries. This 
should be much easier to implement, as it's just a question of copying 
selected segments into new places and re-creating the "segments" files, 
although this method is much less flexible than a true splitter.
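
As a rough sketch of the planning step such a semi-splitter might need, the 
plain-Java snippet below decides which whole segments go into which new 
index. Everything in it is hypothetical: the SegmentSplitPlanner class, the 
plan() method, the greedy balancing by document count, and the segment names 
used as map keys. It does not touch any Lucene API, copy any files, or 
re-create the "segments" files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical planner: assigns whole segments to target indexes.
class SegmentSplitPlanner {

    // segmentDocCounts maps a segment name to its document count.
    // Returns, for each target index, the list of segment names to copy.
    static List<List<String>> plan(Map<String, Integer> segmentDocCounts, int numParts) {
        if (numParts <= 0) {
            throw new IllegalArgumentException("numParts must be positive");
        }
        // Largest segments first, so the greedy placement balances better.
        List<Map.Entry<String, Integer>> segs = new ArrayList<>(segmentDocCounts.entrySet());
        segs.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        List<List<String>> targets = new ArrayList<>();
        long[] sizes = new long[numParts];
        for (int i = 0; i < numParts; i++) {
            targets.add(new ArrayList<>());
        }
        for (Map.Entry<String, Integer> seg : segs) {
            // Find the target that currently holds the fewest documents.
            int smallest = 0;
            for (int i = 1; i < numParts; i++) {
                if (sizes[i] < sizes[smallest]) {
                    smallest = i;
                }
            }
            targets.get(smallest).add(seg.getKey());
            sizes[smallest] += seg.getValue();
        }
        return targets;
    }
}
```

Because a segment can never be divided, the balance of the resulting indexes 
depends entirely on the existing segment sizes, which is where the loss of 
flexibility compared to a true splitter shows up.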

Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
