lucene-java-user mailing list archives

From miztaken
Subject Increase the performance of Indexing in Lucene
Date Thu, 19 Jul 2007 04:56:18 GMT

Hi, please help me.
It has been a month since I started trying Lucene.
My requirements are huge: I have to index and search terabytes of data.
I have questions on three topics:

1. Problem in Indexing
  As I need to index terabytes of data, after googling and visiting different
forums I adopted the following approach for indexing:

1. First I created an array of RAMDirectory instances and added the documents
   to them. After crossing a certain threshold, I dumped them to my drive as
   tempIndex1.

2. I repeated the same process until all documents were indexed on my drive
as tempIndex1, tempIndex2, ...

3. Then finally I loaded the temp directories and merged them into one main,
fully indexed directory.

4. I have used threading for this purpose too.

5. This somewhat removed the optimize() overhead of IndexWriter, as I added
the directories together only at the end.
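
For illustration, the buffer-flush-merge pattern in the steps above can be
sketched roughly as follows. This is a toy model in plain Java, not Lucene
code and not my actual implementation: a "segment" is just a sorted map from
term to document ids, and the names BatchedIndexer, flushThreshold and
mergeAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of "buffer in RAM, flush a temp index, merge everything at the end".
// Assumption: a segment is a sorted map term -> list of doc ids (no Lucene involved).
class BatchedIndexer {
    private final int flushThreshold;  // hypothetical knob: docs per in-memory batch
    private TreeMap<String, List<Integer>> buffer = new TreeMap<>();
    private int buffered = 0;
    private int nextDocId = 0;
    final List<TreeMap<String, List<Integer>>> flushed = new ArrayList<>();

    BatchedIndexer(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // Tokenize on whitespace and record the doc id once per distinct term.
    void addDocument(String text) {
        int docId = nextDocId++;
        for (String term : text.toLowerCase().split("\\s+")) {
            if (term.isEmpty()) continue;
            List<Integer> postings = buffer.computeIfAbsent(term, t -> new ArrayList<>());
            if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
                postings.add(docId);
            }
        }
        if (++buffered >= flushThreshold) flush();
    }

    // Stand-in for dumping the in-memory batch to disk as tempIndexN.
    void flush() {
        if (buffered == 0) return;
        flushed.add(buffer);
        buffer = new TreeMap<>();
        buffered = 0;
    }

    // Single merge pass at the end. Segments were flushed in doc id order,
    // so appending their postings keeps each merged list sorted.
    TreeMap<String, List<Integer>> mergeAll() {
        flush();
        TreeMap<String, List<Integer>> merged = new TreeMap<>();
        for (TreeMap<String, List<Integer>> segment : flushed) {
            for (Map.Entry<String, List<Integer>> e : segment.entrySet()) {
                merged.computeIfAbsent(e.getKey(), t -> new ArrayList<>()).addAll(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        BatchedIndexer indexer = new BatchedIndexer(2);
        indexer.addDocument("lucene index");
        indexer.addDocument("lucene search");
        indexer.addDocument("index merge");
        System.out.println(indexer.mergeAll());
    }
}
```

The point of the sketch is only that the expensive merge happens once, after
all batches are flushed, instead of after every batch.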

Am I doing this the right way, or is there another solution to boost
the indexing process?

2. Problem in searching

As Lucene doesn't support LSI and SVD for conceptual search, I first search
the Lucene index for the user's input text, retrieve the documents, expand
the query using LSI and SVD, and then search the index again.
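
The two-pass flow described above can be sketched as below. This is only an
illustration under stated assumptions: plain term frequency in the first-pass
documents stands in for the LSI/SVD step (it is not LSI), and QueryExpander
and expand are hypothetical names, not Lucene API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of query expansion: take the documents from the first search,
// pick extra terms from them, and build one ORed query for the second search.
// Assumption: raw term frequency replaces the real LSI/SVD term selection.
class QueryExpander {
    static String expand(List<String> queryTerms, List<String> topDocs, int extraTerms) {
        // Count terms in the first-pass documents that are not already in the query.
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : topDocs) {
            for (String term : doc.toLowerCase().split("\\s+")) {
                if (!term.isEmpty() && !queryTerms.contains(term)) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
        }
        // Most frequent first; ties broken alphabetically to keep the result deterministic.
        List<String> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        List<String> all = new ArrayList<>(queryTerms);
        all.addAll(ranked.subList(0, Math.min(extraTerms, ranked.size())));
        return String.join(" OR ", all);
    }

    public static void main(String[] args) {
        List<String> docs = List.of("lucene index merge", "index search lucene", "index");
        System.out.println(expand(List.of("lucene"), docs, 2));
    }
}
```
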
With only a few words in the query there doesn't seem to be a performance
problem, but when I expand the query, i.e. when the query contains ten words
ORed together, it takes an unacceptably long time to get Hits. Is this
expected, or am I missing something here too?
What are the ways to boost query performance when the query contains many
terms, especially when they are ORed? An ANDed query takes less time to
produce Hits.
I have used a single IndexSearcher, and my index is optimized as well.
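
One plausible reason for the OR/AND gap, stated as general posting-list
behaviour and not as a measurement of Lucene: an OR query has to visit every
document that matches any term, while an AND query can be driven by the
rarest term. The toy sketch below shows the difference on sorted arrays of
doc ids; PostingOps, union and intersect are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy posting lists: each int[] holds the sorted doc ids for one query term.
class PostingOps {
    // OR: every doc id from every list is a candidate, so the work grows
    // with the total number of postings across all terms.
    static List<Integer> union(List<int[]> lists) {
        TreeSet<Integer> out = new TreeSet<>();
        for (int[] list : lists) {
            for (int docId : list) out.add(docId);
        }
        return new ArrayList<>(out);
    }

    // AND: only ids from the shortest list can match, so the candidate set
    // is bounded by the rarest term; other lists are probed by binary search.
    static List<Integer> intersect(List<int[]> lists) {
        int[] shortest = lists.get(0);
        for (int[] list : lists) {
            if (list.length < shortest.length) shortest = list;
        }
        List<Integer> out = new ArrayList<>();
        for (int docId : shortest) {
            boolean inAll = true;
            for (int[] list : lists) {
                if (Arrays.binarySearch(list, docId) < 0) {
                    inAll = false;
                    break;
                }
            }
            if (inAll) out.add(docId);
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> lists = List.of(
            new int[] {1, 2, 3, 4, 5, 6},
            new int[] {2, 4, 6, 8},
            new int[] {4, 8, 12});
        System.out.println("OR hits:  " + union(lists).size());
        System.out.println("AND hits: " + intersect(lists).size());
    }
}
```

With ten ORed terms, the candidate set is the union of ten posting lists, and
each candidate also has to be scored, which fits the slowdown I am seeing.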

3. Another Problem

As I also need to dump my database tables into Lucene along with the full
text, what effect will that have on indexing and searching?
Also, I might need to change the name of a Field of a Document indexed in
Lucene; will that be possible?
I know it's not possible to change the value of a field, but is it possible
to change the name of the field, or do we have to control that externally?
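
By "control externally" I mean something like the sketch below: the index
keeps a fixed stored field name, and the application maps a renamable logical
name onto it. This is plain Java with hypothetical names (FieldAliases,
register, rename, storedName); it does not use or change anything in Lucene.

```java
import java.util.HashMap;
import java.util.Map;

// External alias table: logical (user-visible) field name -> name stored in the index.
// Renaming touches only this table, never the indexed documents.
class FieldAliases {
    private final Map<String, String> logicalToStored = new HashMap<>();

    void register(String logicalName, String storedName) {
        logicalToStored.put(logicalName, storedName);
    }

    // Move the mapping to a new logical name; the stored name stays the same.
    void rename(String oldLogical, String newLogical) {
        String stored = logicalToStored.remove(oldLogical);
        if (stored == null) {
            throw new IllegalArgumentException("unknown field: " + oldLogical);
        }
        logicalToStored.put(newLogical, stored);
    }

    // Returns the stored name to use when building a query, or null if unknown.
    String storedName(String logicalName) {
        return logicalToStored.get(logicalName);
    }

    public static void main(String[] args) {
        FieldAliases aliases = new FieldAliases();
        aliases.register("title", "f_title");
        aliases.rename("title", "headline");
        System.out.println(aliases.storedName("headline"));
    }
}
```
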

Please shed some light on these things.
Your help would be highly appreciated.
