lucene-java-user mailing list archives

From "Alice" <aliceli...@gmail.com>
Subject RE: Huge Index
Date Thu, 11 Jan 2007 17:16:51 GMT
I never got to index all the data because it is too slow.
I got 3 million documents indexed in 2.5 hours.

As suggested in Lucene in Action, I use a ramDir and, after I write 5000
documents, I merge them into the fsDir.
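
In case it helps, this is roughly the shape of that loop (a minimal
sketch against the Lucene 2.0-era API; the index path and the
fetchBatchFromDatabase() helper are placeholders, not my real code):

import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

public class RamBatchIndexer {

    private static final int BATCH_SIZE = 5000;

    public static void main(String[] args) throws Exception {
        // On-disk index that receives each finished RAM batch.
        Directory fsDir = FSDirectory.getDirectory("/path/to/index", true);
        IndexWriter fsWriter = new IndexWriter(fsDir, new StandardAnalyzer(), true);

        RAMDirectory ramDir = new RAMDirectory();
        IndexWriter ramWriter = new IndexWriter(ramDir, new StandardAnalyzer(), true);

        int inBatch = 0;
        for (Document doc : fetchBatchFromDatabase()) {
            ramWriter.addDocument(doc);
            if (++inBatch == BATCH_SIZE) {
                // Close the RAM writer and merge its segments into the disk index.
                ramWriter.close();
                fsWriter.addIndexes(new Directory[] { ramDir });
                ramDir = new RAMDirectory();
                ramWriter = new IndexWriter(ramDir, new StandardAnalyzer(), true);
                inBatch = 0;
            }
        }

        // Flush whatever is left in the final partial batch.
        ramWriter.close();
        if (inBatch > 0) {
            fsWriter.addIndexes(new Directory[] { ramDir });
        }
        fsWriter.optimize();
        fsWriter.close();
    }

    // Placeholder for the real database loop; it returns a couple of
    // hard-coded documents so the sketch compiles and runs on its own.
    private static List<Document> fetchBatchFromDatabase() {
        List<Document> docs = new ArrayList<Document>();
        Document doc = new Document();
        doc.add(new Field("id", "1", Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("body", "sample text", Field.Store.NO, Field.Index.TOKENIZED));
        docs.add(doc);
        return docs;
    }
}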

The merge factor is now 100; I tried other values, but they didn't make
much difference.
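
For reference, these are the writer settings I have been experimenting
with (again only a sketch against the Lucene 2.0-era API; the merge
factor of 100 is what I actually use, while the buffer size and
compound-file setting are just illustrative values):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class WriterTuning {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.getDirectory("/path/to/index", true),
                new StandardAnalyzer(), true);

        // How many segments get merged at once; 100 is the value I use now.
        writer.setMergeFactor(100);

        // How many documents are buffered in RAM before a segment is
        // flushed to disk; 10000 here is only an example value.
        writer.setMaxBufferedDocs(10000);

        // Writing non-compound segments trades open file handles for speed.
        writer.setUseCompoundFile(false);

        // ... addDocument() calls would go here ...

        writer.optimize();
        writer.close();
    }
}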

-----Original Message-----
From: Grant Ingersoll [mailto:gsingers@apache.org] 
Sent: Thursday, January 11, 2007 15:07
To: java-user@lucene.apache.org
Subject: Re: Huge Index

Hi Alice,

Can you define slow (hours, days, months, and on what hardware)? Have you
done any profiling, etc. to see where the bottlenecks are? What size
documents are you talking about here? What are your merge factors, etc.?

Thanks,
Grant

On Jan 11, 2007, at 10:47 AM, Alice wrote:

> Hello!
>
> I have to index 37 million documents retrieved from the database.
>
> I was trying to do it by loading batches of 10,000 records, but it is
> too slow.
>
> Could anybody suggest a better way to get all the data indexed in a
> reasonable time?
>
> Thanks

--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

