lucy-user mailing list archives

From Bob Bruen <br...@coldrain.net>
Subject Re: [lucy-user] Large index sizes
Date Thu, 25 Apr 2013 13:16:34 GMT

Hi,

I have indexed millions of files, ending up with a 127G index, which
works fine. The machine has enough resources for a job of that size.

I also tried to do the same with tens of millions of files, but the
indexing process could never finish, even with enough resources (index
size ~400G). It kept updating one file a tiny bit every few minutes. I
think I could do a better job in the code, but I have not been able to
get back to it yet.

             -bob


On Thu, 25 Apr 2013, Edwin Crockford wrote:

> I have recently started to use Lucy (with Perl), and everything went well
> until I tried to index a large file store (>300,000 files). The indexer
> process grew beyond 8 GB and the machine ran out of resources. My
> questions are:
>
> a) Are these normal resource requirements?
>
> b) Is there a way to avoid swamping machines?
>
> I also found that the searcher becomes very large for large indexes, and
> since ours runs as part of a FastCGI process, it exceeded the process's
> ulimit. Raising the ulimit fixed this, but diagnosing the issue was
> difficult: the query simply returned 0 results rather than indicating
> that it had run out of process space.
>
> Many thanks
>
> Edwin Crockford
>
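
On the diagnosis point: one thing worth trying is wrapping the searcher
in eval {} so a failure while opening or searching the index is
distinguishable from a legitimate zero-hit result. A rough sketch, where
the index path and query string are placeholders:

    use strict;
    use warnings;
    use Lucy::Search::IndexSearcher;

    # eval {} surfaces open/allocation failures that could otherwise
    # be mistaken for "0 hits".
    my $hits = eval {
        my $searcher = Lucy::Search::IndexSearcher->new(
            index => '/path/to/index',
        );
        $searcher->hits(query => 'foo', num_wanted => 10);
    };
    if ($@) {
        warn "search failed (check ulimit / memory): $@";
    }
    elsif ($hits->total_hits == 0) {
        print "genuinely no matches\n";
    }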

-- 
Dr. Robert Bruen
Cold Rain Labs
http://coldrain.net/bruen
+1.802.579.6288

