lucene-java-user mailing list archives

From "Ganesh" <>
Subject Re: Re: Scale up design
Date Thu, 16 Dec 2010 05:59:36 GMT

Thanks for your information.

My current stats:
250 GB of data, a 40 GB index, and 60 million records are working fine with 1 GB of RAM. We
store a minimal amount of data in the index, and we sort on date. Even on a single system,
the database is sharded.

We are planning to build a hosted solution. These stats will increase at least 10 times in
2 - 3 years. I plan to use a 64-bit system with 8 - 10 GB of RAM allocated to the JVM. Will I face any issues?
Scaling up looks attractive because backup and maintenance would be easy. If I scale out, then I
may require at least 5 systems.
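
As a rough back-of-the-envelope check on those numbers, the sketch below multiplies the current stats by 10 and estimates the memory that sorting on date might need. It assumes the date is held as one 8-byte long per document and that the sort cache keeps one entry per document; both are assumptions, not something measured on this index. The class and method names are made up for illustration.

```java
public class ScaleEstimate {
    // Decimal gigabyte, used only for printing.
    static final double GB = 1_000_000_000.0;

    // Projected value after growing by the given factor.
    public static long projected(long current, int factor) {
        return current * factor;
    }

    // Memory for a per-document sort cache: one fixed-size value per document.
    // ASSUMPTION: 8 bytes per value when the date is stored as a long.
    public static long sortCacheBytes(long docs, int bytesPerValue) {
        return docs * bytesPerValue;
    }

    public static void main(String[] args) {
        long docsNow = 60_000_000L;   // 60 million records (from the stats above)
        long indexGbNow = 40;         // 40 GB index (from the stats above)
        int growth = 10;              // "minimum 10 times in 2 - 3 years"

        long docsLater = projected(docsNow, growth);
        long indexGbLater = projected(indexGbNow, growth);

        System.out.println("Records after growth:  " + docsLater);
        System.out.println("Index GB after growth: " + indexGbLater);
        System.out.printf("Date sort cache now:   %.2f GB%n",
                sortCacheBytes(docsNow, 8) / GB);
        System.out.printf("Date sort cache later: %.2f GB%n",
                sortCacheBytes(docsLater, 8) / GB);
    }
}
```

Under these assumptions the date sort alone would take roughly 0.5 GB today and close to 5 GB at 10 times the size, which is a large share of an 8 - 10 GB heap.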

Any thoughts?


----- Original Message ----- 
From: "Toke Eskildsen" <>
To: <>
Sent: Wednesday, December 15, 2010 4:36 PM
Subject: [Bulk] Re: Scale up design

> On Wed, 2010-12-15 at 09:42 +0100, Ganesh wrote:
>> What is the advantage of going for 64 Bit.
> Larger maximum heap, more memory in the machine.
>> People claim performance and usage of more RAM.
> Yes, pointers normally take up 64bit on a 64bit machine. Depending on
> the application, the overhead can be anything from practically
> non-existing to close to 100%. You can set an option for the JVM to try
> and use smaller pointers on 64bit machines. This limits the maximum
> memory allocation in the JVM to 32GB, which seems like a fair compromise
> at this point in time.
>> In 32 Bit OS, JVM handles 1 to 1.5 GB of RAM then in case
>> of 64 Bit, Single JVM cannot use more than 1.5 GB RAM?
> Say what? When running on a 64bit, the JVM heap limit is normally the
> system's per-process memory limit. For Linux this is generally well
> above any real world hardware. For Windows it seems like you need to
> enable something:
> (note: I have no experience with 64bit Windows)
>> Please help me with some more ideas. We need to design whether
>> to scale out or scale up.
> Maybe you could describe your vision in more detail? What scale are you
> looking at? How large is your index in GB, how many documents, how fast
> do you need the searcher to respond, are you doing any sorting or
> faceting (and do you facet on a few unique values or things like title
> or author)?
> It makes little sense to try and get a single machine to handle billions
> of documents with large faceting, but it seems silly to distribute 10GB
> of index with 1 million documents. That is only a general rule of thumb; as always,
> your mileage might vary.
> For the record, our current index is 40GB/9 million records. We're doing
> sorting on title and faceting on 15 fields, out of which 2 have 4-6
> million unique values. This runs on a single machine (okay, 2, but they
> are mirrored) with 6GB of RAM and it works fine with sub-second response
> times (normally <300ms AFAIR). Our experimental setup can get by with
> 1.2GB and would thus not require 64bit.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
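
For the pointer-size point in the quoted reply, a small check like the following can show what heap a JVM was actually given and whether a planned heap fits under the 32GB limit mentioned for smaller pointers. On HotSpot the option is, as far as I know, -XX:+UseCompressedOops; the thread itself does not name it, so treat that as an assumption. The class and helper names are hypothetical.

```java
public class HeapCheck {
    // Limit quoted in the thread for the smaller-pointer mode, taken here as 32 GiB.
    static final long COMPRESSED_POINTER_LIMIT = 32L * 1024 * 1024 * 1024;

    // Converts a size in GiB to bytes.
    public static long gib(long n) {
        return n * 1024L * 1024L * 1024L;
    }

    // True if a heap of this size stays below the smaller-pointer limit.
    public static boolean fitsCompressedPointers(long heapBytes) {
        return heapBytes > 0 && heapBytes < COMPRESSED_POINTER_LIMIT;
    }

    public static void main(String[] args) {
        // Maximum heap this JVM will attempt to use (reflects -Xmx if set).
        long maxHeap = Runtime.getRuntime().maxMemory();

        // "sun.arch.data.model" is HotSpot-specific and may be absent on other JVMs.
        String bits = System.getProperty("sun.arch.data.model", "unknown");

        System.out.println("JVM data model: " + bits + " bit");
        System.out.println("Max heap (bytes): " + maxHeap);
        System.out.println("Planned 8 GB heap fits:  " + fitsCompressedPointers(gib(8)));
        System.out.println("Planned 10 GB heap fits: " + fitsCompressedPointers(gib(10)));
    }
}
```

Running it with the heap setting planned for production, for example `java -Xmx10g HeapCheck`, shows whether the JVM really received that much heap on the target machine.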