lucene-solr-user mailing list archives

From: Shawn Heisey <s...@elyograg.org>
Subject: Re: drastic performance decrease with 20 cores
Date: Mon, 26 Sep 2011 16:33:08 GMT
On 9/26/2011 9:33 AM, Bictor Man wrote:
> Hi everyone,
>
> Sorry if this issue has been discussed before, but I'm new to the list.
>
> I have a Solr (3.4) instance running with 20 cores (around 4 million docs
> each).
> The instance is allocated 13GB on a 16GB RAM server. If I run several sets
> of queries sequentially against each of the cores, I/O activity goes very
> high, as does the system load, while CPU usage remains low.
> It takes almost 1 hour to complete the set of queries.
>
> If I stop Solr and restart it with 6GB allocated and 10 cores, after a bit
> the I/O activity goes down and CPU usage goes up, and it takes only around
> 5 minutes to complete all the sets of queries.

With 13 of your 16GB of RAM being gobbled up by the Java process running 
Solr, and some of your memory taken up by the OS itself, you've probably 
only got about 2GB of free RAM left for the OS disk cache.  Not knowing 
what kind of data you're indexing, I can only guess how big your indexes 
are, but with around 80 million total documents, I imagine they are 
MUCH larger than 2GB.

If I'm right, this means that your Solr server is unable to keep index 
data in RAM, so it ends up going out to the disk on every query, and 
that is SLOW.  The ideal situation is to have enough free memory that 
the OS can put all index data into its disk cache, making access to it 
nearly instantaneous.  You may never reach that ideal with your setup, 
but if you can get a third to half of the index into RAM, it'll 
probably still perform well.
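
If you want to see what the OS is actually doing with your memory, on a 
Linux box the 'free' command will show it.  The numbers below are made 
up just to illustrate; the "cached" column is the disk cache:

    $ free -m
                 total       used       free     shared    buffers     cached
    Mem:         16035      15820        215          0        105       1850

With a 13GB heap there just isn't much room for that cached column to 
grow, which fits the I/O-bound behavior you described.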

Do you really need to allocate 13GB to Solr?  If it crashes when you 
allocate less, you may have very large Solr caches in solrconfig.xml 
that you can reduce.  You do want to take advantage of Solr caching, but 
if you have to choose between disk caching and Solr caching, go for disk.
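
For reference, the cache definitions live in the <query> section of 
solrconfig.xml.  Something like the following is only a sketch with 
illustrative sizes, not a recommendation tuned to your data:

    <query>
      <!-- smaller caches mean a smaller heap requirement -->
      <filterCache class="solr.FastLRUCache"
                   size="512" initialSize="512" autowarmCount="128"/>
      <queryResultCache class="solr.LRUCache"
                        size="512" initialSize="512" autowarmCount="32"/>
      <documentCache class="solr.LRUCache"
                     size="512" initialSize="512"/>
    </query>

Shrinking these lets you shrink the Java heap (-Xmx), which hands 
memory back to the OS for disk caching.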

It's unusual, but not necessarily wrong, to have so many large cores on 
one machine.  Why are things set up that way?  Are you using a 
distributed index, or do you have 20 separate indexes?

The bottom line: you need more memory.  Running with 32GB or even 64GB 
would probably serve you very well.  You probably also need more 
machines.  For redundancy purposes, you'll want to have two complete 
copies of your index on separate hardware and some kind of load balancer 
with failover capability.  You may also want to look into increasing 
your I/O speed, with 15k RPM SAS drives, RAID10, or even SSD.
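
On the redundancy front, Solr's built-in replication (the 
ReplicationHandler) can keep a second copy of each core in sync over 
HTTP.  As a rough sketch, with a placeholder hostname and poll 
interval, each slave core would be configured along these lines:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <!-- hostname and core name are placeholders -->
        <str name="masterUrl">http://master-host:8983/solr/corename/replication</str>
        <str name="pollInterval">00:05:00</str>
      </lst>
    </requestHandler>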

Depending on the needs of your application, you may be able to decrease 
your index size by changing your schema and re-indexing, especially in 
the area of stored fields.  Typically what you want to do is store only 
the data required to construct a search results grid, and go to the 
original data source for full details when someone opens a specific 
result.  You can also look into changing the field types on your index 
to remove Lucene features you don't need.
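
To make that concrete, here's the kind of schema.xml change I mean. 
The field names are invented, so map them onto your own schema:

    <!-- store just enough to build a results grid -->
    <field name="id"    type="string" indexed="true" stored="true"/>
    <field name="title" type="text"   indexed="true" stored="true"/>
    <!-- searchable, but fetch the full text from the original source -->
    <field name="body"  type="text"   indexed="true" stored="false"/>

On the field-type side, attributes like omitNorms="true" (if you don't 
need length normalization) or omitTermFreqAndPositions="true" (if you 
never run phrase queries on a field) are examples of dropping Lucene 
features to save index space.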

The needs of every Solr installation are different, and even my advice 
might be wrong for your particular setup, but you can rarely go wrong by 
adding memory.

Thanks,
Shawn

