lucene-solr-user mailing list archives

From Manohar Sripada <manohar...@gmail.com>
Subject Re: How to select the correct number of Shards in SolrCloud
Date Fri, 16 Jan 2015 12:46:34 GMT
Thanks Daniel and Shawn for your valuable suggestions,

Daniel,
If you have a query and it needs to get results from 64 cores, if 63 return
in 100ms but the last core is in GC pause and takes 500ms, your query will
take just over 500ms.
> There is only a single JVM running per machine. I will get the QTime from
each Solr core and check whether this is the root cause.
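
The point quoted above (a distributed query is only as fast as its slowest
shard) can be illustrated with a minimal sketch. The function name and the
merge_overhead_ms parameter are hypothetical, and the numbers are the ones
from Daniel's example, not measurements:

```python
def distributed_query_time(shard_times_ms, merge_overhead_ms=0):
    """Estimate fan-out query latency: the coordinating node has to wait
    for every shard, so the slowest shard sets the floor.

    merge_overhead_ms is an assumed, illustrative cost for merging the
    per-shard results; it is not a value reported by Solr.
    """
    if not shard_times_ms:
        raise ValueError("need at least one shard time")
    return max(shard_times_ms) + merge_overhead_ms


# 63 cores answer in 100 ms, one is stuck in a GC pause for 500 ms.
shard_times = [100] * 63 + [500]
print(distributed_query_time(shard_times))
```

With more shards per query, the chance that at least one of them is slow at
any given moment goes up, which is one reason fewer shards can show a lower
QTime.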

Lastly, you mentioned you allocated 32Gb to "solr", do you mean to the
JVM heap?
That's quite a lot of a 64Gb machine, you haven't left much for the page
cache.
> Yes, 32GB to Solr's JVM heap. I wanted to enable the Filter and FieldValue
caches, as most of my search queries revolve around filters and facets.
Also, I am planning to use the Document cache.

Shawn,
Each server has 8 CPU cores and 64GB of RAM.  Solr requires a 6GB heap
> Can you please tell me the size of your index? And what is the
size of each large cold shard?
> Can you please suggest any tool that you use for collecting
statistics, such as the QTimes for the queries?

Thanks,
Manohar


On Fri, Jan 16, 2015 at 3:23 PM, Shawn Heisey <apache@elyograg.org> wrote:

> On 1/15/2015 10:58 PM, Manohar Sripada wrote:
> > The reason I have created 64 Shards is there are 4 CPU cores on each VM;
> > while querying I can make use of all the CPU cores. On an average, Solr
> > QTime is around 500ms here.
> >
> > Last time to my other discussion, Erick suggested that I might be over
> > sharding, So, I tried reducing the number of shards to 32 and then 16. To
> > my surprise, it started performing better. It came down to 300 ms (for 32
> > shards) and 100 ms (for 16 shards). I haven't tested with filters and
> > facets yet here. But, the simple search queries had shown lot of
> > improvement.
> >
> > So, how come the less number of shards performing better?? Is it because
> > there are less number of posting lists to search on OR less merges that are
> > happening? And how to determine the correct number of shards?
>
> Daniel has replied with good information.
>
> One additional problem I can think of when there are too many shards: If
> your Solr server is busy enough to have any possibility of simultaneous
> requests, then you will find that it's NOT a good idea to create enough
> shards to use all your CPU cores.  In that situation, when you do a
> single query, all your CPU cores will be in use.  When multiple queries
> happen at the same time, they have to share the available CPU resources,
> slowing them down.  With a smaller number of shards, the additional CPU
> cores can handle simultaneous queries.
>
> I have an index with nearly 100 million documents.  I've divided it into
> six large cold shards and one very small hot shard.  It's not SolrCloud.
>  I put three large shards on each of two servers, and the small shard on
> one of those two servers.  The distributed query normally happens on the
> server without the small shard.  Each server has 8 CPU cores and 64GB of
> RAM.  Solr requires a 6GB heap.
>
> My median QTime over the last 231836 queries is 25 milliseconds and my
> 95th percentile QTime is 376 milliseconds.  My query rate is pretty low
> - I've never seen Solr's statistics for the 15 minute query rate go
> above a single digit per second.
>
> Thanks,
> Shawn
>
>
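
The median and 95th percentile figures quoted above can be computed from any
collection of QTime values. The following is a minimal sketch, not the tool
Shawn uses (the thread does not say which one that is). It assumes the QTimes
have already been gathered into a list, and the function name is hypothetical.
It uses the nearest-rank definition of a percentile; other definitions
interpolate and give slightly different numbers.

```python
import statistics


def qtime_summary(qtimes_ms):
    """Return the median and nearest-rank 95th percentile of QTimes (ms)."""
    if not qtimes_ms:
        raise ValueError("need at least one QTime")
    ordered = sorted(qtimes_ms)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with
    # integer arithmetic to avoid floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


# Illustrative values only: 100 made-up QTimes of 1..100 ms.
summary = qtime_summary(list(range(1, 101)))
print(summary["median"], summary["p95"])
```

Looking at both numbers matters: a low median with a high 95th percentile
means most queries are fast but a noticeable tail is slow.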
