lucene-solr-user mailing list archives

From jim ferenczi <jim.feren...@gmail.com>
Subject Re: Very high memory and CPU utilization.
Date Mon, 02 Nov 2015 12:36:09 GMT
*I am not able to get  the above point. So when I start Solr with 28g RAM,
for all the activities related to Solr it should not go beyond 28g. And the
remaining heap will be used for activities other than Solr. Please help me
understand.*

Well, those 28GB of heap are the memory "reserved" for your Solr
application, but some parts of the index (if not all of it) are read
via mmap (if you use the default MMapDirectory), which does not use the heap
at all. This is a very important part of Lucene/Solr: the heap should be
sized in a way that leaves a significant amount of RAM available for the
index. If not, you rely on the speed of your disk. SSDs help, but reads
from SSDs are still significantly slower than direct RAM access. Another
thing to keep in mind is that mmap will always try to put things in RAM,
which is why I suspect that swap activity is killing your performance.
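
To make the arithmetic concrete, here is a rough sketch of the RAM budget using the numbers given in this thread (12 shards, 28GB heap each, about 90GB of index each, about 370GB of physical RAM). The function name is made up for illustration, and it assumes that everything outside the heaps is available for the memory-mapped index, which is optimistic since the OS and ZooKeeper need memory too.

```python
def index_ram_budget(total_ram_gb, shards, heap_per_shard_gb, index_per_shard_gb):
    """Estimate how much of the index can stay in RAM (hypothetical helper)."""
    heap_total_gb = shards * heap_per_shard_gb    # memory reserved for the JVM heaps
    index_total_gb = shards * index_per_shard_gb  # on-disk index read via mmap
    # Whatever is not heap is, at best, what the OS can use to cache the index.
    left_for_index_gb = max(total_ram_gb - heap_total_gb, 0)
    fraction_in_ram = left_for_index_gb / index_total_gb
    return heap_total_gb, index_total_gb, left_for_index_gb, fraction_in_ram


# Numbers taken from this thread.
heap_total, index_total, left_for_index, fraction = index_ram_budget(370, 12, 28, 90)

print(f"heap total:        {heap_total} GB")
print(f"index total:       {index_total} GB")
print(f"left for index:    {left_for_index} GB")
print(f"fraction in RAM:   {fraction:.1%}")
```

With these inputs only about 3% of the index fits in RAM next to the heaps, which is the "1/30" I mentioned earlier. Lowering the heap per shard is the only knob in this calculation that frees RAM for the index without adding hardware.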

2015-11-02 11:55 GMT+01:00 Modassar Ather <modather1981@gmail.com>:

> Thanks Jim for your response.
>
> The remaining size after you removed the heap usage should be reserved for
> the index (not only the other system activities).
> I am not able to get  the above point. So when I start Solr with 28g RAM,
> for all the activities related to Solr it should not go beyond 28g. And the
> remaining heap will be used for activities other than Solr. Please help me
> understand.
>
> *Also the CPU utilization goes upto 400% in few of the nodes:*
> You said that only machine is used so I assumed that 400% cpu is for a
> single process (one solr node), right ?
> Yes you are right that 400% is for single process.
> The disks are SSDs.
>
> Regards,
> Modassar
>
> On Mon, Nov 2, 2015 at 4:09 PM, jim ferenczi <jim.ferenczi@gmail.com>
> wrote:
>
> > *if it correlates with the bad performance you're seeing. One important
> > thing to notice is that a significant part of your index needs to be in
> > RAM (especially if you're using SSDs) in order to achieve good
> > performance.*
> >
> > Especially if you're not using SSDs, sorry ;)
> >
> > 2015-11-02 11:38 GMT+01:00 jim ferenczi <jim.ferenczi@gmail.com>:
> >
> > > 12 shards with 28GB for the heap and 90GB for each index means that you
> > > need at least 336GB for the heap (assuming you're using all of it which
> > > may be easily the case considering the way the GC is handling memory)
> > > and ~= 1TO for the index. Let's say that you don't need your entire
> > > index in RAM, the problem as I see it is that you don't have enough RAM
> > > for your index + heap. Assuming your machine has 370GB of RAM there are
> > > only 34GB left for your index, 1TO/34GB means that you can only have
> > > 1/30 of your entire index in RAM. I would advise you to check the swap
> > > activity on the machine and see if it correlates with the bad
> > > performance you're seeing. One important thing to notice is that a
> > > significant part of your index needs to be in RAM (especially if you're
> > > using SSDs) in order to achieve good performance:
> > >
> > >
> > >
> > > *As mentioned above this is a big machine with 370+ gb of RAM and Solr
> > > (12 nodes total) is assigned 336 GB. The rest is still a good for other
> > > system activities.*
> > > The remaining size after you removed the heap usage should be reserved
> > > for the index (not only the other system activities).
> > >
> > >
> > > *Also the CPU utilization goes upto 400% in few of the nodes:*
> > > You said that only machine is used so I assumed that 400% cpu is for a
> > > single process (one solr node), right ?
> > > This seems impossible if you are sure that only one query is played at
> > > a time and no indexing is performed. Best thing to do is to dump stack
> > > trace of the solr nodes during the query and to check what the threads
> > > are doing.
> > >
> > > Jim
> > >
> > >
> > >
> > > 2015-11-02 10:38 GMT+01:00 Modassar Ather <modather1981@gmail.com>:
> > >
> > >> Just to add one more point that one external Zookeeper instance is
> also
> > >> running on this particular machine.
> > >>
> > >> Regards,
> > >> Modassar
> > >>
> > >> On Mon, Nov 2, 2015 at 2:34 PM, Modassar Ather <
> modather1981@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi Toke,
> > >> > Thanks for your response. My comments in-line.
> > >> >
> > >> > That is 12 machines, running a shard each?
> > >> > No! This is a single big machine with 12 shards on it.
> > >> >
> > >> > What is the total amount of physical memory on each machine?
> > >> > Around 370 gb on the single machine.
> > >> >
> > >> > Well, se* probably expands to a great deal of documents, but a huge
> > bump
> > >> > in memory utilization and 3 minutes+ sounds strange.
> > >> >
> > >> > - What are your normal query times?
> > >> > Few simple queries are returned with in a couple of seconds. But the
> > >> more
> > >> > complex queries with proximity and wild cards have taken more than
> 3-4
> > >> > minutes and some times some queries have timed out too where time
> out
> > is
> > >> > set to 5 minutes.
> > >> > - How many hits do you get from 'network se*'?
> > >> > More than a million records.
> > >> > - How many results do you return (the rows-parameter)?
> > >> > It is the default one 10. Grouping is enabled on a field.
> > >> > - If you issue a query without wildcards, but with approximately the
> > >> > same amount of hits as 'network se*', how long does it take?
> > >> > A query resulting in around half a million record return within a
> > couple
> > >> > of seconds.
> > >> >
> > >> > That is strange, yes. Have you checked the logs to see if something
> > >> > unexpected is going on while you test?
> > >> > Have not seen anything particularly. Will try to check again.
> > >> >
> > >> > If you are using spinning drives and only have 32GB of RAM in total
> in
> > >> > each machine, you are probably struggling just to keep things
> running.
> > >> > As mentioned above this is a big machine with 370+ gb of RAM and
> Solr
> > >> (12
> > >> > nodes total) is assigned 336 GB. The rest is still a good for other
> > >> system
> > >> > activities.
> > >> >
> > >> > Thanks,
> > >> > Modassar
> > >> >
> > >> > On Mon, Nov 2, 2015 at 1:30 PM, Toke Eskildsen <
> > te@statsbiblioteket.dk>
> > >> > wrote:
> > >> >
> > >> >> On Mon, 2015-11-02 at 12:00 +0530, Modassar Ather wrote:
> > >> >> > I have a setup of 12 shard cluster started with 28gb memory
each
> > on a
> > >> >> > single server. There are no replica. The size of index is
around
> > >> 90gb on
> > >> >> > each shard. The Solr version is 5.2.1.
> > >> >>
> > >> >> That is 12 machines, running a shard each?
> > >> >>
> > >> >> What is the total amount of physical memory on each machine?
> > >> >>
> > >> >> > When I query "network se*", the memory utilization goes upto
> 24-26
> > gb
> > >> >> and
> > >> >> > the query takes around 3+ minutes to execute. Also the CPU
> > >> utilization
> > >> >> goes
> > >> >> > upto 400% in few of the nodes.
> > >> >>
> > >> >> Well, se* probably expands to a great deal of documents, but a
huge
> > >> bump
> > >> >> in memory utilization and 3 minutes+ sounds strange.
> > >> >>
> > >> >> - What are your normal query times?
> > >> >> - How many hits do you get from 'network se*'?
> > >> >> - How many results do you return (the rows-parameter)?
> > >> >> - If you issue a query without wildcards, but with approximately
> the
> > >> >> same amount of hits as 'network se*', how long does it take?
> > >> >>
> > >> >> > Why the CPU utilization is so high and more than one core
is
> used.
> > >> >> > As far as I understand querying is single threaded.
> > >> >>
> > >> >> That is strange, yes. Have you checked the logs to see if something
> > >> >> unexpected is going on while you test?
> > >> >>
> > >> >> > How can I disable replication(as it is implicitly enabled)
> > >> permanently
> > >> >> as
> > >> >> > in our case we are not using it but can see warnings related
to
> > >> leader
> > >> >> > election?
> > >> >>
> > >> >> If you are using spinning drives and only have 32GB of RAM in
total
> > in
> > >> >> each machine, you are probably struggling just to keep things
> > running.
> > >> >>
> > >> >>
> > >> >> - Toke Eskildsen, State and University Library, Denmark
> > >> >>
> > >> >>
> > >> >>
> > >> >
> > >>
> > >
> > >
> >
>
