lucene-solr-user mailing list archives

From Lukas Weiss <Lukas.We...@raiffeisen.it>
Subject High CPU usage with Solr 7.7.0
Date Wed, 27 Feb 2019 10:04:00 GMT
Hello,

We recently updated our Solr server from 6.6.5 to 7.7.0. Since then, we
have had problems with the server's CPU usage.
We have two Solr cores configured, but even if we clear all indexes and do
not start the indexing process, we see 100% CPU usage for both cores.

Here's what our top says:

root@solr:~ # top
top - 09:25:24 up 17:40,  1 user,  load average: 2,28, 2,56, 2,68
Threads:  74 total,   3 running,  71 sleeping,   0 stopped,   0 zombie
%Cpu0  :100,0 us,  0,0 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu1  :100,0 us,  0,0 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu2  : 11,3 us,  1,0 sy,  0,0 ni, 86,7 id,  0,7 wa,  0,0 hi,  0,3 si,  0,0 st
%Cpu3  :  3,0 us,  3,0 sy,  0,0 ni, 93,7 id,  0,3 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem :  8388608 total,  7859168 free,   496744 used,    32696 buff/cache
KiB Swap:  2097152 total,  2097152 free,        0 used.  7859168 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND               P
10209 solr      20   0 6138468 452520  25740 R 99,9  5,4  29:43.45 java -server -Xms1024m -Xmx1024m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4 + 24
10214 solr      20   0 6138468 452520  25740 R 99,9  5,4  28:42.91 java -server -Xms1024m -Xmx1024m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4 + 25
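
One way to narrow this down is to match the two busy thread IDs against a JVM thread dump. A HotSpot thread dump (for example from `jstack`) prints each thread's native ID in hexadecimal as `nid=0x...`, so the decimal IDs from top have to be converted first. A minimal sketch in Python, assuming the PID column above holds thread IDs (the `Threads:` summary line suggests top was in thread mode); the helper name `to_nid` is made up for this illustration:

```python
# Thread IDs copied from the PID column of the top output above.
# Assumption: top was in thread mode, so these are thread IDs, not process IDs.
busy_thread_ids = [10209, 10214]


def to_nid(tid):
    """Format a decimal Linux thread ID the way HotSpot thread dumps show it."""
    return f"nid={tid:#x}"


for tid in busy_thread_ids:
    # Search the thread dump for the printed string to find the Java thread name.
    print(tid, "->", to_nid(tid))
```

Searching the thread dump for the two resulting `nid=` strings should show which Java threads are consuming the CPU time.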

The Solr server is installed on Debian Stretch 9.8 (64-bit) in a dedicated
Linux LXC container.

Some more server info:

root@solr:~ # java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

root@solr:~ # free -m
              total        used        free      shared  buff/cache   available
Mem:           8192         484        7675         701          31        7675
Swap:          2048           0        2048

We also found something strange: when we run strace on the main process, we
get a continuous stream of "Connection timed out" (ETIMEDOUT) results:

root@solr:~ # strace -F -p 4136
strace: Process 4136 attached with 48 threads
strace: [ Process PID=11089 runs in x32 mode. ]
[pid  4937] epoll_wait(139,  <unfinished ...>
[pid  4936] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4909] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4618] epoll_wait(136,  <unfinished ...>
[pid  4576] futex(0x7ff61ce66474, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  4279] futex(0x7ff61ce62b34, FUTEX_WAIT_PRIVATE, 2203, NULL <unfinished ...>
[pid  4244] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4227] futex(0x7ff56c71ae14, FUTEX_WAIT_PRIVATE, 2237, NULL <unfinished ...>
[pid  4243] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4228] futex(0x7ff5608331a4, FUTEX_WAIT_PRIVATE, 2237, NULL <unfinished ...>
[pid  4208] futex(0x7ff61ce63e54, FUTEX_WAIT_PRIVATE, 5, NULL <unfinished ...>
[pid  4205] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4204] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4196] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4195] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4194] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4193] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4187] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...>
[pid  4180] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4179] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4177] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4174] accept(133,  <unfinished ...>
[pid  4173] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4172] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4171] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...>
[pid  4165] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4164] futex(0x7ff61c1f5054, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...>
[pid  4163] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4162] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4161] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4160] futex(0x7ff623d52c20, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff <unfinished ...>
[pid  4159] futex(0x7ff61c1e9d54, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid  4158] futex(0x7ff61c1b7f54, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  4157] futex(0x7ff61c1b5554, FUTEX_WAIT_PRIVATE, 19, NULL <unfinished ...>
[pid  4156] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4155] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  4153] futex(0x7ff61c06c754, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid  4152] futex(0x7ff61c06ab54, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...>
[pid  4151] futex(0x7ff61c068f54, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid  4150] futex(0x7ff61c067354, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid  4148] futex(0x7ff61c024a54, FUTEX_WAIT_PRIVATE, 403, NULL <unfinished ...>
[pid  4165] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564856, tv_nsec=849859736}, 0xffffffff <unfinished ...>
[pid  4147] futex(0x7ff61c022e54, FUTEX_WAIT_PRIVATE, 415, NULL <unfinished ...>
[pid  4146] futex(0x7ff61c021254, FUTEX_WAIT_PRIVATE, 397, NULL <unfinished ...>
[pid  4145] futex(0x7ff61c01f654, FUTEX_WAIT_PRIVATE, 405, NULL <unfinished ...>
[pid  4144] futex(0x7ff61c00e354, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  4136] futex(0x7ff624b729d0, FUTEX_WAIT, 4144, NULL <unfinished ...>
[pid  4165] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564856, tv_nsec=900162344}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564856, tv_nsec=950365105}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=586325}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=50791977}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=100997890}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=151206817}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=201402531}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=251616284}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=301813556}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=352036802}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=402239182}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=452439835}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=502635489}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=552844020}, 0xffffffff <unfinished ...>
[pid  4156] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid  4156] futex(0x7ff61c1aba28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4156] futex(0x7ff61c1aba54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564858, tv_nsec=506449064}, 0xffffffff <unfinished ...>
[pid  4165] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=603013734}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid  4165] futex(0x7ff61c1f7a28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4165] futex(0x7ff61c1f7a54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=32564857, tv_nsec=653149664}, 0xffffffff^Cstrace: Process 4136 detached
strace: Process 4144 detached
strace: Process 4145 detached
strace: Process 4146 detached
strace: Process 4147 detached
strace: Process 4148 detached
strace: Process 4150 detached
strace: Process 4151 detached
strace: Process 4152 detached
strace: Process 4153 detached
....
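
A note on reading the trace above: the ETIMEDOUT results come from `futex` calls made with a timeout, and "Connection timed out" is only the generic error text for that code, so they may be expiring timed waits rather than network timeouts. The spacing of the deadlines passed by thread 4165 can be checked with a short Python sketch. The sample holds the first five deadlines copied from the trace, and the helper name `deadline_gaps_ms` is made up for this illustration:

```python
import re

# First five absolute futex deadlines of thread 4165, copied from the strace
# output above (tv_sec / tv_nsec pairs).
SAMPLE = """
{tv_sec=32564856, tv_nsec=849859736}
{tv_sec=32564856, tv_nsec=900162344}
{tv_sec=32564856, tv_nsec=950365105}
{tv_sec=32564857, tv_nsec=586325}
{tv_sec=32564857, tv_nsec=50791977}
"""


def deadline_gaps_ms(text):
    """Return the gaps, in milliseconds, between consecutive deadlines."""
    stamps = [
        int(sec) * 1_000_000_000 + int(nsec)
        for sec, nsec in re.findall(r"tv_sec=(\d+), tv_nsec=(\d+)", text)
    ]
    return [(later - earlier) / 1e6 for earlier, later in zip(stamps, stamps[1:])]


gaps = deadline_gaps_ms(SAMPLE)
print([round(gap, 1) for gap in gaps])
```

For this sample the gaps come out at roughly 50 ms each. That pattern looks like a thread waking on a fixed 50 ms timer and sleeping in between, which by itself would not obviously account for a CPU pinned at 100%.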


Could you help us to determine what's wrong with our setup?

Thank you very much,

Kind regards
Lukas Weiss