kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bill Bejeck <b...@confluent.io>
Subject Re: Kafka streams (2.1.1) - org.rocksdb.RocksDBException:Too many open files
Date Fri, 12 Jul 2019 15:40:26 GMT
Hi Thameem,

> Our topology has 3100 tasks with 3 brokers and 8 partitions.

I have a question that is somewhat orthogonal to your original item.  I may
be missing something, but are you saying your streams application has 3,100
tasks from 8 input partitions? If that is the case, do you mind sharing
your topology? It seems like you have several repartition operations and
you may be able to reduce the number of repartition operations depending on
what you are doing.

Thanks,
Bill

On Thu, Jul 11, 2019 at 11:24 PM Thameem Ansari <thameema@gmail.com> wrote:

> Open files in the system was set to 10Million and users limit is 1million.
> When the process was active i was closely watching the open files and it
> was around 400K so its well within the set limit. Rocksdb open files we
> tried setting 100, 50, -1 but the results are same.
> I am using Rocksdb config setter and not options (not sure what it is)
> file.
>
> Quick question: Our topology has 3100 tasks with 3 brokers and 8
> partitions. But we are using only one streaming application instance with
> single thread. Does this cause any issue? Today i am going to try with 2 or
> 3 instances and see whether we can get rid of the issue.
>
> Thanks
> Thameem
>
>
> > On Jul 12, 2019, at 1:57 AM, Sophie Blee-Goldman <sophie@confluent.io>
> wrote:
> >
> > I believe the "resource temporarily unavailable" actually is related to
> the
> > open files, most likely you are hitting the total file descriptor limit.
> > Sorry if you mentioned this and I missed it, but what was the
> > max.open.files in your RocksDBConfigSetter when you ran this? Actually,
> > could you just include your full RocksDBConfigSetter implementation?
> >
> > Not sure why Rocks would be spawning so many child processes...it does
> use
> > background threads for compaction but this sounds like something totally
> > different. I also notice a read "This is a RocksDB options file" -- are
> you
> > trying to use an options file vs setting the options with the config
> setter
> > (or otherwise using Rocksdb outside of Streams?)
> >
> > Have you tried cleaning up the state between runs? Are you using
> > interactive queries?
> >
> > Regarding the .checkpoint file not found thing -- this is an annoying but
> > pretty much harmless bug we only recently figured out. There's an open PR
> > for it but for now you can just ignore the warning. See KAFKA-5998
> > <https://issues.apache.org/jira/browse/KAFKA-5998 <
> https://issues.apache.org/jira/browse/KAFKA-5998>> for a long history (but
> > scroll to the bottom for the actual explanation)
> >
> >
> > On Thu, Jul 4, 2019 at 11:02 AM Thameem Ansari <thameema@gmail.com
> <mailto:thameema@gmail.com>> wrote:
> >
> >> i have few more details to share from today’s testing.
> >>
> >> Attached strace to the process and noticed that there are thousands of
> >> short lived child processes have been created by the stream application.
> >> Not sure whether rocksdb is playing any role here. Noticed more than
> 73000
> >> child processes were created and this is after i increased the default
> >> linux system limit from 32767 to 4million per PID. Appreciate if someone
> >> answer the following questions.
> >> 1. Why rocksdb creates these many processes? Each process is trying to
> >> read the contents of the file and getting EAGAIN (Resource temporarily
> >> unavailable)
> >> 2. Noticed very high number of .checkpoint files missing and hence “No
> >> such file or directory” message
> >> 3. The child processes were created using “clone” system call. Not sure
> >> whether there is any significance of this but wanna mention. Moreover
> these
> >> child processes were keep on increasing when the load is getting applied
> >> continuously. Not sure under what condition it will hit the ceiling.
> >> 4. Noticed very high VIRT memory usage (around 31G) but RES memory was
> >> only 1.9G. Isn’t every file open/close consume memory? But it is not
> going
> >> up even though the number of processes is increased.
> >>
> >> Thanks
> >> Thameem
> >>
> >>> On Jul 4, 2019, at 9:28 AM, Thameem Ansari <thameema@gmail.com> wrote:
> >>>
> >>> Sorry for reposting the previous message as the images didn’t come
> >> thru.. pasting as text.
> >>>
> >>> I have changed both system and user limits.
> >>>
> >>> To completely isolate the problem, I have tried the application in
> >> Centos 7 environment. Set the ulimit to 1million and system limits to
> >> 10million open files. Now 3 kafka nodes are running in separate servers
> >> each and streaming application is running in a dedicated VM. Now the
> >> application is not explicitly throwing “too many open files” error but
> >> automatically aborted with the message
> >>>
> >>> terminate called after throwing an instance of 'std::system_error'
> >>>  what():  Resource temporarily unavailable
> >>> ./bin/start.sh: line 42:  1100 Aborted
> >>>
> >>> Here is the last few lines from strace output which shows the aborted
> >> message.
> >>>
> >>> 25721 14:58:35
> >>
> open("/home/devops/data/kafka-streams/RawLog_Processor/393_7/MAIL.INCOMING_FILTER_BY_USER_MAIL_INCOMING_DOMAIN_TRAFFIC_DETAIL.DETECTED_CONTENT_FILTER_RECIPIENT_DOMAIN/MAIL.INCOMING_FILTER_BY_USER_MAIL_INCOMING_DOMAIN_TRAFFIC_DETAIL.DETECTED_CONTENT_FILTER_RECIPIENT_DOMAIN.1555200000/MANIFEST-000007",
> >> O_RDONLY|O_CLOEXEC) = 12505
> >>> 25721 14:58:35
> >> open("/sys/devices/virtual/block/dm-2/queue/logical_block_size",
> O_RDONLY)
> >> = 12506
> >>> 25721 14:58:35 read(12506, "512\n", 4096) = 4
> >>> 25721 14:58:35 close(12506)             = 0
> >>> 25721 14:58:35 write(12502, "s.advise_random_on_open: 0\n2019/"...,
> >> 4096) = 4096
> >>> 25721 14:58:35 write(12502, "ions.comparator: leveldb.Bytewis"...,
> 4096)
> >> = 4096
> >>> 25721 14:58:35 read(12505,
> >> "V\371\270\370\34\0\1\1\32leveldb.BytewiseCompara"..., 32768) = 192
> >>> 25721 14:58:35 read(12505, "", 28672)   = 0
> >>> 25721 14:58:35 close(12505)             = 0
> >>> 17701 14:58:35
> >>
> open("/home/devops/data/kafka-streams/RawLog_Processor/393_7/MAIL.INCOMING_FILTER_BY_USER_MAIL_INCOMING_DOMAIN_TRAFFIC_DETAIL.DETECTED_CONTENT_FILTER_RECIPIENT_DOMAIN/MAIL.INCOMING_FILTER_BY_USER_MAIL_INCOMING_DOMAIN_TRAFFIC_DETAIL.DETECTED_CONTENT_FILTER_RECIPIENT_DOMAIN.1555200000/000006.sst",
> >> O_RDONLY|O_CLOEXEC) = 12505
> >>> 17702 14:58:35 +++ exited with 0 +++
> >>> 25721 14:58:35 write(2, "terminate called after throwing "..., 48) = 48
> >>> 25721 14:58:35 write(2, "std::system_error", 17) = 17
> >>> 25721 14:58:35 write(2, "'\n", 2)       = 2
> >>> 25721 14:58:35 write(2, "  what():  ", 11) = 11
> >>> 25721 14:58:35 write(2, "Resource temporarily unavailable", 32) = 32
> >>> 25721 14:58:35 write(2, "\n", 1)        = 1
> >>> 17701 14:58:35
> >> open("/sys/devices/virtual/block/dm-2/queue/logical_block_size",
> O_RDONLY)
> >> = 12506
> >>> 25721 14:58:35 write(12502, "ction_dynamic_level_bytes: 0\n201"...,
> >> 3672) = 3672
> >>> 25721 14:58:35 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL,
> >> si_pid=25697, si_uid=1000} ---
> >>> 17701 14:58:35 read(12506,  <unfinished ...>
> >>> 17701 14:58:36 +++ killed by SIGABRT +++
> >>> 17700 14:58:36 +++ killed by SIGABRT +++
> >>> 17699 14:58:36 +++ killed by SIGABRT +++
> >>>
> >>>
> >>> As I can see from the open files they are way lower (45201) than the
> max
> >> limit and hence can we rule out the “open files” is the root cause?
> >>>
> >>> I have also noticed there is lot of EAGAIN (count 3334) and associated
> >> “Resource temporarily unavailable” messages as well
> >>>
> >>>
> >>>
> >>> 25732 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25732 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25732 14:49:23 close(59)                = 0
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25709 14:49:23 close(59)                = 0
> >>> 25721 14:49:23 write(50, "\0", 1)       = 1
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25709 14:49:23 close(59)                = 0
> >>> 25721 14:49:23 read(17,
> >> "PK\3\4\n\0\0\10\10\0\343SHNM\n4>Q\2\0\0\265\5\0\0A\0\0\0", 30) = 30
> >>> 25721 14:49:23 read(17,
> >>
> "\245T]O\23A\24=\323\226\326-\253\255\250\365\v\4\265ji\221E\344\305\200M\f\201\304\244j"...,
> >> 593) = 593
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25709 14:49:23 close(59)                = 0
> >>> 25732 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25732 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25732 14:49:23 close(59)                = 0
> >>> 25721 14:49:23 read(49, "\0", 16)       = 1
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25709 14:49:23 close(59)                = 0
> >>> 25732 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25732 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25732 14:49:23 close(59)                = 0
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25709 14:49:23 close(59)                = 0
> >>> 25721 14:49:23 read(54, "\0\1dx", 4)    = 4
> >>> 25721 14:49:23 read(54,
> >> "\0\0\0\v\0\0\0\0\0\0\0\1dj\0\0\0\0\1\212\0\201RawLog_Pro"..., 91256) =
> >> 91256
> >>> 25721 14:49:23 read(54, 0x7f7044003b80, 4) = -1 EAGAIN (Resource
> >> temporarily unavailable)
> >>> 25732 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25732 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25721 14:49:23 write(35, "2019-07-03 14:49:23.258 [rawLP1-"..., 219
> >> <unfinished ...>
> >>> 25732 14:49:23 close(59)                = 0
> >>> 25721 14:49:23 <... write resumed> )    = 219
> >>> 25721 14:49:23 write(1, "2019-07-03 14:49:23.258 [rawLP1-"..., 219) =
> 219
> >>> 25721 14:49:23 write(50, "\0", 1)       = 1
> >>> 25721 14:49:23 read(49, "\0", 16)       = 1
> >>> 25708 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 60
> >>> 25708 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25708 14:49:23 close(59)                = 0
> >>> 25709 14:49:23 read(60, "9223372036854775807\n", 4096) = 20
> >>> 25709 14:49:23 close(60)                = 0
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25709 14:49:23 close(59)                = 0
> >>> 25708 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25708 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25708 14:49:23 close(59)                = 0
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25709 14:49:23 close(59)                = 0
> >>> 25732 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25732 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>> 25732 14:49:23 close(59)                = 0
> >>> 25709 14:49:23 open("/sys/fs/cgroup/memory/memory.limit_in_bytes",
> >> O_RDONLY) = 59
> >>> 25709 14:49:23 read(59, "9223372036854775807\n", 4096) = 20
> >>>
> >>> But unable to come to any conclusion from these findings. Any insight
> is
> >> appreciated.
> >>>
> >>> Thanks
> >>> Thameem
> >>>
> >>>> On Jul 4, 2019, at 9:18 AM, Thameem Ansari <thameema@gmail.com
> <mailto:thameema@gmail.com> <mailto:
> >> thameema@gmail.com <mailto:thameema@gmail.com>>> wrote:
> >>>>
> >>>> <strace2.png>
> >>>>
> >>>>
> >>>> But unable to come to any conclusion from these findings. Any insight
> >> is appreciated.
> >>>>
> >>>> Thanks
> >>>> Thameem
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message