kafka-users mailing list archives

From allen chan <allen.michael.c...@gmail.com>
Subject Re: broker randomly shuts down
Date Fri, 01 Jul 2016 00:24:15 GMT
Hi Shikhar,
I do not see a stderr log file anywhere. Can you point me to where Kafka
would write such a file?
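
A minimal sketch of the two checks suggested in this thread, under stated
assumptions: HotSpot JVMs normally write a fatal-error file named
hs_err_pid<pid>.log into the process working directory when they crash
(unless -XX:ErrorFile points elsewhere), and the Linux OOM killer leaves
lines in the kernel log. The function names and match patterns below are
illustrative and are not part of Kafka.

```python
import re
from pathlib import Path

# Default HotSpot fatal-error log name: hs_err_pid<pid>.log
HS_ERR = re.compile(r"^hs_err_pid\d+\.log$")

# Phrases the Linux OOM killer typically writes to the kernel log.
# Assumption: exact wording varies by kernel version, so match loosely.
OOM = re.compile(r"out of memory|oom-killer|killed process", re.IGNORECASE)


def find_jvm_crash_logs(directory):
    """Return HotSpot fatal-error logs found directly in `directory`, sorted."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and HS_ERR.match(p.name)
    )


def find_oom_lines(kernel_log_text):
    """Return the lines of a kernel log dump that look like OOM-killer activity."""
    return [line for line in kernel_log_text.splitlines() if OOM.search(line)]
```

Pass find_jvm_crash_logs the directory the broker was started from, and
pass find_oom_lines the output of dmesg or the contents of
/var/log/messages. If the broker was started with -daemon, the stock start
scripts are believed to redirect stdout and stderr to a kafkaServer.out
file in Kafka's logs directory; verify that against your own
kafka-run-class.sh.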

On Thu, Jun 30, 2016 at 5:10 PM, Shikhar Bhushan <shikhar@confluent.io>
wrote:

> Perhaps it's a JVM crash? You might not see anything in the standard
> application-level logs; you'd need to look at the stderr output.
>
> On Thu, Jun 30, 2016 at 5:07 PM allen chan <allen.michael.chan@gmail.com>
> wrote:
>
> > Anyone else have ideas?
> >
> > This is still happening. I moved ZooKeeper off the server to its own
> > dedicated VMs.
> > Kafka starts with 4G of heap and was nowhere near consuming that much
> > when it crashed.
> > I bumped up the ZooKeeper timeout settings, but that has not solved it.
> >
> > I also disconnected all the producers and consumers. This points to
> > something between Kafka and ZooKeeper, right?
> >
> > Again, the logs are no help as to why Kafka decided to shut itself down:
> > https://gist.github.com/allenmchan/f9331e54bb4fd77cc5bc0b031a7a6206
> >
> >
> >
> >
> > On Thu, Jun 2, 2016 at 4:22 PM, Russ Lavoie <russlavoie@gmail.com>
> > wrote:
> >
> > > What about in dmesg?  I have run into this issue and it was the OOM
> > > killer.  I also ran into a heap issue using too much of the direct
> > > memory (JVM).  Reducing the fetcher threads helped with that problem.
> > > On Jun 2, 2016 12:19 PM, "allen chan" <allen.michael.chan@gmail.com>
> > > wrote:
> > >
> > > > Hi Tom,
> > > >
> > > > That is one of the first things that I checked. Active memory never
> > > > goes above 50% of overall available. File cache uses the rest of the
> > > > memory, but I do not think that triggers the OOM killer.
> > > > Either way, there are no entries in /var/log/messages (CentOS) to
> > > > show that OOM is happening.
> > > >
> > > > Thanks
> > > >
> > > > On Thu, Jun 2, 2016 at 5:36 AM, Tom Crayford <tcrayford@heroku.com>
> > > > wrote:
> > > >
> > > > > That looks like somebody is killing the process. I'd suspect either
> > > > > the Linux OOM killer or something else automatically killing the JVM
> > > > > for some reason.
> > > > >
> > > > > For the OOM killer, assuming you're on Ubuntu, it's pretty easy to
> > > > > find in /var/log/syslog (depending on your setup). I don't know about
> > > > > other operating systems.
> > > > >
> > > > > On Thu, Jun 2, 2016 at 5:54 AM, allen chan <allen.michael.chan@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > I have an issue where my brokers randomly shut themselves down.
> > > > > > I turned on debug in log4j.properties but still do not see a
> > > > > > reason why the shutdown is happening.
> > > > > >
> > > > > > Has anyone seen this behavior before?
> > > > > >
> > > > > > version 0.10.0
> > > > > > log4j.properties
> > > > > >     log4j.rootLogger=DEBUG, kafkaAppender
> > > > > > * I tried TRACE level but I do not see any additional log
> > > > > > messages
> > > > > >
> > > > > > snippet of log around shutdown
> > > > > > [2016-06-01 15:11:51,374] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:11:53,376] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:11:55,377] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:11:57,380] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:11:59,383] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:01,386] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:03,389] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:04,121] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> > > > > > [2016-06-01 15:12:04,121] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> > > > > > [2016-06-01 15:12:05,390] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:07,393] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:09,396] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:11,399] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > [2016-06-01 15:12:13,334] INFO [Kafka Server 2], shutting down (kafka.server.KafkaServer)
> > > > > > [2016-06-01 15:12:13,334] INFO [Kafka Server 2], shutting down (kafka.server.KafkaServer)
> > > > > > [2016-06-01 15:12:13,336] INFO [Kafka Server 2], Starting controlled shutdown (kafka.server.KafkaServer)
> > > > > > [2016-06-01 15:12:13,336] INFO [Kafka Server 2], Starting controlled shutdown (kafka.server.KafkaServer)
> > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name connections-closed: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name connections-created: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name bytes-sent-received: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name bytes-sent: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,339] DEBUG Added sensor with name bytes-received: (org.apache.kafka.common.metrics.Metrics)
> > > > > > [2016-06-01 15:12:13,339] DEBUG Added sensor with name select-time: (org.apache.kafka.common.metrics.Metrics)
> > > > > >
> > > > > > --
> > > > > > Allen Michael Chan
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Allen Michael Chan
> > > >
> > >
> >
> >
> >
> > --
> > Allen Michael Chan
> >
>



-- 
Allen Michael Chan
