kafka-users mailing list archives

From Shikhar Bhushan <shik...@confluent.io>
Subject Re: broker randomly shuts down
Date Fri, 01 Jul 2016 18:42:13 GMT
This is somewhat specific to your runtime environment. Check whatever script
is being used to bring up Kafka, and see where the stderr of the java command
is being redirected (hopefully not /dev/null!).
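
As a rough illustration of that check on Linux, the sketch below reads the
/proc/<pid>/fd/2 symlink to see where a running process's stderr points. The
function names, the `proc_root` parameter, and the PID in the usage note are
hypothetical and not taken from any Kafka script. It assumes a Linux /proc
filesystem and enough permissions to read the broker process's entries.

```python
import os
from pathlib import Path


def stderr_target(pid, proc_root="/proc"):
    """Return where file descriptor 2 (stderr) of process `pid` points.

    Assumes a Linux-style /proc layout; `proc_root` is a hypothetical
    parameter that only exists to make the lookup easy to test.
    """
    return os.readlink(Path(proc_root) / str(pid) / "fd" / "2")


def stderr_is_discarded(pid, proc_root="/proc"):
    """True when stderr goes to /dev/null, so any JVM crash output is lost."""
    return stderr_target(pid, proc_root) == "/dev/null"
```

For example, stderr_target(12345), with the broker's real PID in place of the
made-up 12345 (something like `pgrep -f kafka.Kafka` may find it), should
return a file path, a pipe or socket descriptor, or /dev/null.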

On Thu, Jun 30, 2016 at 5:24 PM allen chan <allen.michael.chan@gmail.com>
wrote:

> Hi Shikhar,
> I do not see a stderr log file anywhere. Can you point me to where Kafka
> would write such a file?
>
> On Thu, Jun 30, 2016 at 5:10 PM, Shikhar Bhushan <shikhar@confluent.io>
> wrote:
>
> > Perhaps it's a JVM crash? You might not see anything in the standard
> > application-level logs; you'd need to look at the stderr.
> >
> > On Thu, Jun 30, 2016 at 5:07 PM allen chan <allen.michael.chan@gmail.com>
> > wrote:
> >
> > > Anyone else have ideas?
> > >
> > > This is still happening. I moved ZooKeeper off the server to its own
> > > dedicated VMs.
> > > Kafka starts with 4G of heap and was nowhere near consuming that much
> > > when it crashed.
> > > I bumped up the ZooKeeper timeout settings but that has not solved it.
> > >
> > > I also disconnected all the producers and consumers. This points to
> > > something between Kafka and ZooKeeper, right?
> > >
> > > Again logs are no help as to why kafka decided to shut itself down
> > > https://gist.github.com/allenmchan/f9331e54bb4fd77cc5bc0b031a7a6206
> > >
> > >
> > >
> > >
> > > On Thu, Jun 2, 2016 at 4:22 PM, Russ Lavoie <russlavoie@gmail.com>
> > > wrote:
> > >
> > > > What about in dmesg?  I have run into this issue and it was the OOM
> > > > killer.  I also ran into a heap issue using too much of the direct
> > > > memory (JVM).  Reducing the fetcher threads helped with that problem.
> > > > On Jun 2, 2016 12:19 PM, "allen chan" <allen.michael.chan@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Tom,
> > > > >
> > > > > That is one of the first things that I checked. Active memory never
> > > > > goes above 50% of overall available. File cache uses the rest of the
> > > > > memory but I do not think that triggers the OOM killer.
> > > > > Either way there are no entries in /var/log/messages (CentOS) to show
> > > > > that OOM is happening.
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Thu, Jun 2, 2016 at 5:36 AM, Tom Crayford <tcrayford@heroku.com>
> > > > > wrote:
> > > > >
> > > > > > That looks like somebody is killing the process. I'd suspect either
> > > > > > the Linux OOM killer or something else automatically killing the JVM
> > > > > > for some reason.
> > > > > >
> > > > > > For the OOM killer, assuming you're on Ubuntu, it's pretty easy to
> > > > > > find in /var/log/syslog (depending on your setup). I don't know about
> > > > > > other operating systems.
> > > > > >
> > > > > > On Thu, Jun 2, 2016 at 5:54 AM, allen chan
> > > > > > <allen.michael.chan@gmail.com> wrote:
> > > > > >
> > > > > > > I have an issue where my brokers randomly shut themselves down.
> > > > > > > I turned on debug in log4j.properties but still do not see a reason
> > > > > > > why the shutdown is happening.
> > > > > > >
> > > > > > > Anyone seen this behavior before?
> > > > > > >
> > > > > > > version 0.10.0
> > > > > > > log4j.properties
> > > > > > >     log4j.rootLogger=DEBUG, kafkaAppender
> > > > > > > * I tried TRACE level but I do not see any additional log messages
> > > > > > >
> > > > > > > snippet of log around shutdown
> > > > > > > [2016-06-01 15:11:51,374] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:11:53,376] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:11:55,377] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:11:57,380] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:11:59,383] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:12:01,386] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:12:03,389] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:12:04,121] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> > > > > > > [2016-06-01 15:12:04,121] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> > > > > > > [2016-06-01 15:12:05,390] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:12:07,393] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:12:09,396] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:12:11,399] DEBUG Got ping response for sessionid: 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > > > > [2016-06-01 15:12:13,334] INFO [Kafka Server 2], shutting down (kafka.server.KafkaServer)
> > > > > > > [2016-06-01 15:12:13,334] INFO [Kafka Server 2], shutting down (kafka.server.KafkaServer)
> > > > > > > [2016-06-01 15:12:13,336] INFO [Kafka Server 2], Starting controlled shutdown (kafka.server.KafkaServer)
> > > > > > > [2016-06-01 15:12:13,336] INFO [Kafka Server 2], Starting controlled shutdown (kafka.server.KafkaServer)
> > > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name connections-closed: (org.apache.kafka.common.metrics.Metrics)
> > > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name connections-created: (org.apache.kafka.common.metrics.Metrics)
> > > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name bytes-sent-received: (org.apache.kafka.common.metrics.Metrics)
> > > > > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name bytes-sent: (org.apache.kafka.common.metrics.Metrics)
> > > > > > > [2016-06-01 15:12:13,339] DEBUG Added sensor with name bytes-received: (org.apache.kafka.common.metrics.Metrics)
> > > > > > > [2016-06-01 15:12:13,339] DEBUG Added sensor with name select-time: (org.apache.kafka.common.metrics.Metrics)
> > > > > > >
> > > > > > > --
> > > > > > > Allen Michael Chan
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Allen Michael Chan
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Allen Michael Chan
> > >
> >
>
>
>
> --
> Allen Michael Chan
>
