kafka-users mailing list archives

From: Solon Gordon <so...@knewton.com>
Subject: Re: OutOfMemoryException when starting replacement node.
Date: Wed, 10 Dec 2014 16:41:05 GMT
Thanks for your help. We do have replica.fetch.max.bytes set to 10MB to
allow larger messages, so perhaps that's related. But should that really be
big enough to cause OOMs on an 8GB heap? Are there other broker settings we
can tune to avoid this issue?
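
Some back-of-the-envelope arithmetic with the numbers from this thread suggests it can be. This is only a sketch: the assumption that the replica fetcher can buffer a full replica.fetch.max.bytes chunk for every partition in one response is mine, not something confirmed above.

```python
# Numbers taken from this thread; the per-partition buffering behavior is
# an assumption about how the 0.8.x replica fetcher sizes its responses.
partitions = 1548
replica_fetch_max_bytes = 10 * 1024 * 1024  # 10MB, as configured

# Worst case while a far-behind replica catches up: every partition
# returns a full replica.fetch.max.bytes chunk in the same response.
worst_case_bytes = partitions * replica_fetch_max_bytes
print(worst_case_bytes / 2**30)  # ~15.1 GiB, well over an 8GB heap

# The OOME in the log asked for one 1867671283-byte buffer, i.e. roughly
# 178 partitions' worth of 10MB chunks in a single fetch response.
print(1867671283 / replica_fetch_max_bytes)  # ~178
```

So with 10MB fetches and 1548 partitions, an 8GB heap doesn't leave much headroom once a replacement broker starts replicating everything at once.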

On Wed, Dec 10, 2014 at 11:05 AM, Gwen Shapira <gshapira@cloudera.com>
wrote:

> There is a parameter called replica.fetch.max.bytes that controls the
> size of the message buffer a broker will attempt to fetch at once.
> It defaults to 1MB, and has to be at least message.max.bytes (so at
> least one message can be sent).
>
> If you try to support really large messages and increase these values,
> you may run into OOM issues.
>
> Gwen
>
> On Wed, Dec 10, 2014 at 7:48 AM, Solon Gordon <solon@knewton.com> wrote:
> > I just wanted to bump this issue to see if anyone has thoughts. Based on
> > the error message it seems like the broker is attempting to consume
> > nearly 2GB of data in a single fetch. Is this expected behavior?
> >
> > Please let us know if more details would be helpful or if it would be
> > better for us to file a JIRA issue. We're using Kafka 0.8.1.1.
> >
> > Thanks,
> > Solon
> >
> > On Thu, Dec 4, 2014 at 12:00 PM, Dmitriy Gromov <dmitriy@knewton.com>
> > wrote:
> >
> >> Hi,
> >>
> >> We were recently trying to replace a broker instance and were getting an
> >> OutOfMemoryException when the new node was coming up. The issue happened
> >> during the log replication phase. We were able to circumvent this issue
> >> by copying over all of the logs to the new node before starting it.
> >>
> >> Details:
> >>
> >> - The heap size on the old and new node was 8GB.
> >> - There was about 50GB of log data to transfer.
> >> - There were 1548 partitions across 11 topics.
> >> - We recently increased our num.replica.fetchers to solve the problem
> >>   described here: https://issues.apache.org/jira/browse/KAFKA-1196.
> >>   However, this process worked when we first changed that value.
> >>
> >> [2014-12-04 12:10:22,746] ERROR OOME with size 1867671283 (kafka.network.BoundedByteBufferReceive)
> >> java.lang.OutOfMemoryError: Java heap space
> >>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> >>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
> >>   at kafka.network.BoundedByteBufferReceive.byteBufferAllocate(BoundedByteBufferReceive.scala:80)
> >>   at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:63)
> >>   at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
> >>   at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
> >>   at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
> >>   at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:73)
> >>   at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:109)
> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
> >>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
> >>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
> >>   at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
> >>   at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
> >>   at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
> >>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
> >>
> >> Thank you
> >>
>
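
Gwen's two constraints above (replica.fetch.max.bytes must be at least message.max.bytes, and raising both inflates worst-case fetch memory) can be sketched as a quick sanity check. check_broker_config is a hypothetical helper for illustration, not part of Kafka:

```python
def check_broker_config(message_max_bytes, replica_fetch_max_bytes,
                        num_partitions, heap_bytes):
    """Hypothetical sanity check for the broker settings in this thread."""
    issues = []
    if replica_fetch_max_bytes < message_max_bytes:
        issues.append("replica.fetch.max.bytes < message.max.bytes: "
                      "the largest allowed message could never be replicated")
    # Worst case during catch-up replication: every partition returns a
    # full replica.fetch.max.bytes chunk in the same fetch response.
    worst_case = num_partitions * replica_fetch_max_bytes
    if worst_case > heap_bytes:
        issues.append("worst-case fetch memory %.1f GiB exceeds %.1f GiB heap"
                      % (worst_case / 2**30, heap_bytes / 2**30))
    return issues

# The configuration described in this thread:
for issue in check_broker_config(message_max_bytes=10 * 1024 * 1024,
                                 replica_fetch_max_bytes=10 * 1024 * 1024,
                                 num_partitions=1548,
                                 heap_bytes=8 * 2**30):
    print(issue)
```

For the settings in this thread, only the memory check fires, which matches the OOME reported during the replacement node's catch-up replication.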
