kafka-users mailing list archives

From: Gwen Shapira <gshap...@cloudera.com>
Subject: Re: OutOfMemoryException when starting replacement node.
Date: Wed, 10 Dec 2014 16:48:13 GMT
If you have replica.fetch.max.bytes set to 10MB, I would not expect a
2GB allocation in BoundedByteBufferReceive when doing a fetch.

Sorry, out of ideas on why this happens...
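
One back-of-the-envelope check on the numbers in this thread: in 0.8.x each
replica fetcher thread issues one fetch request covering all the partitions it
owns, requesting up to replica.fetch.max.bytes per partition, and the whole
response is then buffered in a single allocation (see the trace further down).
A rough sketch in Scala, using the figures quoted later in the thread; the
idea that roughly 178 partitions landed in one fetch is inferred from the
OOME size, not stated anywhere in the thread:

object FetchSizeEstimate extends App {
  // Figures quoted in the thread (assumed as given, not measured here).
  val replicaFetchMaxBytes = 10L * 1024 * 1024   // 10MB requested per partition
  val totalPartitions      = 1548L               // across 11 topics
  val oomeAllocation       = 1867671283L         // size from the OOME log line

  // Worst case: one fetcher thread asks for every partition at once and
  // every partition has at least 10MB of replication backlog.
  val worstCase = totalPartitions * replicaFetchMaxBytes
  println(f"worst-case single response: $worstCase%,d bytes")

  // Working backwards: how many partitions does the failed ~1.8GB
  // allocation correspond to at 10MB per partition?
  println(s"OOME size / fetch size = ${oomeAllocation / replicaFetchMaxBytes} partitions")
}

If that reading is right, each fetcher thread's response scales with the
number of partitions it owns times replica.fetch.max.bytes. That would explain
why raising num.replica.fetchers (fewer partitions per thread) helps, and why
copying the logs over first avoided the problem: with the logs already in
place, each partition only had a small tail left to fetch.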

On Wed, Dec 10, 2014 at 8:41 AM, Solon Gordon <solon@knewton.com> wrote:
> Thanks for your help. We do have replica.fetch.max.bytes set to 10MB to
> allow larger messages, so perhaps that's related. But should that really be
> big enough to cause OOMs on an 8GB heap? Are there other broker settings we
> can tune to avoid this issue?
>
> On Wed, Dec 10, 2014 at 11:05 AM, Gwen Shapira <gshapira@cloudera.com>
> wrote:
>
>> There is a parameter called replica.fetch.max.bytes that controls the
>> size of the message buffer a broker will attempt to consume in a single
>> fetch. It defaults to 1MB, and has to be at least message.max.bytes (so
>> at least one message can be fetched).
>>
>> If you try to support really large messages and increase these values,
>> you may run into OOM issues.
>>
>> Gwen
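
A minimal sketch of how these two settings relate in a broker's
server.properties; the 10MB value is the one Solon mentions above, used here
purely for illustration, not as a recommendation:

    # Largest message the broker will accept from a producer.
    message.max.bytes=10485760

    # Upper bound on the bytes fetched per partition during replication.
    # Must be >= message.max.bytes, or an oversized message could never
    # be replicated and the fetcher would stall on it.
    replica.fetch.max.bytes=10485760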
>>
>> On Wed, Dec 10, 2014 at 7:48 AM, Solon Gordon <solon@knewton.com> wrote:
>> > I just wanted to bump this issue to see if anyone has thoughts. Based on
>> > the error message it seems like the broker is attempting to consume
>> > nearly 2GB of data in a single fetch. Is this expected behavior?
>> >
>> > Please let us know if more details would be helpful or if it would be
>> > better for us to file a JIRA issue. We're using Kafka 0.8.1.1.
>> >
>> > Thanks,
>> > Solon
>> >
>> > On Thu, Dec 4, 2014 at 12:00 PM, Dmitriy Gromov <dmitriy@knewton.com>
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> We were recently trying to replace a broker instance and were getting an
>> >> OutOfMemoryException when the new node was coming up. The issue happened
>> >> during the log replication phase. We were able to circumvent this issue
>> >> by copying over all of the logs to the new node before starting it.
>> >>
>> >> Details:
>> >>
>> >> - The heap size on the old and new node was 8GB.
>> >> - There was about 50GB of log data to transfer.
>> >> - There were 1548 partitions across 11 topics.
>> >> - We recently increased our num.replica.fetchers to solve the problem
>> >>   described here: https://issues.apache.org/jira/browse/KAFKA-1196.
>> >>   However, this process worked when we first changed that value.
>> >>
>> >> [2014-12-04 12:10:22,746] ERROR OOME with size 1867671283 (kafka.network.BoundedByteBufferReceive)
>> >> java.lang.OutOfMemoryError: Java heap space
>> >>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>> >>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
>> >>   at kafka.network.BoundedByteBufferReceive.byteBufferAllocate(BoundedByteBufferReceive.scala:80)
>> >>   at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:63)
>> >>   at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
>> >>   at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
>> >>   at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
>> >>   at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:73)
>> >>   at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
>> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:109)
>> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
>> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:109)
>> >>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
>> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
>> >>   at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
>> >>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>> >>   at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
>> >>   at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
>> >>   at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
>> >>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
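
The top frames above show where the memory goes: BoundedByteBufferReceive
reads a 4-byte size header for the entire fetch response and then allocates a
single heap buffer of that size. A simplified sketch of that shape in Scala,
reconstructed from the class and line names in the trace rather than copied
from Kafka's source:

import java.nio.ByteBuffer
import java.nio.channels.ReadableByteChannel

// Sketch only: simplified from the behavior implied by the trace,
// not the actual kafka.network.BoundedByteBufferReceive code.
class BoundedResponseReceive(maxSize: Int = Int.MaxValue) {
  private val sizeBuffer = ByteBuffer.allocate(4)
  private var contentBuffer: ByteBuffer = null

  def readFrom(channel: ReadableByteChannel): Int = {
    var read = 0
    if (sizeBuffer.hasRemaining)                 // 1. read the 4-byte size header
      read += channel.read(sizeBuffer)
    if (contentBuffer == null && !sizeBuffer.hasRemaining) {
      sizeBuffer.rewind()
      val size = sizeBuffer.getInt()             // e.g. 1867671283 in the log above
      if (size <= 0 || size > maxSize)
        throw new IllegalArgumentException("invalid response size: " + size)
      contentBuffer = ByteBuffer.allocate(size)  // 2. one allocation for the whole
    }                                            //    response; this is what OOMs
    if (contentBuffer != null)                   // 3. then fill it from the socket
      read += channel.read(contentBuffer)
    read
  }
}

Nothing is streamed or chunked: the whole multi-partition response has to fit
in one contiguous byte array on the heap, so the allocation size tracks the
fetch request's total, not the largest single message.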
>> >>
>> >> Thank you
>> >>
>>
