kafka-users mailing list archives

From Guozhang Wang <wangg...@gmail.com>
Subject Re: Consumer not getting data when there is a big lag in the topic
Date Tue, 03 Feb 2015 02:31:33 GMT
Dinesh,

I took a look at your logs. First, it seems error_logs_kafka_request.log is
also from the consumer side, not the server side.

The error logs point to an EOF on the server side while reading the data.
One possibility is that your socket buffer size is not configured large
enough to hold the fetch response data: while the consumer is catching up,
it gets a huge fetch response each time.
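
To put rough numbers on "huge fetch response", here is a minimal sketch, not
a definitive diagnosis. In the 0.8 consumer, fetch.message.max.bytes is
applied per topic-partition in each fetch request, so the upper bound on one
response grows with the number of partitions fetched from a broker. The
class name FetchSizing and its helper methods are hypothetical, the even
spread of the 90 partition leaders over the 9 brokers is an assumption, and
the 2 MiB value for socket.receive.buffer.bytes is only a placeholder to
tune, not a recommendation.

```java
import java.util.Properties;

/** Back-of-the-envelope sizing for the 0.8 high-level consumer (illustrative only). */
final class FetchSizing {
    static final long MIB = 1024L * 1024L;

    /** Whole mebibytes contained in a byte count. */
    static long toMiB(long bytes) {
        return bytes / MIB;
    }

    /**
     * Upper bound on a single fetch response: the per-partition limit
     * multiplied by the number of partitions covered by one request.
     */
    static long worstCaseFetchResponseBytes(long fetchMessageMaxBytes, int partitionsPerRequest) {
        return Math.multiplyExact(fetchMessageMaxBytes, (long) partitionsPerRequest);
    }

    /** A subset of the consumer properties from this thread plus an explicit socket receive buffer. */
    static Properties consumerProps(String zookeepers, int socketReceiveBufferBytes) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zookeepers);
        props.put("group.id", "default");
        props.put("auto.offset.reset", "smallest");
        props.put("fetch.message.max.bytes", "188743680");
        props.put("socket.receive.buffer.bytes", Integer.toString(socketReceiveBufferBytes));
        return props;
    }

    public static void main(String[] args) {
        long fetchMax = 188743680L;
        int partitionsPerBroker = 90 / 9; // ASSUMPTION: leaders spread evenly over 9 brokers

        System.out.println("fetch.message.max.bytes in MiB: " + toMiB(fetchMax)); // 180
        System.out.println("worst-case response in MiB: "
                + toMiB(worstCaseFetchResponseBytes(fetchMax, partitionsPerBroker))); // 1800

        // Placeholder buffer size (2 MiB) and ZooKeeper address; adjust for your environment.
        Properties props = consumerProps("localhost:2181", 2 * 1024 * 1024);
        System.out.println(props.getProperty("socket.receive.buffer.bytes"));
    }
}
```

Under these assumptions a single consumer that is catching up could be asked
to accept close to 1.8 GB per fetch response, which is why the socket
settings and the consumer's heap are worth checking alongside
fetch.message.max.bytes.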

Guozhang

On Wed, Jan 28, 2015 at 12:28 PM, dinesh kumar <dinesh12b@gmail.com> wrote:

> Hi Guozhang,
> Sorry for the delayed response. We had some hardware issues and I was not
> able to give you the logs you asked for. We upgraded our cluster to Kafka
> 0.8.2-beta, but that did not solve the issue.
>
> To make this clearer, I am attaching sample code that can be used to
> reproduce the issue on Kafka 0.8.1.1 (running locally is also fine).
> Also I have put the log files from my experiments in the attachment under
> logs/ directory.
>
> *logs/error_entires_consumer.log *- error entries that show up in the
> consumer logs
> *logs/error_logs_kafka_request.log *- error from the kafka-request.log in
> the server.
>
> To run the repro, please go through the attached README file.
>
> Let me know if you need any more info.
>
> Thanks,
> Dinesh
>
>
> On 15 January 2015 at 06:00, Guozhang Wang <wangguoz@gmail.com> wrote:
>
>> Could you check both the server logs and the consumer logs (with and
>> without the config specified) and see if there are any error entries /
>> exception logs?
>>
>> Guozhang
>>
>> On Wed, Jan 14, 2015 at 1:53 PM, dinesh kumar <dinesh12b@gmail.com>
>> wrote:
>>
>> > We don't have any compression on Kafka
>> >
>> > On 14 January 2015 at 22:54, Guozhang Wang <wangguoz@gmail.com> wrote:
>> >
>> > > Did you have compression enabled on Kafka?
>> > >
>> > > On Tue, Jan 13, 2015 at 10:33 AM, dinesh kumar <dinesh12b@gmail.com>
>> > > wrote:
>> > >
>> > > > We are using Kafka 0.8.1.1 and *not 0.8.2* as mentioned above.
>> > > >
>> > > > Thanks,
>> > > > Dinesh
>> > > >
>> > > > On 13 January 2015 at 23:35, dinesh kumar <dinesh12b@gmail.com> wrote:
>> > > >
>> > > > > Hi Guozhang,
>> > > > > Sorry for the misinformation. We have file sizes around 50 - 100 MB,
>> > > > > so we set *fetch.message.max.bytes* conservatively to around
>> > > > > 188743680. Can you please explain the reason for this behavior?
>> > > > >
>> > > > > Thanks,
>> > > > > Dinesh
>> > > > >
>> > > > > On 13 January 2015 at 21:42, Guozhang Wang <wangguoz@gmail.com> wrote:
>> > > > >
>> > > > >> Dinesh,
>> > > > >>
>> > > > >> Your fetch.message.max.bytes is 188743680 < 155MB, but you said
>> > > > >> some messages can be as large as 180MB. Could you try setting it
>> > > > >> to be larger than, say, 200MB and see if it helps?
>> > > > >>
>> > > > >> Guozhang
>> > > > >>
>> > > > >> On Tue, Jan 13, 2015 at 4:18 AM, dinesh kumar <dinesh12b@gmail.com>
>> > > > >> wrote:
>> > > > >>
>> > > > >> > Hi,
>> > > > >> > I have been facing some issues with the Java high-level consumer
>> > > > >> > lately and would like to understand more about them.
>> > > > >> >
>> > > > >> > We have 9 bare-metal machines (48 cores, 250 GB, terabytes of
>> > > > >> > hard disk) running *Kafka 0.8.2* and 5 independent VMs (8 cores,
>> > > > >> > 60 GB) running ZooKeeper.
>> > > > >> >
>> > > > >> > I have a topic that has metadata as the key and a file as the
>> > > > >> > value. The file can be as large as *180 MB.* The topic has 90
>> > > > >> > partitions. Sometimes there will be only one consumer consuming
>> > > > >> > from the topic. When the consumer group for my topic has a *lag
>> > > > >> > in the range of 200's* and I start a consumer (with no other
>> > > > >> > consumer running before), there is *no data* coming through to
>> > > > >> > the consumer.
>> > > > >> >
>> > > > >> > Please find below my consumer parameters.
>> > > > >> >
>> > > > >> > "zookeeper.connect"               => <zookeepers>,
>> > > > >> > "group.id"                        => "default",
>> > > > >> > "consumer.timeout.ms"             => "-1",
>> > > > >> > "auto.offset.reset"               => "smallest",
>> > > > >> > "auto.commit.enable"              => "false",
>> > > > >> > "zookeeper.session.timeout.ms"    => "100000",
>> > > > >> > "zookeeper.connection.timeout.ms" => "6000",
>> > > > >> > "zookeeper.sync.time.ms"          => "2000",
>> > > > >> > "rebalance.backoff.ms"            => "20000",
>> > > > >> > "rebalance.max.retries"           => "50",
>> > > > >> > "fetch.message.max.bytes"         => "188743680",
>> > > > >> > "fetch.size"                      => "18874368"
>> > > > >> >
>> > > > >> >
>> > > > >> >
>> > > > >> > This problem occurs only when the *auto.offset.reset* property
>> > > > >> > is *smallest*. I am able to get data if the offset is largest. I
>> > > > >> > tried using the *console consumer* for the same topic and
>> > > > >> > consumer group with the *--from-beginning* option, and I can see
>> > > > >> > the data getting printed. I looked into the ConsoleConsumer code
>> > > > >> > and saw that there was no *fetch.message.max.bytes* property in
>> > > > >> > the consumer options.
>> > > > >> >
>> > > > >> > So I removed *fetch.message.max.bytes* from my code and the
>> > > > >> > consumer started working, but it threw an exception when a
>> > > > >> > message was large.
>> > > > >> >
>> > > > >> > So *fetch.message.max.bytes* seemed to be the problem, but I
>> > > > >> > cannot do without it as my messages are big files. Can someone
>> > > > >> > explain to me what the issue is here? I also adjusted the
>> > > > >> > *fetch.size* parameter according to my max message size, but it
>> > > > >> > did not help.
>> > > > >> >
>> > > > >> >
>> > > > >> > To summarize, I would like to understand what is happening on
>> > > > >> > the consumer end when handling large lags with a big
>> > > > >> > *fetch.message.max.bytes*.
>> > > > >> >
>> > > > >> >
>> > > > >> > Thanks,
>> > > > >> > Dinesh
>> > > > >> >
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> --
>> > > > >> -- Guozhang
>> > > > >>
>> > > > >
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > -- Guozhang
>> > >
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>
>


-- 
-- Guozhang
