kafka-users mailing list archives

From dinesh kumar <dinesh...@gmail.com>
Subject Re: Consumer not getting data when there is a big lag in the topic
Date Wed, 28 Jan 2015 20:28:20 GMT
Hi Guozhang,
Sorry for the delayed response. We had some hardware issues, so I was not
able to get you the logs you asked for. We upgraded our cluster to the
0.8.2-beta version of Kafka, but that did not help solve the issue.

To make this clearer, I am attaching sample code that can be used to
reproduce the issue on the 0.8.1.1 version of Kafka (running locally is also
fine). I have also put the log files from my experiments in the attachment
under the logs/ directory.

*logs/error_entires_consumer.log* - error entries that show up in the
consumer logs
*logs/error_logs_kafka_request.log* - errors from kafka-request.log on the
server

To run the repro, please go through the attached README file.
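For quick reference, the configuration at the heart of the repro is sketched
below as a minimal, self-contained snippet (plain java.util.Properties with
no Kafka dependency; the values mirror what I posted earlier in this thread,
while the class name and ZooKeeper connect string are just placeholders, not
the attached code):

```java
import java.util.Properties;

public class ReproConfig {
    // Consumer settings as reported in this thread; the ZooKeeper
    // connect string is a placeholder, not our real cluster.
    static Properties consumerProps() {
        Properties p = new Properties();
        p.put("zookeeper.connect", "localhost:2181"); // placeholder
        p.put("group.id", "default");
        p.put("consumer.timeout.ms", "-1");
        p.put("auto.offset.reset", "smallest"); // the setting that triggers the problem
        p.put("auto.commit.enable", "false");
        p.put("zookeeper.session.timeout.ms", "100000");
        p.put("fetch.message.max.bytes", "188743680"); // 180 * 1024 * 1024
        return p;
    }

    public static void main(String[] args) {
        long fetchMax =
            Long.parseLong(consumerProps().getProperty("fetch.message.max.bytes"));
        // 188743680 bytes is exactly 180 MB -- the same size as our largest
        // messages, which may be why raising it well above 200 MB was suggested.
        System.out.println("fetch.message.max.bytes = "
            + (fetchMax / (1024 * 1024)) + " MB");
    }
}
```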

Let me know if you need any more info.

Thanks,
Dinesh


On 15 January 2015 at 06:00, Guozhang Wang <wangguoz@gmail.com> wrote:

> Could you check both the server logs and the consumer logs (with and
> without the config specified) and see if there are any error entries /
> exception logs?
>
> Guozhang
>
> On Wed, Jan 14, 2015 at 1:53 PM, dinesh kumar <dinesh12b@gmail.com> wrote:
>
> > We don't have any compression on Kafka
> >
> > On 14 January 2015 at 22:54, Guozhang Wang <wangguoz@gmail.com> wrote:
> >
> > > Did you have compression enabled on Kafka?
> > >
> > > On Tue, Jan 13, 2015 at 10:33 AM, dinesh kumar <dinesh12b@gmail.com>
> > > wrote:
> > >
> > > > We are using the 0.8.1.1 version of Kafka and *not 0.8.2* as
> > > > mentioned above.
> > > >
> > > > Thanks,
> > > > Dinesh
> > > >
> > > > On 13 January 2015 at 23:35, dinesh kumar <dinesh12b@gmail.com> wrote:
> > > >
> > > > > Hi Guozhang,
> > > > > Sorry for the misinformation. We have file sizes around 50-100 MB,
> > > > > so we set *fetch.message.max.bytes* conservatively to 188743680.
> > > > > Can you please explain to me the reason for this behavior?
> > > > >
> > > > > Thanks,
> > > > > Dinesh
> > > > >
> > > > > On 13 January 2015 at 21:42, Guozhang Wang <wangguoz@gmail.com> wrote:
> > > > >
> > > > >> Dinesh,
> > > > >>
> > > > >> Your fetch.message.max.bytes is 188743680 < 155MB, but you said
> > > > >> some messages can be as large as 180MB. Could you try to set it to
> > > > >> be larger than, say, 200MB and see if it helps?
> > > > >>
> > > > >> Guozhang
> > > > >>
> > > > >> On Tue, Jan 13, 2015 at 4:18 AM, dinesh kumar
> > > > >> <dinesh12b@gmail.com> wrote:
> > > > >>
> > > > >> > Hi,
> > > > >> > I have been facing some Java high-level consumer issues lately
> > > > >> > and would like to understand more about this.
> > > > >> >
> > > > >> > We have 9 bare-metal machines (48 cores, 250 GB RAM, terabytes
> > > > >> > of hard disk) running *Kafka 0.8.2* and 5 independent VMs
> > > > >> > (8 cores, 60 GB) running ZooKeeper.
> > > > >> >
> > > > >> > I have a topic that has metadata as the key and a file as the
> > > > >> > value. The file can be as large as *180 MB*. The topic has 90
> > > > >> > partitions. Sometimes there will be only one consumer consuming
> > > > >> > from the topic. When the consumer group for my topic has a *lag
> > > > >> > in the range of 200s* and I start a consumer (no other consumer
> > > > >> > running before), there is *no data* coming through to the
> > > > >> > consumer.
> > > > >> >
> > > > >> > Please find below my consumer parameters.
> > > > >> >
> > > > >> > "zookeeper.connect"                => <zookeepers>,
> > > > >> > "group.id"                         => "default",
> > > > >> > "consumer.timeout.ms"              => "-1",
> > > > >> > "auto.offset.reset"                => "smallest",
> > > > >> > "auto.commit.enable"               => "false",
> > > > >> > "zookeeper.session.timeout.ms"     => "100000",
> > > > >> > "zookeeper.connection.timeout.ms"  => "6000",
> > > > >> > "zookeeper.sync.time.ms"           => "2000",
> > > > >> > "rebalance.backoff.ms"             => "20000",
> > > > >> > "rebalance.max.retries"            => "50",
> > > > >> > "fetch.message.max.bytes"          => "188743680",
> > > > >> > "fetch.size"                       => "18874368"
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > This problem occurs only when the *auto.offset.reset* property
> > > > >> > is *smallest*. I am able to get data if the offset is largest.
> > > > >> > I tried using the *console consumer* for the same topic and
> > > > >> > consumer group with the *--from-beginning* option, and I can
> > > > >> > see the data getting printed. I looked into the ConsoleConsumer
> > > > >> > code and saw that there was no *fetch.message.max.bytes*
> > > > >> > property in the consumer options.
> > > > >> >
> > > > >> > So I removed *fetch.message.max.bytes* from my code, and the
> > > > >> > consumer started working but threw exceptions when the message
> > > > >> > was large.
> > > > >> >
> > > > >> > So *fetch.message.max.bytes* seemed to be the problem, but I
> > > > >> > cannot do without it, as my messages are big files. Can someone
> > > > >> > explain to me what the issue is here? I also adjusted the
> > > > >> > *fetch.size* parameter according to my max message size, but it
> > > > >> > did not help.
> > > > >> >
> > > > >> >
> > > > >> > To summarize, I would like to understand what is happening on
> > > > >> > the consumer end when handling large lags with a big
> > > > >> > *fetch.message.max.bytes*.
> > > > >> >
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Dinesh
> > > > >> >
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> -- Guozhang
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
