kafka-users mailing list archives

From Guozhang Wang <wangg...@gmail.com>
Subject Re: New consumer not fetching as quickly as possible
Date Thu, 03 Dec 2015 19:11:45 GMT
Good to know. Thanks Tao.

On Wed, Dec 2, 2015 at 5:42 PM, tao xiao <xiaotao183@gmail.com> wrote:

> Increasing the poll timeout to Long.MAX_VALUE does help. I got
> messages in every poll, but the time between polls was long. That is
> how I discovered it was a network issue between consumer and broker. I
> believe it will have the same effect as long as I set the poll timeout high
> enough; it does not necessarily have to be Long.MAX_VALUE.
>
> On Thu, 3 Dec 2015 at 02:04 Guozhang Wang <wangguoz@gmail.com> wrote:
>
> > Thanks for the updates Tao.
> >
> > Just wanted to make sure that there are no other potential issues when
> > consumer and broker are remote, which is also quite common in practice:
> > if you increase the timeout value in poll(timeout) to even larger values
> > (say two times the average latency in your network) and also set the
> > request.timeout.ms config to be large enough as well, does that resolve
> > the issue even if your consumer is not co-located?
> >
> > Guozhang
> >
> > On Wed, Dec 2, 2015 at 12:46 AM, tao xiao <xiaotao183@gmail.com> wrote:
> >
> > > It turned out it was due to network latency between consumer and broker.
> > > Once I moved the consumer to the same box as the broker, messages were
> > > returned in every poll.
> > >
> > > Thanks for all the help.
> > >
> > > On Wed, 2 Dec 2015 at 15:38 Gerard Klijs <gerard.klijs@dizzit.com> wrote:
> > >
> > > > Another possible reason that comes to mind is that you have multiple
> > > > consumer threads, but not the partitions/brokers to support them. When
> > > > I'm running my tool on multiple threads I get a lot of time-outs. When
> > > > I only use one consumer thread I get them only at the start and the end.
> > > >
> > > > On Wed, Dec 2, 2015 at 5:43 AM Jason Gustafson <jason@confluent.io> wrote:
> > > >
> > > > > There is some initial overhead before data can be fetched. For example,
> > > > > the group has to be joined and topic metadata has to be fetched. Do you
> > > > > see unexpected empty fetches beyond the first 10 polls?
> > > > >
> > > > > Thanks,
> > > > > Jason
> > > > >
> > > > > On Tue, Dec 1, 2015 at 7:43 PM, tao xiao <xiaotao183@gmail.com> wrote:
> > > > >
> > > > > > Hi Jason,
> > > > > >
> > > > > > You are correct. I initially produced 10000 messages in Kafka before I
> > > > > > started up my consumer with auto.offset.reset=earliest. But like I said,
> > > > > > the majority of the first 10 polls returned 0 messages and the lag
> > > > > > remained above 0, which means I still have enough messages to consume.
> > > > > > BTW I commit offsets manually, so the lag should accurately reflect how
> > > > > > many messages remain.
> > > > > >
> > > > > > I will turn on debug logging and test again.
> > > > > >
> > > > > > On Wed, 2 Dec 2015 at 07:17 Jason Gustafson <jason@confluent.io> wrote:
> > > > > >
> > > > > > > Hey Tao, other than high latency between the brokers and the consumer,
> > > > > > > I'm not sure what would cause this. Can you turn on debug logging and
> > > > > > > run again? I'm looking for any connection problems or metadata/fetch
> > > > > > > request errors. And I have to ask a dumb question: how do you know that
> > > > > > > more messages are available? Are you monitoring the consumer's lag?
> > > > > > >
> > > > > > > -Jason
> > > > > > >
> > > > > > > On Tue, Dec 1, 2015 at 10:07 AM, Gerard Klijs <gerard.klijs@dizzit.com> wrote:
> > > > > > >
> > > > > > > > Thanks Tao, it worked.
> > > > > > > > I also played around with my test setting trying to replicate your
> > > > > > > > results, using default settings. But as long as the poll timeout is set
> > > > > > > > to 100ms or larger, the only time-outs I get are near the start and near
> > > > > > > > the end (when indeed there is nothing to consume). This is with a
> > > > > > > > producer putting out 1000 messages a second. Maybe the load of the
> > > > > > > > producer you're using is not constant? Maybe you could run a test with
> > > > > > > > the org.apache.kafka.tools.ProducerPerformance class to see if it makes
> > > > > > > > a difference?
> > > > > > > >
> > > > > > > > On Tue, Dec 1, 2015 at 11:35 AM tao xiao <xiaotao183@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Gerard,
> > > > > > > > >
> > > > > > > > > In your case I think you can set fetch.min.bytes=1 so that the server
> > > > > > > > > will answer the fetch request as soon as a single byte of data is
> > > > > > > > > available instead of accumulating enough messages.
> > > > > > > > >
> > > > > > > > > But in my case I have plenty of messages in the broker, and I am sure
> > > > > > > > > the total size of the messages is much larger than the default setting,
> > > > > > > > > which is 1024 bytes, but still the consumer doesn't return messages for
> > > > > > > > > every poll.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, 1 Dec 2015 at 18:29 Gerard Klijs <gerard.klijs@dizzit.com> wrote:
> > > > > > > > >
> > > > > > > > > > I was experimenting with the timeout setting, but as long as messages
> > > > > > > > > > are produced and the consumer(s) keep polling I saw little difference.
> > > > > > > > > > I did see, for example, that when producing only 1 message a second, it
> > > > > > > > > > still sometimes waits to get three messages. So I also would like to
> > > > > > > > > > know if there is a faster way.
> > > > > > > > > >
> > > > > > > > > > On Tue, Dec 1, 2015 at 10:35 AM tao xiao <xiaotao183@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi team,
> > > > > > > > > > >
> > > > > > > > > > > I am using the new consumer with broker version 0.9.0. I notice that
> > > > > > > > > > > poll(time) occasionally returns 0 messages even though I have enough
> > > > > > > > > > > messages in the broker. The rate of returning 0 messages is quite
> > > > > > > > > > > high: about 4 out of 5 polls return 0 messages. Increasing the poll
> > > > > > > > > > > timeout from 300ms to 1 second doesn't help. Are there any
> > > > > > > > > > > configurations that I can tune to fetch data as quickly as possible?
> > > > > > > > > > >
> > > > > > > > > > > Both consumer and broker configs are default.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang
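
For anyone trying to reproduce this with a remote consumer, here is a minimal sketch of the settings discussed in the thread. It only builds a java.util.Properties object; in a real client the properties would presumably be passed to a KafkaConsumer together with key.deserializer and value.deserializer. The class name, method names, broker address, group id, the 60000 ms request timeout and the 150 ms latency are illustrative assumptions, not values from the thread. Only the config keys, fetch.min.bytes=1, auto.offset.reset=earliest, manual offset commits and the "two times the average latency" rule of thumb come from the messages above.

```java
import java.util.Properties;

public class RemoteConsumerSettings {

    // Rule of thumb from the thread: poll timeout of roughly twice the
    // average network latency between consumer and broker.
    static long pollTimeoutMs(long averageLatencyMs) {
        return 2 * averageLatencyMs;
    }

    // Builds consumer properties; all concrete values are assumptions
    // chosen for illustration.
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Answer a fetch as soon as any data is available.
        props.setProperty("fetch.min.bytes", "1");
        // "Large enough" for a remote broker; 60000 is an arbitrary example.
        props.setProperty("request.timeout.ms", "60000");
        // Start from the beginning of the log when no offset is committed.
        props.setProperty("auto.offset.reset", "earliest");
        // Offsets are committed manually, as in the original test.
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical broker address and group id.
        Properties props = consumerProps("broker.example.com:9092", "latency-test");
        props.list(System.out);
        // Hypothetical measured average latency of 150 ms.
        System.out.println("poll timeout ms: " + pollTimeoutMs(150));
    }
}
```

The value returned by pollTimeoutMs is what would be handed to poll(timeout). Whether it is high enough depends on the actual latency between consumer and broker, so it is best treated as a starting point to tune from.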
