kafka-users mailing list archives

From Gerard Klijs <gerard.kl...@dizzit.com>
Subject Re: New consumer not fetching as quickly as possible
Date Wed, 02 Dec 2015 07:38:35 GMT
Another possible reason that comes to mind is that you have multiple
consumer threads, but not enough partitions/brokers to support them. When I
run my tool on multiple threads I get a lot of time-outs. When I use only
one consumer thread I get them only at the start and the end.
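
To illustrate the thread/partition point, here is a minimal sketch. It assumes a simple round-robin spread of partitions over consumer threads, which is not Kafka's actual assignor; the class and method names are made up for this example. It only shows that once there are more threads than partitions, some threads end up with nothing to fetch, so their polls can only time out.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSpread {
    // Simplified round-robin spread of partition ids over consumer threads.
    // Illustration only: this is NOT Kafka's real partition assignor.
    public static List<List<Integer>> spread(int partitions, int threads) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            assignment.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            assignment.get(p % threads).add(p);
        }
        return assignment;
    }

    // Number of threads that received no partition at all.
    public static long idleThreads(int partitions, int threads) {
        return spread(partitions, threads).stream()
                .filter(List::isEmpty)
                .count();
    }

    public static void main(String[] args) {
        // 2 partitions shared by 4 consumer threads
        System.out.println(spread(2, 4));
        System.out.println("idle threads: " + idleThreads(2, 4));
    }
}
```

With 2 partitions and 4 threads this prints `[[0], [1], [], []]` and `idle threads: 2`.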

On Wed, Dec 2, 2015 at 5:43 AM Jason Gustafson <jason@confluent.io> wrote:

> There is some initial overhead before data can be fetched. For example, the
> group has to be joined and topic metadata has to be fetched. Do you see
> unexpected empty fetches beyond the first 10 polls?
>
> Thanks,
> Jason
>
> On Tue, Dec 1, 2015 at 7:43 PM, tao xiao <xiaotao183@gmail.com> wrote:
>
> > Hi Jason,
> >
> > You are correct. I initially produced 10000 messages in Kafka before I
> > started up my consumer with auto.offset.reset=earliest. But like I said,
> > the majority of the first 10 polls returned 0 messages and the lag
> > remained above 0, which means I still have enough messages to consume.
> > BTW I commit offsets manually, so the lag should accurately reflect how
> > many messages are remaining.
> >
> > I will turn on debug logging and test again.
> >
> > On Wed, 2 Dec 2015 at 07:17 Jason Gustafson <jason@confluent.io> wrote:
> >
> > > Hey Tao, other than high latency between the brokers and the consumer,
> > > I'm not sure what would cause this. Can you turn on debug logging and run
> > > again? I'm looking for any connection problems or metadata/fetch request
> > > errors. And I have to ask a dumb question, how do you know that more
> > > messages are available? Are you monitoring the consumer's lag?
> > >
> > > -Jason
> > >
> > > On Tue, Dec 1, 2015 at 10:07 AM, Gerard Klijs <gerard.klijs@dizzit.com>
> > > wrote:
> > >
> > > > Thanks Tao, it worked.
> > > > I also played around with my test setting trying to replicate your
> > > > results, using default settings. But as long as the poll timeout is set
> > > > to 100ms or larger, the only time-outs I get are near the start and near
> > > > the end (when indeed there is nothing to consume). This is with a
> > > > producer putting out 1000 messages a second. Maybe the load of the
> > > > producer you're using is not constant? Maybe you could run a test with
> > > > the org.apache.kafka.tools.ProducerPerformance class to see if it makes
> > > > a difference?
> > > >
> > > > On Tue, Dec 1, 2015 at 11:35 AM tao xiao <xiaotao183@gmail.com> wrote:
> > > >
> > > > > Gerard,
> > > > >
> > > > > In your case I think you can set fetch.min.bytes=1 so that the server
> > > > > will answer the fetch request as soon as a single byte of data is
> > > > > available instead of accumulating enough messages.
> > > > >
> > > > > But in my case I have plenty of messages in the broker and I am sure
> > > > > the total size of the messages is much larger than the default setting,
> > > > > which is 1024 bytes, but still the consumer doesn't return messages
> > > > > for every poll.
> > > > >
> > > > >
> > > > > On Tue, 1 Dec 2015 at 18:29 Gerard Klijs <gerard.klijs@dizzit.com> wrote:
> > > > >
> > > > > > I was experimenting with the timeout setting, but as long as messages
> > > > > > are produced and the consumer(s) keep polling I saw little difference.
> > > > > > I did see, for example, that when producing only 1 message a second,
> > > > > > it still sometimes waits to get three messages. So I also would like
> > > > > > to know if there is a faster way.
> > > > > >
> > > > > > On Tue, Dec 1, 2015 at 10:35 AM tao xiao <xiaotao183@gmail.com> wrote:
> > > > > >
> > > > > > > Hi team,
> > > > > > >
> > > > > > > I am using the new consumer with broker version 0.9.0. I notice that
> > > > > > > poll(time) occasionally returns 0 messages even though I have enough
> > > > > > > messages in the broker. The rate of returning 0 messages is quite
> > > > > > > high: about 4 out of 5 polls return 0 messages. Increasing the poll
> > > > > > > timeout from 300ms to 1 second doesn't help. Are there any
> > > > > > > configurations that I can tune to fetch data as quickly as possible?
> > > > > > >
> > > > > > > Both consumer and broker configs are default.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
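
For anyone replaying this thread, the consumer settings mentioned above (fetch.min.bytes=1, auto.offset.reset=earliest, manual offset commits) can be collected as plain properties. This is only a sketch: the class name, broker address and group id are placeholders, and the properties would still need key/value deserializer settings before being handed to a KafkaConsumer.

```java
import java.util.Properties;

public class LowLatencyConsumerProps {
    // Builds the consumer settings discussed in the thread.
    // bootstrapServers and groupId are placeholders supplied by the caller.
    public static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Answer a fetch as soon as any data is available,
        // instead of waiting for a larger minimum to accumulate.
        props.setProperty("fetch.min.bytes", "1");
        // Start from the oldest offset when the group has no committed offset.
        props.setProperty("auto.offset.reset", "earliest");
        // Offsets are committed manually by the application.
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values, adjust to the actual cluster.
        Properties props = build("localhost:9092", "test-group");
        props.list(System.out);
    }
}
```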
