kafka-users mailing list archives

From "Sybrandy, Casey" <Casey.Sybra...@Six3Systems.com>
Subject RE: Partial Message Read by Consumer
Date Wed, 11 Dec 2013 16:43:25 GMT
First, I saw the partial message in raw network traffic captured with Wireshark, not in the
output of the iterator; the iterator never seems to provide me any data.  That's where the
code is hanging.

Second, here's the output from the ConsumerOffsetChecker:

grp1,tdf_topic,0-0 (Group,Topic,BrokerId-PartitionId)
            Owner = null
  Consumer offset = 47947
                  = 47,947 (0.00G)
         Log size = 1743252
                  = 1,743,252 (0.00G)
     Consumer lag = 1695305
                  = 1,695,305 (0.00G)

BROKER INFO
0 -> 127.0.1.1:9092
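
As a sanity check on the numbers above, the reported lag can be recomputed from the other two values. This is only a sketch: it assumes the offsets are byte offsets (as described for 0.7.x elsewhere in this thread) and that the fetch.size=52428800 setting from the first message in the thread is still in effect for this consumer. The variable names are illustrative.

```python
# Values copied from the ConsumerOffsetChecker output above.
consumer_offset = 47947
log_size = 1743252

# Assumption: fetch.size from the consumer configuration quoted
# at the bottom of this thread (50 * 1024 * 1024 bytes).
fetch_size = 52428800

# Lag is simply how far the committed offset trails the end of the log.
lag = log_size - consumer_offset
print("lag:", lag)

# If offsets are bytes, the whole log fits inside a single fetch.
print("log fits in one fetch:", log_size < fetch_size)
```

If those assumptions hold, the unread remainder of the log is far smaller than one fetch, which is consistent with the statement that no message exceeds the configured fetch size.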

To answer the questions related to this in the FAQ:

* Yes, there are more messages.
* No, the messages are all smaller than my configured fetch size.
* As far as I know, the consumer thread did not stop.  There are no errors or exceptions to
indicate anything of the sort.

One thing I did notice is that it looks like it's reading from the topic before the consumer
thread actually starts.  I'm using the pattern where I start a new thread per stream and submit
them to an ExecutorService.  Not sure if this makes a difference, but this is our standard
consumer pattern and has worked well until I started seeing this issue.  For this consumer,
I'm only working with one stream.  I tried 2, but no change.

Casey
________________________________________
From: Guozhang Wang [wangguoz@gmail.com]
Sent: Wednesday, December 11, 2013 11:31 AM
To: users@kafka.apache.org
Subject: Re: Partial Message Read by Consumer

Casey,

Just to confirm, you saw a partial message output from the iterator.next()
call, not from the consumer's fetch response, correct?

Guozhang


On Wed, Dec 11, 2013 at 8:14 AM, Jun Rao <junrao@gmail.com> wrote:

> Have you looked at
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped%2Cwhy%3F
> ?
> If that doesn't help, could you file a JIRA and attach your log? The Apache
> mailing list doesn't support attachments.
>
> Thanks,
>
> Jun
>
>
> On Wed, Dec 11, 2013 at 6:15 AM, Sybrandy, Casey <
> Casey.Sybrandy@six3systems.com> wrote:
>
> > Hello,
> >
> > No, the entire log file isn't bigger than that buffer size and this is
> > occurring while trying to retrieve the first message on the topic, not
> > the last.
> >
> > I attached a log.  Line 408 (******** Iterating.) is where we get an
> > iterator and start iterating over the data.  There should be subsequent
> > log entries displaying a filename, but they never appear after that point.
> >
> > Some other thoughts:
> >
> > * Network latency is a non-issue as everything is installed on a local
> > VM.
> > * I tried with both 10 and 100 messages in case I didn't have enough to
> > make it start producing.  No change.  Yes, I do realize this is silly,
> > but when nothing else is working, why not give it a try.  It's like adding
> > magical print statements.
> >
> > Hope this helps.  I need it.
> >
> > Casey
> >
> > ________________________________________
> > From: Tom Brown [tombrown52@gmail.com]
> > Sent: Tuesday, December 10, 2013 7:10 PM
> > To: users@kafka.apache.org
> > Subject: Re: Partial Message Read by Consumer
> >
> > Transferring a partial message over the network is by design in Kafka
> > 0.7.x (I can't speak to 0.8.x, though it may still be).
> >
> > When the request is made, you tell the server the partition number, the
> > byte offset into that partition, and the size of response that you want.
> > The server finds that offset in the partition, and sends N bytes back
> > (where N is the maximum response size specified). The server does not
> > inspect the contents of the reply to ensure that message boundaries line
> > up with the response size. This is by design, and the simplicity allows
> > for high throughput, at the cost of higher client complexity. In practice
> > this means that the response often includes a partial message at the end,
> > which the client drops. It also means that if a single message is larger
> > than your maximum response size, you will not be able to process that
> > message or continue to the next one. Each time you request it, the server
> > will send only the partial message, and the Kafka client will send the
> > request again.
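
The behaviour described above can be illustrated with a small toy model. This is a sketch only, not Kafka's client code or its real wire format: it assumes each message is stored as a 4-byte big-endian length followed by the payload, and the names encode_log, fetch, and parse_complete_messages are hypothetical helpers made up for the illustration.

```python
import struct


def encode_log(payloads):
    """Toy partition log: each message is a 4-byte length plus its payload."""
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)


def fetch(log, offset, max_bytes):
    """Toy broker: return up to max_bytes from offset, ignoring boundaries."""
    return log[offset:offset + max_bytes]


def parse_complete_messages(chunk):
    """Toy client: return (messages, bytes_consumed), dropping a partial tail."""
    messages, pos = [], 0
    while pos + 4 <= len(chunk):
        (size,) = struct.unpack_from(">I", chunk, pos)
        end = pos + 4 + size
        if end > len(chunk):
            break  # trailing partial message: drop it and stop
        messages.append(chunk[pos + 4:end])
        pos = end
    return messages, pos


log = encode_log([b"aaaa", b"bbbbbbbb", b"cc"])

# A 15-byte fetch holds the first message whole and part of the second.
msgs, consumed = parse_complete_messages(fetch(log, 0, 15))
print(msgs, consumed)

# A 10-byte fetch at the second message can never hold it (it needs 12 bytes),
# so the client consumes nothing and would re-request the same offset forever.
stuck_msgs, stuck_consumed = parse_complete_messages(fetch(log, consumed, 10))
print(stuck_msgs, stuck_consumed)
```

In this model the client only advances its offset by the bytes of complete messages, so a message larger than the fetch size leaves the offset unchanged on every request.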
> >
> > If I understand the high-level consumer configuration, the fetch.size
> > parameter should be what you need to adjust. Its default is 300K, but I
> > see you have it set to roughly 50MB. Is there any chance your message is
> > larger than that?
> >
> > --Tom
> >
> >
> > On Tue, Dec 10, 2013 at 1:52 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> >
> > > Hello Casey,
> > >
> > > What do you mean by "part of a message is being read"? Could you upload
> > > the output and also the log of the consumer here?
> > >
> > > Guozhang
> > >
> > >
> > > On Tue, Dec 10, 2013 at 12:26 PM, Sybrandy, Casey <
> > > Casey.Sybrandy@six3systems.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > First, I'm using version 0.7.2.
> > > >
> > > > I'm trying to read some messages from a broker, but looking at
> > > > Wireshark, it appears that only part of a message is being read by the
> > > > consumer.  After that, no other data is read and I can verify that there
> > > > are 10 messages on the broker.  I have the consumer configured as follows:
> > > >
> > > > kafka.zk.connectinfo=127.0.0.1
> > > > kafka.zk.groupid=foo3
> > > > kafka.topic=...
> > > > fetch.size=52428800
> > > > socket.buffersize=524288
> > > >
> > > > I only set socket.buffersize today to see if it helps.  Any help would
> > > > be great because this is baffling, especially since this only started
> > > > happening yesterday.
> > > >
> > > > Casey Sybrandy MSWE
> > > > Six3Systems
> > > > Cyber and Enterprise Systems Group
> > > > www.six3systems.com
> > > > 301-206-6000 (Office)
> > > > 301-206-6020 (Fax)
> > > > 11820 West Market Place
> > > > Suites N-P
> > > > Fulton, MD. 20759
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>



--
-- Guozhang
