kafka-users mailing list archives

From Guozhang Wang <wangg...@gmail.com>
Subject Re: Partial Message Read by Consumer
Date Wed, 11 Dec 2013 17:09:28 GMT
Do you have compression turned on in the broker?

Guozhang


On Wed, Dec 11, 2013 at 8:43 AM, Sybrandy, Casey <
Casey.Sybrandy@six3systems.com> wrote:

> First, I saw the partial message looking at raw network traffic via
> Wireshark, not the output of the iterator as the iterator never seems to
> provide me any data.  That's where the code is hanging.
>
> Second, here's the output from the ConsumerOffsetChecker:
>
> grp1,tdf_topic,0-0 (Group,Topic,BrokerId-PartitionId)
>             Owner = null
>   Consumer offset = 47947
>                   = 47,947 (0.00G)
>          Log size = 1743252
>                   = 1,743,252 (0.00G)
>      Consumer lag = 1695305
>                   = 1,695,305 (0.00G)
>
> BROKER INFO
> 0 -> 127.0.1.1:9092
>
> To answer the questions related to this in the FAQ:
>
> * Yes, there are more messages.
> * No, the messages are all smaller than my configured fetch size.
> * As far as I know, the consumer thread did not stop.  There are no errors
> or exceptions to indicate anything of the sort.
>
> One thing I did notice is that it looks like it's reading from the topic
> before the consumer thread actually starts.  I'm using the pattern where I
> start a new thread per stream and submit them to an ExecutorService.  Not
> sure if this makes a difference, but this is our standard consumer pattern
> and has worked well until I started seeing this issue.  For this consumer,
> I'm only working with one stream.  I tried 2, but no change.
>
> Casey
> ________________________________________
> From: Guozhang Wang [wangguoz@gmail.com]
> Sent: Wednesday, December 11, 2013 11:31 AM
> To: users@kafka.apache.org
> Subject: Re: Partial Message Read by Consumer
>
> Casey,
>
> Just to confirm, you saw a partial message output from the iterator.next()
> call, not from the consumer's fetch response, correct?
>
> Guozhang
>
>
> On Wed, Dec 11, 2013 at 8:14 AM, Jun Rao <junrao@gmail.com> wrote:
>
> > Have you looked at
> > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped%2Cwhy%3F
> > ?
> > If that doesn't help, could you file a jira and attach your log? The
> > Apache mailing list doesn't support attachments.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, Dec 11, 2013 at 6:15 AM, Sybrandy, Casey <
> > Casey.Sybrandy@six3systems.com> wrote:
> >
> > > Hello,
> > >
> > > > No, the entire log file isn't bigger than that buffer size and this is
> > > > occurring while trying to retrieve the first message on the topic, not
> > > > the last.
> > > >
> > > > I attached a log.  Line 408 (******** Iterating.) is where we get an
> > > > iterator and start iterating over the data.  There should be subsequent
> > > > log entries displaying a filename, but they never appear after that point.
> > >
> > > Some other thoughts:
> > >
> > > > * Network latency is a non-issue as everything is installed on a local
> > > > VM.
> > > > * I tried with both 10 and 100 messages in case I didn't have enough to
> > > > make it start producing.  No change.  Yes, I do realize this is silly,
> > > > but when nothing else is working, why not give it a try.  It's like
> > > > adding magical print statements.
> > >
> > > Hope this helps.  I need it.
> > >
> > > Casey
> > >
> > > ________________________________________
> > > From: Tom Brown [tombrown52@gmail.com]
> > > Sent: Tuesday, December 10, 2013 7:10 PM
> > > To: users@kafka.apache.org
> > > Subject: Re: Partial Message Read by Consumer
> > >
> > > > Having a partial message transfer over the network is the design of
> > > > Kafka 0.7.x (I can't speak to 0.8.x, though it may still be).
> > >
> > > > When the request is made, you tell the server the partition number, the
> > > > byte offset into that partition, and the size of response that you want.
> > > > The server finds that offset in the partition, and sends N bytes back
> > > > (where N is the maximum response size specified). The server does not
> > > > inspect the contents of the reply to ensure that message boundaries line
> > > > up with the response size. This is by design, and the simplicity allows
> > > > for high throughput, at the cost of higher client complexity. In practice
> > > > this means that the response often includes a partial message at the end,
> > > > which the client drops. It also means that if a single message is larger
> > > > than your maximum response size, you will not be able to process that
> > > > message or continue to the next message. Each time you request it, the
> > > > server will send only the partial message, and the Kafka client will send
> > > > the request again.
> > >
> > > > If I understand the high-level consumer configuration, the fetch.size
> > > > parameter should be what you need to adjust. Its default is 300K, but I
> > > > see you have it set to roughly 50MB. Is there any chance your message is
> > > > larger than that?
> > >
> > > --Tom
> > >
> > >
> > > > On Tue, Dec 10, 2013 at 1:52 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > >
> > > > Hello Casey,
> > > >
> > > > > What do you mean by "part of a message is being read"? Could you
> > > > > upload the output and also the log of the consumer here?
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Tue, Dec 10, 2013 at 12:26 PM, Sybrandy, Casey <
> > > > Casey.Sybrandy@six3systems.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > First, I'm using version 0.7.2.
> > > > >
> > > > > > I'm trying to read some messages from a broker, but looking at
> > > > > > Wireshark, it appears that only part of a message is being read by
> > > > > > the consumer.  After that, no other data is read and I can verify
> > > > > > that there are 10 messages on the broker.  I have the consumer
> > > > > > configured as follows:
> > > > >
> > > > > kafka.zk.connectinfo=127.0.0.1
> > > > > kafka.zk.groupid=foo3
> > > > > kafka.topic=...
> > > > > fetch.size=52428800
> > > > > socket.buffersize=524288
> > > > >
> > > > > > I only set socket.buffersize today to see if it helps.  Any help
> > > > > > would be great because this is baffling, especially since this only
> > > > > > started happening yesterday.
> > > > >
> > > > > Casey Sybrandy MSWE
> > > > > Six3Systems
> > > > > Cyber and Enterprise Systems Group
> > > > > www.six3systems.com
> > > > > 301-206-6000 (Office)
> > > > > 301-206-6020 (Fax)
> > > > > 11820 West Market Place
> > > > > Suites N-P
> > > > > Fulton, MD. 20759
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>



-- 
-- Guozhang
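
The byte-range fetch behaviour that Tom Brown describes in the quoted text above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the framing (a 4-byte big-endian length followed by the payload) is a simplification and not the actual Kafka 0.7 wire format, and the function names build_log, fetch, parse_complete and consume are hypothetical, not part of any Kafka client. A real client would keep retrying the same fetch; the sketch returns a "stalled" flag instead so that it terminates.

```python
import struct


def build_log(payloads):
    """Concatenate messages as [4-byte big-endian length][payload].

    Simplified framing for illustration, not the real Kafka 0.7 format.
    """
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)


def fetch(log, offset, max_bytes):
    """Server side: return up to max_bytes starting at the byte offset.

    Message boundaries are deliberately not inspected.
    """
    return log[offset:offset + max_bytes]


def parse_complete(chunk):
    """Client side: return (complete messages, bytes consumed).

    A truncated message at the end of the chunk is dropped.
    """
    messages, pos = [], 0
    while pos + 4 <= len(chunk):
        (size,) = struct.unpack_from(">I", chunk, pos)
        if pos + 4 + size > len(chunk):
            break  # partial message at the end of the response
        messages.append(chunk[pos + 4:pos + 4 + size])
        pos += 4 + size
    return messages, pos


def consume(log, fetch_size):
    """Fetch until the log is exhausted or a fetch makes no progress.

    Returns (messages, final byte offset, stalled flag).
    """
    offset, out = 0, []
    while offset < len(log):
        messages, consumed = parse_complete(fetch(log, offset, fetch_size))
        if consumed == 0:
            # Not even one complete message fits in fetch_size bytes, so
            # repeating the request would return the same partial message.
            return out, offset, True
        out.extend(messages)
        offset += consumed
    return out, offset, False


if __name__ == "__main__":
    log = build_log([b"aaaa", b"b" * 12, b"cc"])
    print(consume(log, 20))
    print(consume(log, 10))
```

With a fetch size of 20 bytes, all three messages are consumed even though two of the responses end in a partial message that is dropped and fetched again. With a fetch size of 10 bytes, the consumer stalls at byte offset 8, because the second message occupies 16 bytes including its length prefix and never fits in a single response.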
