kafka-users mailing list archives

From "Sybrandy, Casey" <Casey.Sybra...@Six3Systems.com>
Subject RE: Partial Message Read by Consumer
Date Wed, 11 Dec 2013 14:15:49 GMT

No, the entire log file isn't bigger than that buffer size and this is occurring while trying
to retrieve the first message on the topic, not the last.

I attached a log.  Line 408 (******** Iterating.) is where we get an iterator and start iterating
over the data.  There should be subsequent log entries displaying a filename, but they never
appear after that point.

Some other thoughts:

* Network latency is a non-issue as everything is installed on a local VM.
* I tried with both 10 and 100 messages in case I didn't have enough to make it start producing.
  No change.  Yes, I do realize this is silly, but when nothing else is working, why not give
  it a try?  It's like adding magical print statements.

Hope this helps.  I need it.


From: Tom Brown [tombrown52@gmail.com]
Sent: Tuesday, December 10, 2013 7:10 PM
To: users@kafka.apache.org
Subject: Re: Partial Message Read by Consumer

Transferring a partial message over the network is by design in Kafka
0.7.x (I can't speak to 0.8.x, though it may still be the case).

When the request is made, you tell the server the partition number, the
byte offset into that partition, and the size of response that you want.
The server finds that offset in the partition, and sends N bytes back
(where N is the maximum response size specified). The server does not
inspect the contents of the reply to ensure that message boundaries line up
with the response size. This is by design, and the simplicity allows for
high throughput, at the cost of higher client complexity. In practice this
means that the response often includes a partial message at the end,
which the client drops. It also means that if a single message is larger
than your maximum response size, you will not be able to process that
message or continue to the next one. Each time you request it, the server
will send only the partial message, and the Kafka client will send the
request again.
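
To make the behavior described above concrete, here is a minimal Python
sketch. It is not Kafka's client code and does not use the real 0.7.x wire
format: it assumes a simplified layout in which every message is a 4-byte
big-endian length followed by the payload, and the names fetch,
parse_fetch_response, and encode are hypothetical helpers invented for the
illustration.

```python
import struct


def encode(payloads):
    """Build a toy partition log: each message is a 4-byte length + payload."""
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)


def fetch(log, offset, max_bytes):
    """Toy server: return up to max_bytes starting at offset.

    Like the behavior described above, it does not look at message
    boundaries, so the slice may end in the middle of a message.
    """
    return log[offset:offset + max_bytes]


def parse_fetch_response(buf):
    """Toy client: return (complete_messages, bytes_consumed).

    A trailing message whose declared length runs past the end of the
    buffer is treated as partial and dropped.
    """
    messages = []
    pos = 0
    while pos + 4 <= len(buf):
        (size,) = struct.unpack_from(">I", buf, pos)
        end = pos + 4 + size
        if end > len(buf):
            break  # partial message at the end of the response
        messages.append(buf[pos + 4:end])
        pos = end
    return messages, pos


# Three messages occupying 14, 54, and 14 bytes in the toy log.
log = encode([b"a" * 10, b"b" * 50, b"c" * 10])

# With a 40-byte fetch size the first message arrives whole and the
# start of the second one is dropped.
first_msgs, first_consumed = parse_fetch_response(fetch(log, 0, 40))

# The next fetch starts at the second message, which needs 54 bytes.
# Nothing complete fits in 40 bytes, so the offset never advances.
stuck_msgs, stuck_consumed = parse_fetch_response(
    fetch(log, first_consumed, 40)
)

# Raising the fetch size to at least the message size unblocks it.
ok_msgs, ok_consumed = parse_fetch_response(fetch(log, first_consumed, 54))
```

In this sketch the second fetch returns no complete messages and consumes
zero bytes, so a client that simply retries from the same offset would loop
forever, which is the stuck state described above.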

If I understand the high-level consumer configuration, the fetch.size
parameter should be what you need to adjust. Its default is 300K, but I
see you have it set to roughly 50MB. Is there any chance your message is
larger than that?


On Tue, Dec 10, 2013 at 1:52 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

> Hello Casey,
> What do you mean by "part of a message is being read"? Could you upload the
> output and also the log of the consumer here?
> Guozhang
> On Tue, Dec 10, 2013 at 12:26 PM, Sybrandy, Casey <
> Casey.Sybrandy@six3systems.com> wrote:
> > Hello,
> >
> > First, I'm using version 0.7.2.
> >
> > I'm trying to read some messages from a broker, but looking at wireshark,
> > it appears that only part of a message is being read by the consumer.
> >  After that, no other data is read and I can verify that there are 10
> > messages on the broker.  I have the consumer configured as follows:
> >
> > kafka.zk.connectinfo=
> > kafka.zk.groupid=foo3
> > kafka.topic=...
> > fetch.size=52428800
> > socket.buffersize=524288
> >
> > I only set socket.buffersize today to see if it helps.  Any help would be
> > great because this is baffling, especially since this only started
> > happening yesterday.
> >
> > Casey Sybrandy MSWE
> > Six3Systems
> > Cyber and Enterprise Systems Group
> > www.six3systems.com
> > 301-206-6000 (Office)
> > 301-206-6020 (Fax)
> > 11820 West Market Place
> > Suites N-P
> > Fulton, MD. 20759
> >
> --
> -- Guozhang
