kafka-users mailing list archives

From Graham Sanderson <gra...@vast.com>
Subject Re: A few questions from a new user
Date Sat, 07 Jul 2012 23:37:27 GMT
Thanks Niek, good point. I can certainly do that, and limit the payload arbitrarily, by
size (I just need to include my "contextual" information again in the next message), or by
latency requirements (which I was planning to do for the message set too).

I guess I sort of thought that message sets might be a built-in concept that already did
what I wanted.

Since many of my payloads may be smaller than the message header anyway, this will probably
save me a bunch of space too.
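
To make the idea concrete, here is a rough sketch of the packing I have in mind, written in Python just for brevity. Everything in it is my own assumption rather than anything Kafka provides: the names pack_records and unpack_payload are hypothetical, and the 4-byte length-prefix framing is an arbitrary choice, not Kafka's wire format. Each returned payload would become the body of one Kafka message, and each one repeats the context so it can be consumed on its own.

```python
import struct
from typing import Iterable, List, Tuple

# Assumed framing: every chunk is preceded by a 4-byte big-endian length.
_LEN = struct.Struct(">I")


def _frame(chunk: bytes) -> bytes:
    """Prefix a chunk with its length so it can be split out again later."""
    return _LEN.pack(len(chunk)) + chunk


def pack_records(context: bytes, records: Iterable[bytes], max_payload: int) -> List[bytes]:
    """Group records into payloads of at most max_payload bytes.

    Every payload starts with the framed context, so a consumer never
    needs an earlier payload to interpret a later one.
    """
    head = _frame(context)
    payloads: List[bytes] = []
    current = bytearray(head)
    for record in records:
        framed = _frame(record)
        if len(head) + len(framed) > max_payload:
            raise ValueError("record does not fit in a single payload")
        if len(current) + len(framed) > max_payload:
            # Size limit reached: close this payload and start a new one,
            # repeating the context at the front.
            payloads.append(bytes(current))
            current = bytearray(head)
        current += framed
    if len(current) > len(head):
        payloads.append(bytes(current))
    return payloads


def unpack_payload(payload: bytes) -> Tuple[bytes, List[bytes]]:
    """Split one payload back into (context, records)."""
    chunks: List[bytes] = []
    pos = 0
    while pos < len(payload):
        (size,) = _LEN.unpack_from(payload, pos)
        pos += _LEN.size
        chunks.append(payload[pos:pos + size])
        pos += size
    return chunks[0], chunks[1:]
```

A latency bound could be handled the same way as the size bound, by closing the current payload when a timer expires rather than when the byte limit is hit.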

On Jul 7, 2012, at 3:30 PM, Niek Sanders <niek.sanders@gmail.com> wrote:

> Graham,
> If you have a collection of data that should always be sent and
> consumed together and in-order, why not send it using a single Kafka
> message?  Or is the payload really huge?
> - Niek
> On Sat, Jul 7, 2012 at 10:18 AM, graham sanderson <graham@vast.com> wrote:
>> 1) I would like to guarantee that a group of messages are always delivered
>> in their entirety together (because there is contextual information in
>> messages which precede other messages). I'm a little confused by the use of
>> the term "nested message sets" since I don't really see much in the code
>> (though I don't really know Scala) - perhaps this refers to the fact that
>> you can have a set of messages within a message set file on disk. Anyway, I
>> was curious (and I'm using the Java API now, but may move to the Scala later)
>> what I need to do to guarantee N messages are sent and delivered as a single
>> message set: is a single ProducerData with a List of messages always sent as
>> a single message set? Does compression need to be turned on? How does this
>> affect network limits etc. (i.e. does the entire message set have to fit)?
>> I'm also assuming that once I have my message set containing all my messages
>> it will be discarded in its entirety.
>> 2) Related to 1) from the consumer side, can I tell the boundaries of a
>> message set (perhaps not required for me), but nevertheless I do want to
>> make sure I receive the entire set in one go (again do I have to set network
>> limits accordingly). The docs say that the entire message set is always
>> delivered to the client when compressed, but I'm not sure if it can be
>> subdivided if not compressed. Note I'm happy to stick with compression if
>> required.
>> 3) So I'm using the ZookeeperConsumerConnector, since I don't want to manage
>> finding the brokers myself, however I was wondering if there are any plans
>> to decouple the consumer offset tracking from the former. One of my use
>> cases is that I'll have a lot of ad-hoc one-off consumers that simply read a
>> subset of data until they die - from looking at ConsoleConsumer, there is
>> currently a hack to simply delete the zookeeper info after the fact to get
>> around
>> Thanks,
>> Graham.
