kafka-users mailing list archives

From Joel Koshy <jjko...@gmail.com>
Subject Re: Question - large number of topics
Date Fri, 19 Aug 2011 21:28:27 GMT
Hi Taylor,

The topic is not part of the wire protocol - the simple consumer is
not asynchronous. So the topic for each response is for whatever topic
the preceding fetch request was for.

Thanks,

Joel
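
A minimal sketch of the point above, for anyone updating a client such as node-kafka: since the response does not say which topic it belongs to, a synchronous client can remember the topic of each fetch it sends and pair responses with requests in order. The class and method names (PendingFetches, sent, received) are hypothetical and not part of any Kafka client, and the sketch assumes responses come back in the same order the requests went out.

```python
from collections import deque


class PendingFetches:
    """Remembers which topic each outstanding fetch request was for.

    Hypothetical helper: assumes one socket, and that responses arrive
    in the same order the requests were sent.
    """

    def __init__(self):
        self._topics = deque()

    def sent(self, topic):
        # Call right after writing a fetch request for `topic` to the socket.
        self._topics.append(topic)

    def received(self, payload):
        # Call with each response read from the socket; the oldest
        # outstanding request tells us which topic the payload is for.
        if not self._topics:
            raise RuntimeError("response received with no fetch outstanding")
        return self._topics.popleft(), payload
```

With this in place a client can fetch from several topics over one connection, as long as it calls sent() for every request and received() for every response.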

On Fri, Aug 19, 2011 at 12:34 PM, Taylor Gautier <tgautier@tagged.com> wrote:
> Great!  How is this done?  I'm working with the node kafka client here:
>
> http://proxworx.appspot.com/github.com/marcuswestin/node-kafka
>
> And I think it only supports one topic, so I need to update the code. When I
> was looking at the binary responses I wasn't sure how the response would be
> formatted so that I can distinguish messages from different topics.
>
> (this discussion may be more appropriate for kafka-dev)
>
> On Fri, Aug 19, 2011 at 12:29 PM, Jun Rao <junrao@gmail.com> wrote:
>
>> Taylor,
>>
>> For topics stored on the same broker, a Kafka consumer can consume multiple
>> topics over a single socket connection.
>>
>> Jun
>>
>> On Fri, Aug 19, 2011 at 11:14 AM, Taylor Gautier <tgautier@tagged.com> wrote:
>>
>> > Thanks for the responses.
>> >
>> > Coming back to this topic - on the wire protocol is it possible to
>> > register interest for more than one topic - or is it 1:1 tcp connection
>> > to topic?
>> >
>> > Inspecting the binary formats it looks like it has to be 1:1.
>> >
>> > Thanks.
>> >
>> > On Fri, Jul 22, 2011 at 4:37 PM, Jay Kreps <jay.kreps@gmail.com> wrote:
>> >
>> > > Hi Taylor,
>> > >
>> > > I think you are correct that the single-node scalability for the number
>> > > of topics is not that great due to having multiple files per topic. I
>> > > think the large directory problem can probably be mitigated by using a
>> > > more modern filesystem, but as you and Jun point out ZK may also be
>> > > strained.
>> > >
>> > > One thing that may not be obvious is that it is not required to keep all
>> > > topics on all machines; this will help scale the non-ZK aspects. To do
>> > > this you can either pre-create the topics or else add a custom
>> > > partitioner which maps particular topics only to a subset of machines.
>> > > In this way, if you had, say, 15 machines you could spread each topic
>> > > over 3 machines and get 5X the max number of topics.
>> > >
>> > > -Jay
>> > >
>> > > On Fri, Jul 22, 2011 at 2:06 PM, Jun Rao <junrao@gmail.com> wrote:
>> > >
>> > > > Hi, Taylor,
>> > > >
>> > > > That's a good question. As you pointed out, a large number of topics
>> > > > will put stress on the local file directory and ZK. Maybe you can do a
>> > > > bit of testing first to see what breaks with a large number of topics.
>> > > > After that, we can look into what needs to be fixed. Making the
>> > > > directory structure hierarchical is a possibility.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Jun
>> > > >
>> > > >
>> > > > On Fri, Jul 22, 2011 at 1:23 PM, Taylor Gautier <tgautier@tagged.com> wrote:
>> > > >
>> > > > > Hi.
>> > > > >
>> > > > > I am thinking of using kafka to send/receive messages for a large
>> > > > > number of topics - order of 100k - 1M.
>> > > > >
>> > > > > It seems that the directory structure used for topics will probably
>> > > > > not work for this usage.  Also, I'm not sure if the in-memory data
>> > > > > structures might suffer - and also it may be problematic for
>> > > > > zookeeper.
>> > > > >
>> > > > > One thought I have is to modify the directory structure to be a
>> > > > > tree of directories.  Not sure what if anything might need to be
>> > > > > done to in-memory structures or zookeeper info.
>> > > > >
>> > > > > Any thoughts?
>> > > > >
>> > > >
>> > >
>> >
>>
>
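
Jay's suggestion in the quoted thread, mapping each topic to a subset of machines, could be sketched roughly as follows. This is an illustration of the idea only, not Kafka's partitioner interface: the function name brokers_for_topic, the group_size parameter, and the choice of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


def brokers_for_topic(topic, brokers, group_size=3):
    """Return the fixed group of brokers that should hold `topic`.

    Hypothetical sketch: brokers are split into consecutive groups of
    `group_size`, and a stable hash of the topic name picks one group.
    """
    if not brokers or group_size <= 0 or len(brokers) % group_size != 0:
        raise ValueError("broker count must be a positive multiple of group_size")
    group_count = len(brokers) // group_size
    # zlib.crc32 is stable across runs, unlike Python's built-in hash().
    group = zlib.crc32(topic.encode("utf-8")) % group_count
    start = group * group_size
    return brokers[start:start + group_size]
```

With 15 brokers and group_size=3 there are 5 groups, so each broker sees roughly one fifth of the topics, which is where the 5X figure in the quoted message comes from.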

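The tree-of-directories idea raised at the start of the thread could look something like the sketch below, which spreads topic directories under hashed parent directories so that no single directory holds every topic. The function name topic_dir, the "topic-partition" leaf naming, the two-level depth, and the use of SHA-256 are assumptions for illustration; none of this describes how Kafka actually lays out its logs.

```python
import hashlib
from pathlib import PurePosixPath


def topic_dir(root, topic, partition, levels=2):
    """Build a nested log directory path for a topic partition.

    Hypothetical layout: each level is named after two hex characters of
    a hash of the topic name, e.g. <root>/<xx>/<yy>/<topic>-<partition>.
    """
    if not 0 <= levels <= 32:
        raise ValueError("levels must be between 0 and 32")
    digest = hashlib.sha256(topic.encode("utf-8")).hexdigest()
    # Two hex characters per level gives up to 256 entries per directory.
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return PurePosixPath(root, *parts, f"{topic}-{partition}")
```

With two levels there are 65,536 possible parent directories, so even 1M topics would average only a handful of topic directories per parent, assuming the hash spreads names evenly.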