kafka-users mailing list archives

From Mark <static.void....@gmail.com>
Subject Re: General questions on functionality and usage
Date Fri, 02 Dec 2011 18:09:30 GMT
Would you mind explaining how you go about:

(1) partitioning and load balancing data across a cluster of machines


On 12/2/11 6:42 AM, Jay Kreps wrote:
> I think there are two things here: (1) partitioning and load balancing data
> across a cluster of machines, and (2) replicating each message on N
> machines. We do (1) but not (2). We are working on (2), as Jun says.
>
> -Jay
>
> On Thu, Dec 1, 2011 at 5:29 PM, Jun Rao<junrao@gmail.com>  wrote:
>
>> No, multiple servers in each cluster.
>>
>> Jun
>>
>> On Thu, Dec 1, 2011 at 4:48 PM, Mark<static.void.dev@gmail.com>  wrote:
>>
>>> So at LinkedIn you only use one Kafka server?
>>>
>>>
>>> On 12/1/11 9:12 AM, Jun Rao wrote:
>>>
>>>> Mark,
>>>>
>>>> See my inlined answers below.
>>>>
>>>> Thanks,
>>>>
>>>> Jun
>>>>
>>>> On Thu, Dec 1, 2011 at 8:28 AM, Mark<static.void.dev@gmail.com>  wrote:
>>>>   - Does Kafka support pattern matching?
>>>>>   There is no server-side filtering in Kafka right now.
>>>>
>>>>   - What are the limitations of one Kafka server in terms of number of
>>>>> topics and number of consumers?
>>>>>
>>>>>   There is no hard limit. However, at LinkedIn, we are dealing with
>>>> hundreds of topics and tens of consumers. Large # of topics/consumers could
>>>> be limited by ZK capacity and OS capacity (e.g., open file handles). Also,
>>>> if a consumer consumes a large number of topics, time to balance load will
>>>> be longer.
>>>>
>>>>
>>>>   - Can you load balance publishing/subscribing across multiple Kafka
>>>>> servers to increase redundancy?
>>>>>
>>>>>
>>>>>   It's possible, but it's not something that's built-in now. We do plan to
>>>> support intra-cluster replication. See the design in
>>>> https://issues.apache.org/jira/browse/KAFKA-50
>>>>
>>>>   - Other than lack of map/reduce support, how does Kafka differ from, say,
>>>>> Redis Pub/Sub? (http://redis.io/topics/pubsub)
>>>>>
>>>>>
>>>>>   Don't know about Redis Pub/Sub. However, Kafka differs from some other
>>>> pub/sub/messaging systems in that it focuses more on scalability,
>>>> efficiency, and throughput.
>>>>
>>>>
>>>>   - Would anyone mind sharing their Kafka setup in terms of both
>>>>> functionality/usage and architecture... basically more in depth than the
>>>>> usual "Kafka serves our real-time X"
>>>>> (https://cwiki.apache.org/confluence/display/KAFKA/Powered+By).
>>>>> Having concrete use cases on the wiki could help gain adoption,
>>>>> especially among new users of the pub/sub paradigm, by showing what
>>>>> pub/sub real-time messaging can accomplish.
>>>>>
>>>>>
>>>>>   Yes, we will update the wiki later.
>>>>
>>>>   - Any good papers on what problems pub/sub in general can solve?
>>>>>
>>>>>   Some of the design and usage of Kafka can be found in this paper:
>>>> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
>>>>
>>>> Thanks
>>>>
>>>>>
>>>>>
>>>>>
