kafka-users mailing list archives

From Kane Kane <kane.ist...@gmail.com>
Subject Re: Architecture: amount of partitions
Date Fri, 08 Aug 2014 20:42:29 GMT
Hello Guozhang,

Is storing offsets in a Kafka topic already in the master branch?
We would like to use that feature. When do you plan to release 0.8.2?
Can we use the master branch in the meantime (i.e., is it stable enough)?

Thanks.

On Fri, Aug 8, 2014 at 1:38 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> Hi Roman,
>
> Kafka's current messaging guarantee is at-least-once, and we are working on
> transactional messaging features to make it exactly-once. After that, we
> expect it to be usable as a synchronization/replication layer for storage
> systems, as in your use case.
>
> As for your design, since you will probably have a lot of users and each
> user's data is small, you will end up with many small files on Kafka. If
> all you want is per-user ordering, you can probably just use keyed
> messages with the user id as the key. All messages with the same key then
> end up in the same partition and are hence consumed by the same consumer
> client. With that you only need a small, fixed number of partitions.
>
> Guozhang
>
>
> On Fri, Aug 8, 2014 at 12:35 PM, Roman Iakovlev <roman.iakovlev@live.com>
> wrote:
>
>> Dear all,
>>
>>
>>
>> I'm new to Kafka, and I'm considering using it for a maybe not very usual
>> purpose. I want it to be a backend for data synchronization between a
>> multitude of devices, which are not always online (mobile and embedded
>> devices). All the synchronized information belongs to some user and can be
>> identified by the user id. There are several data types, and a user can
>> have many entries of each data type coming from many different devices.
>>
>>
>>
>> This solution has to scale up to hundreds of thousands of users, and, as
>> far as I understand, Kafka stores every partition in a single file. I've
>> been thinking about creating a topic for every data type and a separate
>> partition for every user. The amount of data stored by every user is no
>> more than several megabytes over the whole lifetime, because the data
>> stored would be keyed messages, and I'm expecting it to be compacted.
>>
>>
>>
>> So what I'm wondering is: would Kafka be the right approach for such a
>> task, and if so, would this architecture (one topic per data type and one
>> partition per user) scale to the specified extent?
>>
>>
>>
>> Thanks,
>>
>> Roman.
>>
>>
>
>
> --
> -- Guozhang
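
The keyed-message suggestion quoted above can be illustrated with a small
sketch. It is not Kafka client code: it only models the idea that a stable
hash of the user id picks one of a small, fixed number of partitions, so all
messages for one user land in the same partition in arrival order. The names
NUM_PARTITIONS, partition_for and route are hypothetical, and zlib.crc32 is a
stand-in for whatever hash a real Kafka partitioner uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 8  # assumption: a small, fixed partition count


def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a user id to a partition using a stable hash.

    crc32 is only a stand-in; real Kafka clients use their own hash function.
    """
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


def route(messages, num_partitions: int = NUM_PARTITIONS):
    """Append each (user_id, payload) pair to its partition's log.

    Arrival order is preserved within each partition, so it is also
    preserved per user, because a user always maps to one partition.
    """
    logs = defaultdict(list)
    for user_id, payload in messages:
        logs[partition_for(user_id, num_partitions)].append((user_id, payload))
    return logs


if __name__ == "__main__":
    msgs = [("alice", 1), ("bob", 1), ("alice", 2), ("bob", 2), ("alice", 3)]
    for partition, log in sorted(route(msgs).items()):
        print(partition, log)
```

With this scheme the number of partitions stays constant no matter how many
users there are; many users simply share each partition, and ordering is
guaranteed only among messages that share a key.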
