kafka-users mailing list archives

From Guozhang Wang <wangg...@gmail.com>
Subject Re: Architecture: amount of partitions
Date Fri, 08 Aug 2014 20:38:49 GMT
Hi Roman,

Kafka's current messaging guarantee is at-least-once, and we are working on
transactional messaging features to make it exactly-once. Once that is done,
we expect Kafka to be usable as a synchronization/replication layer for
storage systems, as in your use case.

As for your design: since you will probably have a lot of users and each
user's data is small, one partition per user would leave you with many small
files on the Kafka brokers. If all you need is per-user ordering, you can
probably just use keyed messages with the user id as the key. All messages
with the same key then end up in the same partition, and hence are consumed
by the same consumer client. With that you only need a small, fixed number
of partitions.
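
To illustrate the keyed-message idea, here is a minimal sketch in Python. It
does not talk to Kafka at all: `partition_for`, `route` and `NUM_PARTITIONS`
are hypothetical names, and CRC32 is only a stand-in for whatever hash the
producer's partitioner really applies. It just shows that hashing the user id
modulo a fixed partition count keeps each user's messages together and in
order.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 8  # assumption: a small, fixed partition count


def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (CRC32 is a stand-in)."""
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


def route(messages, num_partitions: int = NUM_PARTITIONS):
    """Append each (user_id, payload) pair to its partition's log in arrival order."""
    logs = defaultdict(list)
    for user_id, payload in messages:
        logs[partition_for(user_id, num_partitions)].append((user_id, payload))
    return logs


if __name__ == "__main__":
    msgs = [("alice", "a1"), ("bob", "b1"), ("alice", "a2"), ("bob", "b2")]
    for partition, log in sorted(route(msgs).items()):
        print(partition, log)
```

However many users there are, the number of partition logs never exceeds
`NUM_PARTITIONS`, and each user's entries stay in the order they were sent.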

Guozhang


On Fri, Aug 8, 2014 at 12:35 PM, Roman Iakovlev <roman.iakovlev@live.com>
wrote:

> Dear all,
>
>
>
> I'm new to Kafka, and I'm considering using it for a perhaps not very usual
> purpose. I want it to be a backend for data synchronization between a
> multitude of devices that are not always online (mobile and embedded
> devices). All the synchronized information belongs to some user and can be
> identified by the user id. There are several data types, and a user can
> have many entries of each data type coming from many different devices.
>
>
>
> This solution has to scale up to hundreds of thousands of users, and, as
> far as I understand, Kafka stores every partition in a single file. I've
> been thinking about creating a topic for every data type and a separate
> partition for every user. The amount of data stored by every user is no
> more than several megabytes over the whole lifetime, because the data
> stored would be keyed messages, and I'm expecting it to be compacted.
>
>
>
> So what I'm wondering is: would Kafka be the right approach for such a
> task, and if so, would this architecture (one topic per data type and one
> partition per user) scale to the specified extent?
>
>
>
> Thanks,
>
> Roman.
>
>


