kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: KTable and Rebalance Operations
Date Tue, 02 Aug 2016 15:50:27 GMT
Hi David,

on startup of the second application instance, the KTable is effectively
partitioned into two distinct partial KTables, each holding the
key-valus pairs for their corresponding assigned partitions.

Thus, your "lookups" on each instance, can only access the key-value
pairs for the set of keys assigned to each instance. There is no
replication of the whole KTable to both instances happening.

We are aware, that a global view over the whole KTable (ie, all local
KTable partitions over all running applications instances) is a nice
feature. There is already a KIP in place and we hope to release this
feature, soon:

Have look here for QA KIP-67
https://cwiki.apache.org/confluence/display/KAFKA/KIP-67%3A+Queryable+state+for+Kafka+Streams


-Matthias


On 08/02/2016 05:47 PM, David Garcia wrote:
> Hello, I’ve googled around for this, but haven’t had any luck.  Based upon this:
http://docs.confluent.io/3.0.0/streams/architecture.html#state  KTables are local to instances.
 An instance will process one or more partitions from one or more topics.  How does Kstreams/Ktables
handle the following situation?
> 
> A single application instance is processing 4 partitions from a topic.  The application
is using a Ktable.  Each event triggers lookups in the KTable.  Now, a new application instance
is started.  This triggers a rebalancing of the partitions.  2 partitions originally processed
by the first instance migrate to the new instance.  What happens with the KTable?  Is the
entire table “migrated” also?  This would be nice because lookups (in the first instance)
triggered by particular events should be identical to lookups (in the second instance) triggered
by those same events.
> 
> -David
> 


Mime
View raw message