storm-user mailing list archives

From "Mitchell Rathbun (BLOOMBERG/ 731 LEX)" <>
Subject Re: Seek in KafkaSpout
Date Fri, 29 Sep 2017 00:01:41 GMT
Basically we are using Zookeeper to coordinate between a producer and consumer. When the consumer
comes up, it needs a recap from the producer. The producer sends this recap to the consumer
through Kafka in chunks. Ideally we wanted the consumer to be able to jump back to the start
of the last recap in the queue if the producer is down and the last recap was recent. I think
we have come up with some other ways around this that don't rely on "seek" functionality,
but I was just wondering if anyone else had done something similar already. It seems that the
new implementation you mentioned would provide this functionality.
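
A minimal sketch of the "jump back to the start of the last recap" decision described above, written without any Kafka or Storm dependency. The `Record` type, the `is_recap_start` marker, the `max_age_ms` threshold, and the `recap_seek_offset` helper are all hypothetical names introduced for illustration; how a recap's first chunk is actually marked, and how producer liveness is read from Zookeeper, are assumptions. The returned offset is the value one would hand to KafkaConsumer's seek().

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Record:
    """Hypothetical view of one message in the recap partition."""
    offset: int
    timestamp_ms: int
    is_recap_start: bool  # assumed marker on the first chunk of a recap


def recap_seek_offset(
    records: Sequence[Record],
    now_ms: int,
    max_age_ms: int,
    producer_alive: bool,
) -> Optional[int]:
    """Return the offset to seek to, or None if no jump back is warranted.

    A jump back is only considered when the producer is down; otherwise the
    consumer can simply request a fresh recap.
    """
    if producer_alive:
        return None
    # Scan from the newest record backwards to find the last recap start.
    for rec in reversed(records):
        if rec.is_recap_start:
            # Only reuse the recap if it is recent enough.
            if now_ms - rec.timestamp_ms <= max_age_ms:
                return rec.offset
            return None
    return None


# Usage: two recaps in the log, the second starting at offset 2.
log = [
    Record(0, 1000, True),
    Record(1, 1100, False),
    Record(2, 5000, True),
    Record(3, 5100, False),
]
target = recap_seek_offset(log, now_ms=6000, max_age_ms=2000, producer_alive=False)
```

Here `target` is the offset of the most recent recap's first chunk, since that recap is within the age limit and the producer is marked as down.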

Subject: Re: Seek in KafkaSpout

I'm curious about your use case here. It seems odd to need to adjust the offset on the fly
while a topology is running, unless I've misunderstood you!

If you store your consumer state in Zookeeper, you CAN adjust it between topology deploys
by manually modifying the stored state. I've done this during maintenance or service issues
to roll back to a specific point in time. I'm unsure whether you can do this when consumer
state is stored within Kafka itself.

As a side note, I've been toying with a Kafka spout implementation that allows dynamically
consuming arbitrary ranges from topics. It is to be open sourced soon.


On Fri, Sep 29, 2017 at 8:06 AM, Mitchell Rathbun (BLOOMBERG/ 731 LEX) <> wrote:

Looking through the documentation, it seems that KafkaSpout does not expose any way to set
the offset the spout reads from after the initial poll. This functionality is supported in
KafkaConsumer through the seek() method. Am I correct that this isn't supported? Has anyone
found a way to mimic the behavior of seek() with KafkaSpout?
