kafka-users mailing list archives

From tao xiao <xiaotao...@gmail.com>
Subject Re: Best way to commit offset on demand
Date Tue, 05 Jan 2016 03:49:42 GMT
Thanks for the detailed explanation. By 'technically commit offsets without
joining group', I assume you mean that I can call assign() instead of
subscribe() on the consumer, which bypasses the group join process.
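
To make the rule concrete, here is a toy model in plain Python. It is not
Kafka code and makes no claim about how the coordinator is implemented; it
only encodes the behaviour described in the quoted reply below: a consumer
that never joined the group can commit only when the group has no active
members. The names ToyGroup, CommitRejected, the topic "orders" and the
member id "consumer-1" are all made up for illustration.

```python
class CommitRejected(Exception):
    """Raised when the toy coordinator refuses an offset commit."""


class ToyGroup:
    """Toy stand-in for a coordinator's view of one consumer group."""

    def __init__(self):
        self.members = set()  # ids of consumers that joined the group
        self.offsets = {}     # (topic, partition) -> committed offset

    def join(self, member_id):
        self.members.add(member_id)

    def leave(self, member_id):
        self.members.discard(member_id)

    def commit(self, offsets, member_id=None):
        # member_id=None models a consumer that never joined the group.
        if member_id is None and self.members:
            raise CommitRejected("group has active members; join it first")
        if member_id is not None and member_id not in self.members:
            raise CommitRejected("unknown member")
        self.offsets.update(offsets)


group = ToyGroup()
group.commit({("orders", 0): 42})     # no members: standalone commit accepted
group.join("consumer-1")
try:
    group.commit({("orders", 0): 7})  # a live member exists: refused
    rejected = False
except CommitRejected:
    rejected = True
```

In this model the out-of-band tool is safe only while the group is empty,
which matches the caveat about active consumers overwriting fresh commits.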

The reason we put the offset reset outside of the consumer process is so
that we can keep the consumer code as generic as possible, since the offset
reset is not needed by all consumer logic.
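
The other half of the out-of-band tool, waiting until the committed offset
reaches a given number before the consumer is closed, can be sketched as a
polling loop. This is a minimal sketch in plain Python under stated
assumptions: fetch_committed is a hypothetical callable that returns the
current committed offsets as a dict keyed by (topic, partition). How it gets
them from Kafka is left out on purpose, and the timeout and poll interval
defaults are arbitrary.

```python
import time


def wait_for_offsets(fetch_committed, targets, timeout_s=30.0,
                     poll_interval_s=0.5, clock=time.monotonic,
                     sleep=time.sleep):
    """Poll until every partition's committed offset is >= its target.

    fetch_committed: hypothetical callable returning
        {(topic, partition): committed_offset}; partitions without a
        commit may be missing from the result.
    targets: {(topic, partition): offset_to_reach}
    Returns True once all targets are reached, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        committed = fetch_committed()
        # A missing partition is treated as "nothing committed yet".
        if all(committed.get(tp, -1) >= want for tp, want in targets.items()):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

The clock and sleep parameters exist only so the loop can be exercised
without real waiting; a caller would normally leave them at their defaults.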

On Tue, 5 Jan 2016 at 11:18 Jason Gustafson <jason@confluent.io> wrote:

> Ah, that makes sense if you have to wait to join the group. I think you
> could technically commit offsets without joining if you were sure that the
> group was dead (i.e. all consumers had either left the group cleanly or
> their session timeout expired). But if there are still active members, then
> yeah, you have to join the group. Clearly you have to be a little careful
> in this case if an active consumer is still trying to read data (it won't
> necessarily see the fresh offset commits and could even overwrite them),
> but I assume you're handling this.
>
> Creating a new instance each time you want to do this seems viable to me
> (and likely how we'd end up implementing the command line utility anyway).
> The overhead is just a couple TCP connections. It's probably as good (or as
> bad) as any other approach. The join latency seems unavoidable if you can't
> be sure that the group is dead since we do not allow non-group members to
> commit offsets by design. Any tool we write will be up against the same
> restriction. We might be able to think of a way to bypass it, but that
> sounds dangerous.
>
> Out of curiosity, what's the advantage in your use case to setting offsets
> out-of-band? I would probably consider options for moving it into the
> consumer process.
>
> -Jason
>
> On Mon, Jan 4, 2016 at 6:20 PM, tao xiao <xiaotao183@gmail.com> wrote:
>
> > Jason,
> >
> > It normally takes a couple of seconds, but sometimes it takes longer to
> > join a group if the consumer didn't shut down gracefully previously.
> >
> > My use case is to have a command/tool that I can call to reset the
> > offsets for a list of partitions and a particular consumer group before
> > the consumer is started, or to wait until the offset reaches a given
> > number before the consumer can be closed. I think
> > https://issues.apache.org/jira/browse/KAFKA-3059 fits my use case, but
> > for now I need a workaround until that feature is implemented.
> >
> > For offset reset, one way I can think of is to create a consumer with the
> > same group id that I want to reset the offsets for, commit the offsets for
> > the particular partitions, and then close the consumer. Is this solution
> > viable?
> >
> > On Tue, 5 Jan 2016 at 09:56 Jason Gustafson <jason@confluent.io> wrote:
> >
> > > Hey Tao,
> > >
> > > Interesting that you're seeing a lot of overhead constructing the new
> > > consumer instance each time. Granted, it does have to fetch topic
> > > metadata and look up the coordinator, but I wouldn't have expected that
> > > to be a big problem. How long is it typically taking?
> > >
> > > -Jason
> > >
> > > On Mon, Jan 4, 2016 at 3:26 AM, Marko Bonaći <marko.bonaci@sematext.com>
> > > wrote:
> > >
> > > > How are you consuming those topics?
> > > >
> > > > IF: I assume you have a consumer, so why not commit from within that
> > > > consumer, after you process the message (whatever "process" means to
> > > > you).
> > > >
> > > > ELSE: couldn't you have a dedicated consumer for offset commit
> > > > requests that you don't shut down between requests?
> > > >
> > > > FINALLY: tell us more about your use case.
> > > >
> > > > Marko Bonaći
> > > > Monitoring | Alerting | Anomaly Detection | Centralized Log Management
> > > > Solr & Elasticsearch Support
> > > > Sematext <http://sematext.com/> | Contact
> > > > <http://sematext.com/about/contact.html>
> > > >
> > > > On Mon, Jan 4, 2016 at 12:18 PM, tao xiao <xiaotao183@gmail.com> wrote:
> > > >
> > > > > Hi team,
> > > > >
> > > > > I have a scenario where I want to write new offsets for a list of
> > > > > topics on demand. The list of topics is unknown until runtime, and
> > > > > the interval between commits is undetermined. What would be the
> > > > > best way to do so?
> > > > >
> > > > > One way I can think of is to create a new consumer and call
> > > > > commitSync(offsets) every time I want to commit, but it seems to
> > > > > take too much time to bootstrap the consumer. Is there a lighter
> > > > > way to achieve this?
> > > > >
> > > >
> > >
> >
>
