kafka-users mailing list archives

From Jason Gustafson <ja...@confluent.io>
Subject Re: Best way to commit offset on demand
Date Tue, 05 Jan 2016 03:17:08 GMT
Ah, that makes sense if you have to wait to join the group. I think you
could technically commit offsets without joining if you were sure that the
group was dead (i.e. all consumers had either left the group cleanly or
their session timeout expired). But if there are still active members, then
yeah, you have to join the group. Clearly you have to be a little careful
in this case if an active consumer is still trying to read data (it won't
necessarily see the fresh offset commits and could even overwrite them),
but I assume you're handling this.

Creating a new instance each time you want to do this seems viable to me
(and likely how we'd end up implementing the command line utility anyway).
The overhead is just a couple of TCP connections. It's probably as good (or as
bad) as any other approach. The join latency seems unavoidable if you can't
be sure that the group is dead since we do not allow non-group members to
commit offsets by design. Any tool we write will be up against the same
restriction. We might be able to think of a way to bypass it, but that
sounds dangerous.
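
To make the "new instance each time" idea concrete, here is a minimal sketch of the create, commit, close cycle. Everything named here (OffsetCommitter, OnDemandCommit, InMemoryCommitter, the "topic-partition" string keys) is hypothetical and exists only for illustration; none of it is a Kafka API. A Kafka-backed OffsetCommitter would presumably wrap a KafkaConsumer created with the target group.id and call commitSync(offsets), and it would still have to join the group if there are active members. That wiring is left out so the example runs without a broker.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical abstraction: something that can commit offsets for one group. */
interface OffsetCommitter extends AutoCloseable {
    /** Commits the given offsets, keyed by an illustrative "topic-partition" string. */
    void commit(Map<String, Long> offsets);

    /** Narrowed so that callers do not have to handle a checked exception. */
    @Override
    void close();
}

final class OnDemandCommit {
    private OnDemandCommit() {}

    /**
     * Creates a fresh committer, commits once, and always closes it,
     * even if the commit throws. Nothing is created for an empty request.
     */
    static void commitOnce(Supplier<OffsetCommitter> factory, Map<String, Long> offsets) {
        if (offsets.isEmpty()) {
            return; // nothing to commit, so skip the connection overhead
        }
        try (OffsetCommitter committer = factory.get()) {
            committer.commit(offsets);
        }
    }
}

/** In-memory stand-in (an assumption for this sketch) used instead of a real consumer. */
final class InMemoryCommitter implements OffsetCommitter {
    final Map<String, Long> store; // shared map playing the role of the offsets topic
    boolean closed = false;

    InMemoryCommitter(Map<String, Long> store) {
        this.store = store;
    }

    @Override
    public void commit(Map<String, Long> offsets) {
        if (closed) {
            throw new IllegalStateException("committer already closed");
        }
        store.putAll(offsets);
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

Passing a factory rather than a long-lived instance keeps each commit independent, at the cost of paying the setup (and, with a real consumer, the join latency) on every call.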

Out of curiosity, what's the advantage in your use case to setting offsets
out-of-band? I would probably consider options for moving it into the
consumer process.

-Jason

On Mon, Jan 4, 2016 at 6:20 PM, tao xiao <xiaotao183@gmail.com> wrote:

> Jason,
>
> It normally takes a couple of seconds to join a group, but it sometimes
> takes longer if the consumer didn't shut down gracefully previously.
>
> My use case is to have a command-line tool that either resets the offsets
> for a list of partitions in a particular consumer group before the consumer
> is started, or waits until the offset reaches a given number before the
> consumer can be closed. I think
> https://issues.apache.org/jira/browse/KAFKA-3059 fits my use case, but for
> now I need a workaround until that feature is implemented.
>
> For the offset reset, one way I can think of is to create a consumer with
> the same group id as the group whose offsets I want to reset, commit the
> offsets for the particular partitions, and then close the consumer. Is this
> solution viable?
>
> On Tue, 5 Jan 2016 at 09:56 Jason Gustafson <jason@confluent.io> wrote:
>
> > Hey Tao,
> >
> > Interesting that you're seeing a lot of overhead constructing the new
> > consumer instance each time. Granted it does have to fetch topic metadata
> > and look up the coordinator, but I wouldn't have expected that to be a big
> > problem. How long is it typically taking?
> >
> > -Jason
> >
> > On Mon, Jan 4, 2016 at 3:26 AM, Marko Bonaći <marko.bonaci@sematext.com>
> > wrote:
> >
> > > How are you consuming those topics?
> > >
> > > IF: I assume you have a consumer, so why not commit from within that
> > > consumer, after you process the message (whatever "process" means to
> > > you).
> > >
> > > ELSE: couldn't you have a dedicated consumer for offset commit requests
> > > that you don't shut down between requests?
> > >
> > > FINALLY: tell us more about your use case.
> > >
> > > Marko Bonaći
> > > Monitoring | Alerting | Anomaly Detection | Centralized Log Management
> > > Solr & Elasticsearch Support
> > > Sematext <http://sematext.com/> | Contact
> > > <http://sematext.com/about/contact.html>
> > >
> > > On Mon, Jan 4, 2016 at 12:18 PM, tao xiao <xiaotao183@gmail.com> wrote:
> > >
> > > > Hi team,
> > > >
> > > > I have a scenario where I want to write new offsets for a list of
> > > > topics on demand. The list of topics is unknown until runtime and the
> > > > interval between commits is undetermined. What would be the best way
> > > > to do so?
> > > >
> > > > One way I can think of is to create a new consumer and call
> > > > commitSync(offsets) every time I want to commit. But it seems to take
> > > > too much time to bootstrap the consumer. Is there a lighter way to
> > > > achieve this?
> > > >
> > > >
> > >
> >
>
