kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Consumer State Description in design.html
Date Fri, 13 Apr 2012 17:16:31 GMT
Ed,

I don't see the change you want to make. The Apache mailing list doesn't take
attachments. If you have an attachment, the easiest way is probably to attach
it to a jira.

Thanks,

Jun

On Fri, Apr 13, 2012 at 10:04 AM, Edward Smith <esmith@stardotstar.org> wrote:

> I didn't want to open a bug unless there was some concurrence on
> this.  Please review the change below and see whether I'm just
> misunderstanding things.  This paragraph in the doc took me a
> long time to digest because it describes the contrib/hadoop
> consumer rather than how SimpleConsumer or ConsoleConsumer work:
>
> Consumer State (the second heading like this in the file)
>
> In Kafka, the consumers are responsible for maintaining state
> information on what has been consumed.  The core Kafka consumers write
> their state data to ZooKeeper.
>
> However, it may be beneficial for consumers to write state data into
> the same datastore where they are writing the results of their
> processing.  For example, the consumer may simply be entering some
> aggregate value into a centralized...... (rest of section remains the
> same from here)
>
> Ed
>
> On Fri, Apr 13, 2012 at 12:02 PM, Jun Rao <junrao@gmail.com> wrote:
> > Currently, as you are iterating messages returned by SimpleConsumer, you
> > also get the offset for the next message. In the map, you can just run
> > for 30 mins and save the next offset for the next run.
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Apr 13, 2012 at 1:01 AM, R S <mypostboxat@gmail.com> wrote:
> >
> >> Hi ,
> >>
> >> I looked at hadoop-consumer, which fetches data directly from the Kafka
> >> broker. But from what I understand, it is based on min and max offsets,
> >> and the map tasks complete once they reach the maximum offset for a
> >> given topic.
> >>
> >> In our use case we would not know the max offset beforehand. Instead,
> >> we want the map tasks to keep reading data from a min offset and roll
> >> over every 30 mins. At the 30th minute we would again generate the
> >> offsets to be used for the next run.
> >>
> >> Any suggestions would be helpful.
> >>
> >> regards,
> >> rks
> >>
>
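For reference, a rough sketch of the approach Jun describes (iterate a
SimpleConsumer fetch, track the offset of the next message, and persist it for
the following run), written against the 0.7-era SimpleConsumer Java API. The
host, port, topic, partition, and the offset file used to carry state between
runs are placeholders, and a real job would back off when a fetch returns no
data:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

import kafka.api.FetchRequest;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.javaapi.message.ByteBufferMessageSet;
import kafka.message.Message;
import kafka.message.MessageAndOffset;

public class RollingOffsetConsumer {

    public static void main(String[] args) throws IOException {
        // Placeholder broker and topic settings.
        SimpleConsumer consumer = new SimpleConsumer("localhost", 9092, 10000, 1024 * 1024);
        File offsetFile = new File("next.offset");         // stand-in for wherever the job keeps state
        long offset = readSavedOffset(offsetFile);         // resume where the previous run stopped
        long stopAt = System.currentTimeMillis() + 30L * 60 * 1000;  // roll over after ~30 minutes

        while (System.currentTimeMillis() < stopAt) {
            ByteBufferMessageSet messages =
                consumer.fetch(new FetchRequest("mytopic", 0, offset, 1024 * 1024));
            for (MessageAndOffset mo : messages) {
                process(mo.message());                     // the map task's real work goes here
                offset = mo.offset();                      // offset of the *next* message, as Jun notes
            }
        }

        saveOffset(offsetFile, offset);                    // the next run starts from this offset
        consumer.close();
    }

    private static long readSavedOffset(File f) throws IOException {
        if (!f.exists()) return 0L;
        BufferedReader r = new BufferedReader(new FileReader(f));
        try { return Long.parseLong(r.readLine().trim()); } finally { r.close(); }
    }

    private static void saveOffset(File f, long offset) throws IOException {
        PrintWriter w = new PrintWriter(new FileWriter(f));
        try { w.println(offset); } finally { w.close(); }
    }

    private static void process(Message m) {
        // placeholder: decode m.payload() and do the per-message work
    }
}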

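Likewise, the idea in the proposed "Consumer State" paragraph above (keeping
the consumer's offset in the same datastore as the processed results, so the
two are committed together) might look roughly like the following. The JDBC
URL, table names, and columns are invented for illustration:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class OffsetWithResults {

    // Write the aggregate and the consumer's next offset in one transaction,
    // so the offset and the output can never get out of sync (invented schema).
    public static void commitBatch(String group, String topic, int partition,
                                   long nextOffset, long aggregateValue) throws SQLException {
        Connection conn = DriverManager.getConnection("jdbc:postgresql://localhost/warehouse");
        try {
            conn.setAutoCommit(false);

            PreparedStatement agg = conn.prepareStatement(
                "INSERT INTO aggregates(topic, value) VALUES (?, ?)");
            agg.setString(1, topic);
            agg.setLong(2, aggregateValue);
            agg.executeUpdate();

            PreparedStatement off = conn.prepareStatement(
                "UPDATE consumer_offsets SET next_offset = ? " +
                "WHERE consumer_group = ? AND topic = ? AND partition_id = ?");
            off.setLong(1, nextOffset);
            off.setString(2, group);
            off.setString(3, topic);
            off.setInt(4, partition);
            off.executeUpdate();

            conn.commit();   // both rows land together or not at all
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.close();
        }
    }
}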