kafka-users mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: Consumer State Description in design.html
Date Fri, 13 Apr 2012 17:39:19 GMT
Ed,

The design page only describes how the high level consumer (which most
people use) works. The high level consumer currently doesn't expose
offsets. Hadoop uses the low level consumer (SimpleConsumer), which is not
described. We can have a wiki describing it and put your content there.

Thanks,

Jun

On Fri, Apr 13, 2012 at 10:24 AM, Edward Smith <esmith@stardotstar.org> wrote:

> Sorry.... here it is with more clarity:
>
> Basically I'm adding to the beginning of the 2nd section titled "Consumer
> State"
>
> ----------------------------------------
> <h3>Consumer State</h3> (the second heading like this in the file)
> <p>
> In Kafka, the consumers are responsible for maintaining state
> information on what has been  consumed.  The core Kafka consumers
> write their state data to zookeeper.
> </p>
> <p>
> However, it may be beneficial for consumers to write state data into
> the same datastore where they are writing the results of their
> processing.  For example, the consumer may simply be entering some
> aggregate value into a centralized......
> ..
> (rest of section remains the same from here)
> ..
> </p>
> ------------------------------------------
>
>
> On Fri, Apr 13, 2012 at 1:16 PM, Jun Rao <junrao@gmail.com> wrote:
> > Ed,
> >
> > I don't see the change you want to make. The Apache mailing list doesn't
> > take attachments. If you have an attachment, the easiest way is probably
> > to attach it to a JIRA.
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Apr 13, 2012 at 10:04 AM, Edward Smith <esmith@stardotstar.org> wrote:
> >
> >> I didn't want to open up a bug unless there was some concurrence on
> >> this.   Please review the change below and see if I'm just
> >> misunderstanding things or not.  This paragraph in the doc took me a
> >> long time to digest because it was describing the contrib/hadoop
> >> consumer and not how SimpleConsumer or ConsoleConsumer work:
> >>
> >> Consumer State (the second heading like this in the file)
> >>
> >> In Kafka, the consumers are responsible for maintaining state
> >> information on what has been consumed.  The core Kafka consumers write
> >> their state data to zookeeper.
> >>
> >> However, it may be beneficial for consumers to write state data into
> >> the same datastore where they are writing the results of their
> >> processing.  For example, the consumer may simply be entering some
> >> aggregate value into a centralized...... (rest of section remains the
> >> same from here)
> >>
> >> Ed
> >>
> >> On Fri, Apr 13, 2012 at 12:02 PM, Jun Rao <junrao@gmail.com> wrote:
> >> > Currently, as you are iterating messages returned by SimpleConsumer, you
> >> > also get the offset for the next message. In the map, you can just run for
> >> > 30 mins and save the next offset for the next run.
> >> >
> >> > Thanks,
> >> >
> >> > Jun
> >> >
> >> > On Fri, Apr 13, 2012 at 1:01 AM, R S <mypostboxat@gmail.com> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> I looked at hadoop-consumer, which fetches data directly from the Kafka
> >> >> broker. But from what I understand, it is based on min and max offsets,
> >> >> and map tasks complete once they reach the maximum offset for a given
> >> >> topic.
> >> >>
> >> >> In our use case we would not know the max offset beforehand. Instead,
> >> >> we want the map to keep reading data from a min offset and roll over
> >> >> every 30 mins. At the 30th minute we would again generate the offsets,
> >> >> which would be used for the next run.
> >> >>
> >> >> Any suggestions would be helpful.
> >> >>
> >> >> regards,
> >> >> rks
> >> >>
> >>
>
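
Jun's suggestion in the quoted thread (run the map task for 30 minutes and
save the next offset for the following run) can be sketched roughly as below.
This is only an illustration: fetch is a hypothetical stand-in that returns
(payload, next_offset) pairs and is not the SimpleConsumer API, and
consume_for, handle, and clock are names made up for this sketch. Stopping on
an empty fetch is also a simplification; a real task would probably wait and
poll again until the time budget runs out.

```python
import time


def consume_for(fetch, start_offset, budget_seconds, handle, clock=time.monotonic):
    """Read from start_offset until the time budget is spent.

    fetch(offset) is a hypothetical callable returning an iterable of
    (payload, next_offset) pairs; it is NOT a real Kafka API.
    Returns the offset that the next run should start from.
    """
    deadline = clock() + budget_seconds
    offset = start_offset
    while clock() < deadline:
        batch = list(fetch(offset))
        if not batch:
            # Simplification: stop when nothing is available right now.
            break
        for payload, next_offset in batch:
            handle(payload)
            # Each message carries the offset of the message after it, so the
            # resume point is always known without knowing a max offset.
            offset = next_offset
            if clock() >= deadline:
                break
    return offset


# Usage with an in-memory list standing in for one topic partition.
log = [b"a", b"b", b"c"]


def fetch_from_log(offset):
    return [(log[i], i + 1) for i in range(offset, len(log))]


seen = []
next_offset = consume_for(fetch_from_log, 0, 30 * 60, seen.append)
```

The returned value would be written wherever the job keeps its state and
passed back in as start_offset on the next run.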

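Ed's proposed paragraph mentions consumers writing their state into the same
datastore that receives their results. A minimal sketch of that idea follows,
assuming SQLite as the datastore purely for illustration. The table names
(totals, consumer_state) and the functions open_store, apply_batch, and
saved_offset are hypothetical and do not come from Kafka or from the design
page. The point is that the aggregate update and the offset update share one
transaction, so they cannot get out of step.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the results table and the consumer-state table side by side."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS totals ("
        "name TEXT PRIMARY KEY, total INTEGER NOT NULL)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS consumer_state ("
        "topic TEXT, part INTEGER, next_offset INTEGER NOT NULL, "
        "PRIMARY KEY (topic, part))"
    )
    db.commit()
    return db


def apply_batch(db, topic, part, increments, next_offset):
    """Apply aggregate increments and record the new offset atomically."""
    with db:  # commits on success, rolls back if any statement raises
        for name, delta in increments.items():
            db.execute("INSERT OR IGNORE INTO totals VALUES (?, 0)", (name,))
            db.execute(
                "UPDATE totals SET total = total + ? WHERE name = ?",
                (delta, name),
            )
        db.execute(
            "INSERT OR REPLACE INTO consumer_state VALUES (?, ?, ?)",
            (topic, part, next_offset),
        )


def saved_offset(db, topic, part, default=0):
    """Return the offset to resume from, or default if none was stored."""
    row = db.execute(
        "SELECT next_offset FROM consumer_state WHERE topic = ? AND part = ?",
        (topic, part),
    ).fetchone()
    return row[0] if row else default


db = open_store()
apply_batch(db, "clicks", 0, {"a": 2, "b": 1}, next_offset=10)
apply_batch(db, "clicks", 0, {"a": 3}, next_offset=15)
```

If a batch fails partway through, neither the totals nor the stored offset
change, so a restarted consumer resumes from an offset that matches the
results already written.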