spark-user mailing list archives

From swetha kasireddy <swethakasire...@gmail.com>
Subject Re: How to force Spark Kafka Direct to start from the latest offset when the lag is huge in kafka 10?
Date Mon, 21 Aug 2017 23:57:23 GMT
Hi Cody,

I think Assign is used when we want to start from a specified offset.
What if we want to start from the latest offset instead, the way
"auto.offset.reset" -> "latest" would resolve it?


Thanks!
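
For reference, here is a minimal sketch of one way to combine the two ideas: look up the current end offset of each partition with a plain Kafka consumer, then pass those offsets to Assign so the stream starts at the tail of the log while keeping the same group id. The object name, helper name, topic, broker address and group id below are placeholders, not taken from this thread. Note that "auto.offset.reset" only applies when the group has no committed offset (or the committed offset is out of range), which is why it does not help when offsets are already stored for the group. As far as I understand the strategy, explicitly supplied offsets are used at startup instead of the committed ones, but offsets restored from a Spark checkpoint would still take precedence.

```scala
import scala.collection.JavaConverters._

import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

// Hypothetical example object; all names and settings here are placeholders.
object StartFromLatest {

  // Ask the brokers for the current end offset of every partition of `topic`.
  def latestOffsets(topic: String,
                    kafkaParams: Map[String, Object]): Map[TopicPartition, Long] = {
    val consumer = new KafkaConsumer[String, String](kafkaParams.asJava)
    try {
      val partitions = consumer.partitionsFor(topic).asScala
        .map(info => new TopicPartition(info.topic, info.partition))
        .toList
      consumer.assign(partitions.asJava)
      consumer.seekToEnd(partitions.asJava)
      // seekToEnd is lazy; position() resolves it to a concrete offset.
      partitions.map(tp => tp -> consumer.position(tp)).toMap
    } finally {
      consumer.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val topic = "my-topic" // placeholder
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092", // placeholder
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "my-existing-group", // same group id as before
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val conf = new SparkConf().setAppName("start-from-latest")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Resolve the tail of the log once, at job start.
    val offsets = latestOffsets(topic, kafkaParams)

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Assign[String, String](offsets.keys, kafkaParams, offsets)
    )

    stream.map(record => record.value).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Since Assign pins the job to the partitions found at startup, partitions added later would not be picked up. Whether to skip the backlog could be driven by a job argument, so that normal restarts still resume from the stored offsets.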

On Mon, Aug 21, 2017 at 9:06 AM, Cody Koeninger <cody@koeninger.org> wrote:

> Yes, you can start from specified offsets.  See ConsumerStrategy,
> specifically Assign
>
> http://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html#your-own-data-store
>
> On Tue, Aug 15, 2017 at 1:18 PM, SRK <swethakasireddy@gmail.com> wrote:
> > Hi,
> >
> > How to force Spark Kafka Direct to start from the latest offset when the
> > lag is huge in kafka 10? It seems to be processing from the latest offset
> > stored for a group id. One way to do this is to change the group id. But it
> > would mean that each time that we need to process the job from the latest
> > offset we have to provide a new group id.
> >
> > Is there a way to force the job to run from the latest offset in case we
> > need to and still use the same group id?
> >
> > Thanks!
