spark-user mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: How to force Spark Kafka Direct to start from the latest offset when the lag is huge in kafka 10?
Date Mon, 21 Aug 2017 16:06:29 GMT
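Yes, you can start from specified offsets.  See ConsumerStrategies,
specifically Assign, which accepts a map of starting offsets keyed by
TopicPartition.  The integration guide covers this here: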

http://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html#your-own-data-store
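
For illustration, here is a minimal sketch of one way this could be wired up,
not a definitive implementation. It assumes the spark-streaming-kafka-0-10
artifact is on the classpath and that the job is launched with spark-submit.
The topic name, broker address, group id, and batch interval are hypothetical
placeholders. A short-lived plain KafkaConsumer looks up the current end
offset of each partition, and those offsets are then passed to Assign so the
stream skips the backlog while keeping the same group id.

```scala
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}
import scala.collection.JavaConverters._

object StartFromLatest {
  def main(args: Array[String]): Unit = {
    // Hypothetical values; replace with your own topic, brokers, and group id.
    val topic = "events"
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "existing-group-id",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    // Use a throwaway consumer on the driver to find the end offset of every
    // partition. Nothing is committed, so the group's stored offsets are untouched.
    val probe = new KafkaConsumer[String, String](kafkaParams.asJava)
    val latestOffsets: Map[TopicPartition, Long] =
      try {
        val partitions = probe.partitionsFor(topic).asScala
          .map(p => new TopicPartition(p.topic, p.partition))
          .toList
        probe.assign(partitions.asJava)
        probe.seekToEnd(partitions.asJava)
        partitions.map(tp => tp -> probe.position(tp)).toMap
      } finally {
        probe.close()
      }

    // Master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("start-from-latest")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Assign pins the partitions explicitly and starts each one at the
    // offset supplied in latestOffsets.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Assign[String, String](latestOffsets.keys, kafkaParams, latestOffsets)
    )

    stream.foreachRDD { rdd =>
      println(s"Batch with ${rdd.count()} records")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that the offsets passed to Assign apply on initial startup only. If the
job is restored from a checkpoint, the checkpointed offsets are used instead.
Assign also fixes the partition list, so partitions added to the topic later
are not picked up automatically.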

On Tue, Aug 15, 2017 at 1:18 PM, SRK <swethakasireddy@gmail.com> wrote:
> Hi,
>
> How can I force the Spark Kafka direct stream to start from the latest offset
> when the lag is huge with Kafka 0.10? It seems to resume from the last offset
> stored for the group id. One way around this is to change the group id, but
> that means supplying a new group id every time we need the job to start from
> the latest offset.
>
> Is there a way to force the job to start from the latest offset when we need
> to, while still using the same group id?
>
> Thanks!
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>
