spark-user mailing list archives

From Akshay Bhardwaj <>
Subject Re: Spark Streaming - Problem managing Kafka offsets: job starts from the beginning.
Date Wed, 27 Feb 2019 06:38:18 GMT
Hi Guillermo,

What was the interval between restarts of the Spark job? By default, a Kafka
broker deletes the committed offsets of a consumer group after 24 hours of
inactivity. In such a case, a newly started Spark Streaming job will read
offsets from the beginning for the same groupId.
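For reference, the retention period is a broker-side setting. A sketch of the relevant broker configuration (the value shown is illustrative, not a recommendation):

```properties
# server.properties (Kafka broker)
# How long committed consumer-group offsets are retained once the group
# becomes inactive. The default is 1440 minutes (24 hours) in Kafka 1.x,
# so a job restarted after a longer gap finds no stored offsets.
offsets.retention.minutes=10080
```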

Akshay Bhardwaj

On Thu, Feb 21, 2019 at 9:08 PM Gabor Somogyi <> wrote:

> From the info you've provided, there's not much to say.
> Maybe you could collect a sample app, logs, etc., open a JIRA, and we can
> take a deeper look at it...
> BR,
> G
> On Thu, Feb 21, 2019 at 4:14 PM Guillermo Ortiz <>
> wrote:
>> I'm working with Spark Streaming 2.0.2 and Kafka 1.0.0, using the Direct
>> Stream connector. I consume data from Kafka and save the offsets
>> automatically. I can see in the logs that Spark commits the last offsets
>> processed. Sometimes when I restart Spark it starts from the beginning,
>> even though I'm using the same groupId.
>> Why could this happen? It only happens rarely.
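One way to make the offset handling in the setup above explicit, rather than relying on automatic commits, is to commit offsets back to Kafka yourself after each batch via the spark-streaming-kafka-0-10 API. A minimal sketch, assuming an existing StreamingContext `ssc`; the broker address, topic name, and group id are illustrative placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._

// Illustrative consumer configuration; disable auto-commit so that
// offsets are only committed after a batch has been processed.
val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "broker:9092",
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "my-group",
  "auto.offset.reset"  -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("my-topic"), kafkaParams)
)

stream.foreachRDD { rdd =>
  // Capture the offset ranges for this batch before any transformation.
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // ... process rdd ...
  // Commit the consumed offsets back to Kafka asynchronously, only after
  // processing succeeded, so a restart resumes from here.
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}
```

Note that committing offsets does not change the broker's retention behavior: if the group stays inactive past the retention window, the committed offsets are still deleted and `auto.offset.reset` decides where the restarted job begins.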
