beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daniel Halperin (JIRA)" <>
Subject [jira] [Commented] (BEAM-1775) fix issue of start_from_previous_offset in KafkaIO
Date Wed, 22 Mar 2017 00:25:41 GMT


Daniel Halperin commented on BEAM-1775:

Hi [~mingmxu],

* Re: the question from Jins George -- the right answer here is what we're already doing --
use the KafkaCheckpointMark. In Beam, the runner maintains the state and not the external
system. Beam runners are responsible for maintaining the checkpoint marks, and for redoing
all uncommitted (uncheckpointed) work. If a user disables checkpointing, then they are explicitly
opting into "redo all work" on restart.

* If checkpointing is enabled but the KafkaCheckpointMark is not being provided, then I'm
inclined to agree with [~amitsela] that there may simply be a bug in the FlinkRunner.

However, presumably if the Kafka source is initially configured to "read from latest offset",
when it restarts with no checkpoint this will automatically go find the latest offset. Is
that not working for you?

> fix issue of start_from_previous_offset in KafkaIO
> --------------------------------------------------
>                 Key: BEAM-1775
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Xu Mingmin
>            Assignee: Xu Mingmin
> Jins George via 
> 5:50 PM (15 hours ago)
> to user
> Hello,
> I am writing a Beam pipeline(streaming) with Flink runner to consume data from Kafka
and apply some transformations and persist to Hbase.
> If I restart the application ( due to failure/manual restart), consumer does not resume
from the offset where it was prior to restart. It always resume from the latest offset.
> If I enable Flink checkpionting with hdfs state back-end, system appears to be resuming
from the earliest offset
> Is there a recommended way to resume from the offset where it was stopped ?
> Thanks,
> Jins George

This message was sent by Atlassian JIRA

View raw message