beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aljoscha Krettek (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (BEAM-3494) Snapshot state of aggregated data of apache beam project is not maintained in flink's checkpointing
Date Fri, 26 Jan 2018 13:33:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341057#comment-16341057
] 

Aljoscha Krettek edited comment on BEAM-3494 at 1/26/18 1:32 PM:
-----------------------------------------------------------------

How are you enabling checkpointing? Also, could you please reformat your posting to put the
code inside code tags to make it more readable?


was (Author: aljoscha):
How are you enabling checkpointing? Also, could you please reformat your posting to put the
code between code tags, like this
{code:java}
{code}
your code ...
{code}{code}
to make it more readable?

> Snapshot state of aggregated data of apache beam project is not maintained in flink's
checkpointing 
> ----------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-3494
>                 URL: https://issues.apache.org/jira/browse/BEAM-3494
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: suganya
>            Priority: Major
>
> We have a beam project which consumes events from kafka,does a groupby in a time window(5
mins),after window elapses it pushes the events to downstream for merge.This project is deployed
using flink ,we have enabled checkpointing to recover from failed state.
> (windowsize: 5mins , checkpointingInterval: 5mins,state.backend: filesystem)
> Offsets from kafka get checkpointed every 5 mins(checkpointingInterval).Before finishing
the entire DAG(groupBy and merge) , events offsets are getting checkpointed.So incase of any
restart from task-manager ,new task gets started from last successful checkpoint ,but we could'nt
able to get the aggregated snapshot data(data from groupBy task) from the persisted checkpoint.
> Able to retrieve the last successful checkpointed offset from kafka ,but couldnt able
to get last aggregated data till checkpointing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message