beam-commits mailing list archives

From "Kenneth Knowles (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-1517) Garbage collect user state in Flink Runner
Date Tue, 28 Feb 2017 19:36:45 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888732#comment-15888732
] 

Kenneth Knowles commented on BEAM-1517:
---------------------------------------

Event-time timers never need to be dropped, since they hold the watermark. Late processing-time
timers, however, should simply be dropped, because they can fire long after the window has expired.
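
To make the rule above concrete, here is a minimal sketch of the delivery decision. It is not
the Flink Runner's code: the class, enum, and method names are hypothetical, and it assumes a
window counts as expired once the input watermark is strictly past its GC horizon (window end
plus allowed lateness).

```java
// Hypothetical sketch of the timer delivery rule; none of these names come from Beam or Flink.
final class TimerPolicySketch {
    enum TimeDomain { EVENT_TIME, PROCESSING_TIME }

    /**
     * Decides whether a firing timer should be delivered to user code.
     *
     * @param domain         the time domain the timer was set in
     * @param windowExpiry   assumed GC horizon of the timer's window, in millis
     * @param inputWatermark current input watermark, in millis
     */
    static boolean shouldDeliver(TimeDomain domain, long windowExpiry, long inputWatermark) {
        if (domain == TimeDomain.EVENT_TIME) {
            // Per the comment above: event-time timers hold the watermark,
            // so they are never dropped.
            return true;
        }
        // A processing-time timer can fire at any wall-clock time, so it is
        // dropped once the watermark has passed the window's GC horizon.
        return inputWatermark <= windowExpiry;
    }

    public static void main(String[] args) {
        System.out.println(shouldDeliver(TimeDomain.EVENT_TIME, 100L, 500L));
        System.out.println(shouldDeliver(TimeDomain.PROCESSING_TIME, 100L, 100L));
        System.out.println(shouldDeliver(TimeDomain.PROCESSING_TIME, 100L, 101L));
    }
}
```

Whether the comparison should be strict or inclusive at the horizon is an assumption of this
sketch, not something the comment specifies.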

Incidentally, it has been suggested that we accelerate the addition of {{@OnWindowExpiration}}
and make it mandatory whenever any state is used. That seems somewhat annoying, but on the other
hand, forgetting to set a final timer to flush state probably means data loss most of the time.

> Garbage collect user state in Flink Runner
> ------------------------------------------
>
>                 Key: BEAM-1517
>                 URL: https://issues.apache.org/jira/browse/BEAM-1517
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 0.6.0
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>            Priority: Blocker
>             Fix For: 0.6.0
>
>
> User-facing state/timers in Beam are bound to the key/window of the data. Right now,
> the Flink Runner does not clean up user state when the watermark passes the GC horizon for
> the state associated with a given window.
> Neither {{StateInternals}} nor the Flink state API supports discarding state for a whole
> namespace (which is the window in this case), so we might have to manually set a GC timer for
> each window/key combination, as is done in the {{ReduceFnRunner}}. For this we have to know
> all states a user can possibly use, which we can get from the {{DoFn}} signature.
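
The approach described in the issue can be sketched as a small stand-alone simulation. This is
an illustration under stated assumptions, not the Flink Runner or {{ReduceFnRunner}}
implementation: {{StateGcSketch}} and all of its members are hypothetical, the state store is a
plain map that (like the APIs mentioned above) cannot delete a whole namespace at once, and the
GC horizon is assumed to be window end plus allowed lateness.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical simulation: one GC timer per key/window, clearing each declared state id.
final class StateGcSketch {
    private final List<String> declaredStateIds; // stands in for the DoFn signature
    private final long allowedLateness;
    // Flat store keyed by "key|windowEnd/stateId"; no per-namespace delete is available.
    private final Map<String, Long> cells = new HashMap<>();
    // GC timers: GC horizon -> namespaces ("key|windowEnd") to clean at that time.
    private final TreeMap<Long, Set<String>> gcTimers = new TreeMap<>();

    StateGcSketch(List<String> declaredStateIds, long allowedLateness) {
        this.declaredStateIds = declaredStateIds;
        this.allowedLateness = allowedLateness;
    }

    void write(String key, long windowEnd, String stateId, long value) {
        if (!declaredStateIds.contains(stateId)) {
            throw new IllegalArgumentException("undeclared state id: " + stateId);
        }
        String namespace = key + "|" + windowEnd;
        cells.put(namespace + "/" + stateId, value);
        // Set (or re-set) the GC timer for this key/window combination.
        gcTimers.computeIfAbsent(windowEnd + allowedLateness, t -> new HashSet<>())
                .add(namespace);
    }

    Long read(String key, long windowEnd, String stateId) {
        return cells.get(key + "|" + windowEnd + "/" + stateId);
    }

    /** Fires GC timers strictly before the watermark; returns the number of cells cleared. */
    int advanceWatermark(long watermark) {
        Map<Long, Set<String>> due = gcTimers.headMap(watermark, false);
        int cleared = 0;
        for (Set<String> namespaces : due.values()) {
            for (String namespace : namespaces) {
                // Clear every state the user could have used in this namespace.
                for (String stateId : declaredStateIds) {
                    if (cells.remove(namespace + "/" + stateId) != null) {
                        cleared++;
                    }
                }
            }
        }
        due.clear(); // the view is backed by gcTimers, so fired timers are removed
        return cleared;
    }

    int liveCells() {
        return cells.size();
    }

    public static void main(String[] args) {
        StateGcSketch gc = new StateGcSketch(List.of("count", "sum"), 5L);
        gc.write("a", 100L, "count", 1L);
        gc.write("a", 100L, "sum", 10L);
        gc.write("b", 200L, "count", 2L);
        System.out.println("cleared: " + gc.advanceWatermark(106L));
        System.out.println("live cells: " + gc.liveCells());
    }
}
```

The inner loop over {{declaredStateIds}} is the reason the full list of states is needed: with
no way to drop a namespace wholesale, each state cell has to be cleared individually.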



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
