flink-issues mailing list archives

From "Markus Holzemer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-909) Pitfall due to additional superstep after the iteration has stopped
Date Mon, 23 Jun 2014 10:31:25 GMT

    [ https://issues.apache.org/jira/browse/FLINK-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040602#comment-14040602 ]

Markus Holzemer commented on FLINK-909:

I have also run into this issue a few times. Since I am currently refactoring the
iterations runtime, I will have a look at it.
It should be possible to add a barrier at the start of each superstep and make the tasks wait
for an explicit OK message from the iteration head task (which manages a single iteration
instance on one TaskManager) before the next superstep can start.
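
To make the barrier idea concrete, here is a minimal sketch in plain Java. It is only an
illustration under my own assumptions: the class SuperstepGate and its methods decide and
awaitStart are hypothetical names, not part of the Flink runtime, and a real implementation
would pass the OK message between tasks rather than share one in-process object.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical per-iteration gate; illustrates the proposal, not Flink code. */
class SuperstepGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition decided = lock.newCondition();
    private int decidedSuperstep = 0;   // highest superstep the head has ruled on
    private boolean terminated = false; // set once the head ends the iteration

    /** Called by the iteration head: allow the given superstep, or end the iteration. */
    void decide(int superstep, boolean proceed) {
        lock.lock();
        try {
            decidedSuperstep = Math.max(decidedSuperstep, superstep);
            if (!proceed) {
                terminated = true;
            }
            decided.signalAll(); // wake every task waiting at the barrier
        } finally {
            lock.unlock();
        }
    }

    /**
     * Called by every other task before it starts the given superstep.
     * Returns true if the head sent OK, false if the iteration has ended
     * (or the waiting thread was interrupted).
     */
    boolean awaitStart(int superstep) {
        lock.lock();
        try {
            while (decidedSuperstep < superstep && !terminated) {
                decided.await();
            }
            return !terminated;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

With such a gate, a task that also reads static input would not start working in the extra
superstep, because awaitStart returns false once the head has ended the iteration.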

> Pitfall due to additional superstep after the iteration has stopped
> -------------------------------------------------------------------
>                 Key: FLINK-909
>                 URL: https://issues.apache.org/jira/browse/FLINK-909
>             Project: Flink
>          Issue Type: Bug
>            Reporter: GitHub Import
>              Labels: github-import
>             Fix For: pre-apache
> Currently, after an iteration has exceeded the maximum number of iterations, all tasks
are started again for an additional superstep during which they are stopped. This works if
a task only waits for dynamic input. However, if a task, e.g. a coGroup operation, receives
both dynamic and static input, its execution is not blocked. This can then
lead to erroneous behaviour that the user is not aware of.
> I ran into this problem while implementing ALS. Here one has a loop that receives matrix
columns as dynamic input and matrix entries as static input. The columns and the entries are used to
construct a matrix which represents a system of linear equations. If the set of columns is
empty, then the matrix is singular and the system is not solvable. During the additional superstep
the task won't receive any columns but would still try to solve the now singular system.
> It would be good to finish the iteration without initiating this additional superstep.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/909
> Created by: [tillrohrmann|https://github.com/tillrohrmann]
> Labels: 
> Created at: Thu Jun 05 17:50:17 CEST 2014
> State: open
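
Until the additional superstep is gone, a user function can protect itself by checking the
dynamic input before doing any work. The following is a minimal sketch in plain Java, not
code from the ALS implementation: GuardedSolveStep, its coGroup method and the solve
placeholder are hypothetical names, and the signature only mimics a coGroup-style function.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the body of a coGroup-style user function. */
class GuardedSolveStep {

    /**
     * columns: dynamic input, empty during the additional superstep.
     * entries: static input, always present.
     * Emits nothing when no columns arrived instead of solving a singular system.
     */
    static List<double[]> coGroup(Iterable<double[]> columns, Iterable<double[]> entries) {
        List<double[]> out = new ArrayList<>();
        if (!columns.iterator().hasNext()) {
            return out; // no dynamic input: skip the solve entirely
        }
        for (double[] column : columns) {
            out.add(solve(column, entries));
        }
        return out;
    }

    /** Placeholder for the real linear solve; it only copies the column. */
    private static double[] solve(double[] column, Iterable<double[]> entries) {
        return column.clone();
    }
}
```

This only hides the symptom in one operator; it does not stop the extra superstep from
being started.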

This message was sent by Atlassian JIRA