flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] tillrohrmann commented on issue #6684: [FLINK-10205] Batch Job: InputSplit Fault tolerant for DataSource…
Date Wed, 17 Oct 2018 14:55:19 GMT
tillrohrmann commented on issue #6684:     [FLINK-10205] Batch Job: InputSplit Fault tolerant
for DataSource…
URL: https://github.com/apache/flink/pull/6684#issuecomment-430661796
 
 
   Alright, now the problem is a bit clearer to me. The underlying problem is that the `InputSplitAssigner's`
semantics in case of a failover are not well defined. This is mainly due to the fact that
Flink evolved over time.
   
   The general idea of the `InputSplitAssigner` is to lazily assign work to sources which
have completely consumed their current `InputSplit`. The order in which this happens should
not affect the correctness of the result.
   
   If you say that in case of a recovery the exact same `InputSplit` assignment needs to happen
again, then I think it must be because our sources have some kind of state. Otherwise, it
should not matter which source task completes the `InputSplit`, right? If this is correct,
then we would run into the same problem if a JM failure happens, because we would lose all
`InputSplit` assignment information which is stored on the JM. So stateful sources with `InputSplits`
don't work at the moment (in the general case).
   
   If we assume that our sources are stateless, then simply returning the input splits to
the assigner and letting the next idling task take it should work. In your example of the
infinite stream which is initialized via the `InputSplits` there would be no other task competing
for the `InputSplit` of a failed task because by definition they never finish their work,
right? If multiple tasks fail, then the mapping might be different after the recovery, but
every task would continue consuming from a single `InputSplit`. I think the problem here is
that you abused the `InputSplitAssigner` for something it is not yet intended to do.
   
   The reason why I'm a bit hesitant here is because I think we do not fully understand yet
what we actually want to have. Moreover, some corner cases not clear to me yet. For example,
why would it be ok for a global failover to change the mapping and not for region failover?
Another example is how to handle the case where we lose a TM and need to downscale. Would
that effectively be a global failover where we redistribute all `InputSplits` (I would think
so). 
   
   Before starting any concrete implementation steps, I think we should properly design this
feature to get it right. A very related topic is actually the new source interface. Depending
on how much we are able to unify batch and streaming, the whole `InputSplit` assignment might
move into a single task (similar to the `ContinuousFileMonitoringSink`) and the assignment
might become part of a checkpoint. That way, we would no longer need to take care of this
on the JM side.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

Mime
View raw message