flink-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] tillrohrmann commented on issue #6684: [FLINK-10205] Batch Job: InputSplit Fault tolerant for DataSource…
Date Wed, 17 Oct 2018 06:42:02 GMT
URL: https://github.com/apache/flink/pull/6684#issuecomment-430508466
 
 
   Before moving forward, I'd like to understand why it is strictly necessary that the failed
task reprocesses the same set of input splits. Is it because streaming sources can have state
which they would use to filter out already processed splits?
   
   In the batch case, this should not be a problem because it should not matter which task
processes which input split. If a failure occurs and some other task takes over the failed
task's input splits, it would be as if this task had processed these input splits from the very beginning.
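
   To make the batch argument concrete, here is a minimal sketch of the idea. It is not Flink's actual split assignment code: `SplitPool`, `run_job`, and the task ids are hypothetical names, and it assumes that the partial output of a failed task is discarded, so only the splits themselves need to be handed out again.

```python
from collections import deque


class SplitPool:
    """Hypothetical split assigner, not a Flink class."""

    def __init__(self, splits):
        self.unassigned = deque(splits)
        self.assigned = {}  # task id -> splits handed to that task

    def next_split(self, task_id):
        """Hand the next free split to the requesting task, or None if none are left."""
        if not self.unassigned:
            return None
        split = self.unassigned.popleft()
        self.assigned.setdefault(task_id, []).append(split)
        return split

    def task_failed(self, task_id):
        """Put the failed task's splits back so that any task can pick them up."""
        self.unassigned.extend(self.assigned.pop(task_id, []))


def run_job(splits, failing_task=None):
    """Return the set of splits that end up processed by the surviving task."""
    pool = SplitPool(splits)
    if failing_task is not None:
        # The failing task takes two splits and dies; its output is discarded (assumption).
        pool.next_split(failing_task)
        pool.next_split(failing_task)
        pool.task_failed(failing_task)
    processed = set()
    # The surviving task drains whatever is left, including the returned splits.
    while (split := pool.next_split("task-1")) is not None:
        processed.add(split)
    return processed
```

   Under these assumptions, `run_job(range(6))` and `run_job(range(6), failing_task="task-0")` process the same set of splits, even though the failed task never sees its original splits again.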

