flink-issues mailing list archives

From " Mario Georgiev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-12172) Flink fails to close pending BucketingSink
Date Tue, 16 Apr 2019 08:59:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818792#comment-16818792
] 

 Mario Georgiev commented on FLINK-12172:
-----------------------------------------

Hello,

Sure, but does Flink have built-in support for filtering a DataSet/DataStream so that it
does not read .pending files and automatically handles .valid-length files?
It is indeed very strange. How do you handle those files? Do you disregard .pending files
completely? That would surely make the window computation invalid. Does it mean you also
have to disregard the files that were already marked as finished and were part of the same
windowed computation?

Could anyone clarify these questions?
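
To make the question concrete, here is a minimal sketch of the manual filtering a downstream
reader would otherwise need. It assumes the sink uses BucketingSink's default suffixes
(.in-progress, .pending, .valid-length); a job can override these, so treat the constants as
assumptions. The function names are hypothetical, and truncating a part file to the length
recorded in its .valid-length companion is deliberately left out.

```python
from pathlib import PurePosixPath

# Assumed BucketingSink default suffixes; a job may configure different ones.
IN_PROGRESS_SUFFIX = ".in-progress"
PENDING_SUFFIX = ".pending"
VALID_LENGTH_SUFFIX = ".valid-length"

_NON_FINAL_SUFFIXES = (IN_PROGRESS_SUFFIX, PENDING_SUFFIX, VALID_LENGTH_SUFFIX)


def is_finished_part_file(path: str) -> bool:
    """Return True if the file name carries none of the non-final suffixes."""
    name = PurePosixPath(path).name
    return not name.endswith(_NON_FINAL_SUFFIXES)


def finished_part_files(paths):
    """Keep only the paths that look like finished part files, preserving order."""
    return [p for p in paths if is_finished_part_file(p)]
```

A reader would apply finished_part_files to a bucket's directory listing before opening
anything. Whether skipping .pending files this way gives a correct result is exactly the
open question above.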

> Flink fails to close pending BucketingSink
> ------------------------------------------
>
>                 Key: FLINK-12172
>                 URL: https://issues.apache.org/jira/browse/FLINK-12172
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.7.2
>            Reporter:  Mario Georgiev
>            Priority: Major
>
> Hello,
> The problem is if you have a BucketingSink, the following case may occur:
> Let's say you have a 2019-04-12--12 bucket created with several files inside which are
> pending/finished
>  You create a savepoint and shut down the job
>  After an hour for instance you start the job from the savepoint and a new bucket is
> created, 2019-04-16 for instance.
>  The problem is that the .pending ones from the old buckets seem to never be moved to
> finished state if there is a new hourly bucket created.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
