flink-issues mailing list archives

From "Aljoscha Krettek (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7266) Don't attempt to delete parent directory on S3
Date Wed, 06 Sep 2017 10:24:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155151#comment-16155151 ]

Aljoscha Krettek commented on FLINK-7266:

I think that's only part of the problem, because Flink must itself check whether the directory
is empty before it can delete it.

The basic problem is that each state handle is being cleaned up individually. If we had global
knowledge that all state handles actually reside in one base directory, then we could fire
off an asynchronous command that deletes that whole sub-directory. (Which might still be horribly
slow on S3 and not solve the problem at all.)
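
The difference between the two cleanup strategies can be sketched as follows. This is a minimal illustration in Python against a local directory, not Flink's code: the function names (`discard_individually`, `discard_base_directory`, `make_checkpoint`) are hypothetical, and a local recursive delete says nothing about how the equivalent operation would perform on S3.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def make_checkpoint(root, name, count):
    """Create a base directory holding `count` dummy state files (test fixture)."""
    base = Path(root) / name
    base.mkdir()
    handles = [base / f"state-{i}" for i in range(count)]
    for handle in handles:
        handle.write_bytes(b"x")
    return base, handles


def discard_individually(handles):
    """Per-handle cleanup: delete each file, then probe its parent directory.

    Returns the number of parent-directory probes, which grows with the
    number of handles. On an object store each probe is a list request.
    """
    parent_checks = 0
    for handle in handles:
        handle.unlink()
        parent = handle.parent
        parent_checks += 1
        if parent.exists() and not any(parent.iterdir()):
            parent.rmdir()
    return parent_checks


def discard_base_directory(base_dir, executor):
    """Bulk cleanup: submit one recursive delete and return its future.

    Only valid under the assumption that every state handle lives
    somewhere below `base_dir`.
    """
    return executor.submit(shutil.rmtree, base_dir)
```

With five handles, the first variant probes the parent directory five times, while the second issues a single asynchronous call whose future can be awaited or ignored by the caller.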

> Don't attempt to delete parent directory on S3
> ----------------------------------------------
>                 Key: FLINK-7266
>                 URL: https://issues.apache.org/jira/browse/FLINK-7266
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.3.1
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Critical
>             Fix For: 1.4.0, 1.3.2
> Currently, every attempted release of an S3 state object also checks if the "parent directory" is empty and then tries to delete it.
> Not only is that unnecessary on S3, but it is prohibitively expensive and for example causes S3 to throttle calls by the JobManager on checkpoint cleanup.
> The {{FileState}} must only attempt parent directory cleanup when operating against real file systems, not when operating against object stores.
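
The behaviour requested in the issue description could look roughly like the sketch below. It is written in Python against a local directory purely for illustration; `StoreKind` and `discard_state` are hypothetical names, not Flink classes, and the way a file system would report its kind is an assumption.

```python
import tempfile
from enum import Enum
from pathlib import Path


class StoreKind(Enum):
    """Hypothetical marker for what kind of storage a path lives on."""
    FILE_SYSTEM = "file_system"
    OBJECT_STORE = "object_store"


def discard_state(path, kind):
    """Delete one state file; return True if the parent directory was removed too.

    On an object store there are no real directories, so the emptiness
    probe and the parent delete are skipped entirely.
    """
    path.unlink()
    if kind is StoreKind.OBJECT_STORE:
        return False
    parent = path.parent
    if any(parent.iterdir()):
        return False  # siblings remain, keep the directory
    parent.rmdir()
    return True
```

The object-store branch returns before touching the parent at all, which is the point of the issue: the saving comes from not issuing the probe, not merely from not deleting.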
