flink-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7266) Don't attempt to delete parent directory on S3
Date Tue, 05 Sep 2017 16:21:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153910#comment-16153910 ]

Steve Loughran commented on FLINK-7266:
---------------------------------------

FWIW, in s3a we create a single delete request to rm all parent paths *and don't bother doing
the existence check*. 

That is, for a file a/b/c.txt, after the file is written in close(), POST a delete list of

/a/
/a/b

It's ~O(1) with respect to depth, and as you don't need to wait for the response, it's even
something you could do asynchronously.
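
As a rough illustration of the idea above (not the actual s3a code), the list of parent
paths can be computed from the object key alone, with no existence checks, and packed into
one bulk delete request. The function names are hypothetical, and writing the parent entries
as keys with a trailing slash is an assumption. The dictionary has the same shape as the
`Delete` argument of boto3's `delete_objects`, but nothing is sent here.

```python
def parent_marker_keys(object_key):
    """Return one key per ancestor "directory" of object_key, shallowest first.

    Assumption: parent entries are written as keys ending in "/".
    """
    # Drop the final path element (the file itself); keep only the ancestors.
    parts = object_key.strip("/").split("/")[:-1]
    return ["/".join(parts[:i]) + "/" for i in range(1, len(parts) + 1)]


def build_bulk_delete(object_key):
    """Build a single multi-object delete payload covering all parent paths.

    One request regardless of depth; no HEAD/LIST calls are made beforehand.
    """
    return {
        "Objects": [{"Key": key} for key in parent_marker_keys(object_key)],
        "Quiet": True,  # the caller does not need per-key results
    }


payload = build_bulk_delete("a/b/c.txt")
```

Because the caller does not depend on the result, a payload like this could be handed to a
background executor instead of being sent inline in close().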

> Don't attempt to delete parent directory on S3
> ----------------------------------------------
>
>                 Key: FLINK-7266
>                 URL: https://issues.apache.org/jira/browse/FLINK-7266
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.3.1
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Critical
>             Fix For: 1.4.0, 1.3.2
>
>
> Currently, every attempted release of an S3 state object also checks if the "parent directory" is empty and then tries to delete it.
> Not only is that unnecessary on S3, but it is prohibitively expensive and for example causes S3 to throttle calls by the JobManager on checkpoint cleanup.
> The {{FileState}} must only attempt parent directory cleanup when operating against real file systems, not when operating against object stores.
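
A minimal sketch of the guard the description asks for, assuming the decision can be made
from the URI scheme of the state path. This is not Flink's implementation: the function name
and the set of schemes treated as object stores are illustrative assumptions.

```python
from urllib.parse import urlparse

# Assumption: schemes treated as object stores, for illustration only.
OBJECT_STORE_SCHEMES = {"s3", "s3a", "s3n"}


def should_clean_parent_dir(path_uri):
    """Return True only when the path lives on a real file system."""
    # A bare path such as "/tmp/x" has no scheme; treat it as a local file.
    scheme = urlparse(path_uri).scheme or "file"
    return scheme not in OBJECT_STORE_SCHEMES
```

With a check like this, releasing a state object under `s3a://` would skip the parent
directory probe entirely, while `file://` and `hdfs://` paths would keep the existing cleanup.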



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
