flink-issues mailing list archives

From "Elias Levy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7266) Don't attempt to delete parent directory on S3
Date Sat, 16 Sep 2017 00:35:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168733#comment-16168733 ]

Elias Levy commented on FLINK-7266:
-----------------------------------

What is the current state of this issue?  It is still a problem on 1.3.2, which makes using
S3 with the file system state backend very imprudent in production.  You end up with thousands
of empty "directories" in S3 for the checkpoints:

{code}
$ sudo aws s3 ls --recursive s3://bucket/flink/checkpoints/58c7604fbc543b6df75b62601a9b4c9d/

2017-09-15 23:03:15          0 flink/checkpoints/58c7604fbc543b6df75b62601a9b4c9d/chk-1/
2017-09-15 23:04:15          0 flink/checkpoints/58c7604fbc543b6df75b62601a9b4c9d/chk-10/
2017-09-15 23:14:07          0 flink/checkpoints/58c7604fbc543b6df75b62601a9b4c9d/chk-100/
2017-09-15 23:14:14          0 flink/checkpoints/58c7604fbc543b6df75b62601a9b4c9d/chk-101/
2017-09-15 23:14:20          0 flink/checkpoints/58c7604fbc543b6df75b62601a9b4c9d/chk-102/
2017-09-15 23:15:12          0 flink/checkpoints/58c7604fbc543b6df75b62601a9b4c9d/chk-103/
2017-09-15 23:15:18          0 flink/checkpoints/58c7604fbc543b6df75b62601a9b4c9d/chk-104/
...
{code}

> Don't attempt to delete parent directory on S3
> ----------------------------------------------
>
>                 Key: FLINK-7266
>                 URL: https://issues.apache.org/jira/browse/FLINK-7266
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.3.1
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Critical
>             Fix For: 1.4.0, 1.3.2
>
>
> Currently, every attempted release of an S3 state object also checks if the "parent directory"
> is empty and then tries to delete it.
> Not only is that unnecessary on S3, but it is prohibitively expensive and for example
> causes S3 to throttle calls by the JobManager on checkpoint cleanup.
> The {{FileState}} must only attempt parent directory cleanup when operating against real
> file systems, not when operating against object stores.
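
The rule in the issue description can be illustrated with a small, self-contained sketch. This is not Flink's actual code: the class name {{ParentDirCleanupSketch}}, the {{StoreKind}} enum, the helper methods, and the list of object-store URI schemes are all hypothetical and chosen only for illustration. The idea shown is to decide from the target's kind whether parent directory cleanup is attempted at all, so that no extra calls are issued against an object store.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

public class ParentDirCleanupSketch {

    /** Hypothetical classification of a storage target; not Flink's API. */
    enum StoreKind { FILE_SYSTEM, OBJECT_STORE }

    // Assumption for this sketch: these URI schemes denote object stores.
    private static final Set<String> OBJECT_STORE_SCHEMES = Set.of("s3", "s3a", "s3n");

    /** Classifies a URI by scheme; a missing scheme is treated as a local file. */
    static StoreKind kindOf(URI uri) {
        String scheme = uri.getScheme() == null
                ? "file"
                : uri.getScheme().toLowerCase(Locale.ROOT);
        return OBJECT_STORE_SCHEMES.contains(scheme)
                ? StoreKind.OBJECT_STORE
                : StoreKind.FILE_SYSTEM;
    }

    /** Parent cleanup is only attempted for real file systems. */
    static boolean shouldCleanUpParent(URI stateFile) {
        return kindOf(stateFile) == StoreKind.FILE_SYSTEM;
    }

    /** Deletes a local state file, then its parent directory if it is now empty. */
    static void discardLocalState(Path stateFile) throws IOException {
        Files.deleteIfExists(stateFile);
        Path parent = stateFile.getParent();
        if (parent == null) {
            return;
        }
        try {
            Files.deleteIfExists(parent);
        } catch (DirectoryNotEmptyException e) {
            // Other state files still live in this directory; leave it alone.
        }
    }

    public static void main(String[] args) {
        URI s3State = URI.create("s3://bucket/flink/checkpoints/chk-1/state");
        URI localState = URI.create("file:///tmp/flink/checkpoints/chk-1/state");
        System.out.println(s3State + " -> cleanup parent: " + shouldCleanUpParent(s3State));
        System.out.println(localState + " -> cleanup parent: " + shouldCleanUpParent(localState));
    }
}
```

In this sketch a caller would check {{shouldCleanUpParent}} before calling {{discardLocalState}}, and for an object store would delete only the state object itself.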



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
