flink-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5706) Implement Flink's own S3 filesystem
Date Sat, 04 Mar 2017 18:26:45 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895808#comment-15895808 ]

Steve Loughran commented on FLINK-5706:

I should add that my current stance on using S3 as a direct destination of work is "don't".
It will usually work at small scale, but on top of the commit delays, any kind of failure
causes problems. "Direct" committers like the one added to Spark are dangerous for that reason,
and, because of listing inconsistency, you can't safely chain work from one query to another.

> Implement Flink's own S3 filesystem
> -----------------------------------
>                 Key: FLINK-5706
>                 URL: https://issues.apache.org/jira/browse/FLINK-5706
>             Project: Flink
>          Issue Type: New Feature
>          Components: filesystem-connector
>            Reporter: Stephan Ewen
> As part of the effort to make Flink completely independent from Hadoop, Flink needs its
> own S3 filesystem implementation. Currently Flink relies on Hadoop's S3a and S3n file systems.
> A dedicated S3 file system can be implemented using the AWS SDK. As the basis of the implementation,
> the Hadoop File System can be used (Apache Licensed, so it should be okay to reuse some code as
> long as we provide proper attribution).

This message was sent by Atlassian JIRA
