flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4228) YARN artifact upload does not work with S3AFileSystem
Date Thu, 02 Nov 2017 20:58:01 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236588#comment-16236588 ]

ASF GitHub Bot commented on FLINK-4228:
---------------------------------------

Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4939#discussion_r148656918
  
    --- Diff: flink-filesystems/flink-s3-fs-hadoop/src/test/java/org/apache/flink/fs/s3hadoop/HadoopS3FileSystemITCase.java
---
    @@ -57,11 +62,52 @@
     	private static final String ACCESS_KEY = System.getenv("ARTIFACTS_AWS_ACCESS_KEY");
     	private static final String SECRET_KEY = System.getenv("ARTIFACTS_AWS_SECRET_KEY");
     
    +	@Rule
    +	public TemporaryFolder tempFolder = new TemporaryFolder();
    +
     	@BeforeClass
    -	public static void checkIfCredentialsArePresent() {
    +	public static void checkCredentialsAndSetup() throws IOException {
    +		// check whether credentials exist
     		Assume.assumeTrue("AWS S3 bucket not configured, skipping test...", BUCKET != null);
     		Assume.assumeTrue("AWS S3 access key not configured, skipping test...", ACCESS_KEY != null);
     		Assume.assumeTrue("AWS S3 secret key not configured, skipping test...", SECRET_KEY != null);
    +
    +		// initialize configuration with valid credentials
    --- End diff --
    
    I would suggest moving this out of the "setup" method and into the actual test.
    The setup logic is not shared (none of the other test methods rely on it), and it also assumes the existence of functionality that is tested in other test methods.
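The scoping suggested above can be sketched as follows. This is a hypothetical illustration, not the actual Flink test code: a plain `Map` stands in for Flink's `Configuration`, plain methods stand in for JUnit's `@BeforeClass` and `@Test`, and the key names are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the review suggestion: keep only the precondition check that ALL
// tests share at class-setup level, and move the configuration bootstrap into
// the single test that depends on it.
public class SetupScopeSketch {

    // stands in for the @BeforeClass check: shared by every test, so it stays
    static boolean credentialsPresent(String bucket, String accessKey, String secretKey) {
        return bucket != null && accessKey != null && secretKey != null;
    }

    // stands in for the setup the reviewer wants moved: only one test needs an
    // initialized file system configuration, so that test builds it itself
    static Map<String, String> initS3Config(String accessKey, String secretKey) {
        Map<String, String> conf = new HashMap<>();
        conf.put("s3.access.key", accessKey); // hypothetical keys for illustration
        conf.put("s3.secret.key", secretKey);
        return conf;
    }

    public static void main(String[] args) {
        // class-level: skip everything when credentials are absent
        if (!credentialsPresent("test-bucket", "test-access", "test-secret")) {
            System.out.println("skipping tests: credentials not configured");
            return;
        }
        // test-level: the test that needs a configured file system sets it up inline
        Map<String, String> conf = initS3Config("test-access", "test-secret");
        System.out.println("configured entries: " + conf.size()); // prints "configured entries: 2"
    }
}
```

This keeps each test self-contained: a failure in the configuration path then shows up only in the test that exercises it, instead of aborting the whole class.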


> YARN artifact upload does not work with S3AFileSystem
> -----------------------------------------------------
>
>                 Key: FLINK-4228
>                 URL: https://issues.apache.org/jira/browse/FLINK-4228
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Priority: Blocker
>             Fix For: 1.4.0
>
>
> The issue is now exclusive to running on YARN with s3a:// as the configured FileSystem. In that case, the Flink session fails while staging itself, because it tries to copy the flink/lib directory to S3 and the S3AFileSystem does not support recursive copy.
> h2. Old Issue
> Using the {{RocksDBStateBackend}} with semi-async snapshots (the current default) leads to an exception when uploading the snapshot to S3 with the {{S3AFileSystem}}.
> {code}
> AsynchronousException{com.amazonaws.AmazonClientException: Unable to calculate MD5 hash: /var/folders/_c/5tc5q5q55qjcjtqwlwvwd1m00000gn/T/flink-io-5640e9f1-3ea4-4a0f-b4d9-3ce9fbd98d8a/7c6e745df2dddc6eb70def1240779e44/StreamFlatMap_3_0/dummy_state/47daaf2a-150c-4208-aa4b-409927e9e5b7/local-chk-2886 (Is a directory)}
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointThread.run(StreamTask.java:870)
> Caused by: com.amazonaws.AmazonClientException: Unable to calculate MD5 hash: /var/folders/_c/5tc5q5q55qjcjtqwlwvwd1m00000gn/T/flink-io-5640e9f1-3ea4-4a0f-b4d9-3ce9fbd98d8a/7c6e745df2dddc6eb70def1240779e44/StreamFlatMap_3_0/dummy_state/47daaf2a-150c-4208-aa4b-409927e9e5b7/local-chk-2886 (Is a directory)
> 	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1298)
> 	at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:108)
> 	at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:100)
> 	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.upload(UploadMonitor.java:192)
> 	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:150)
> 	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: /var/folders/_c/5tc5q5q55qjcjtqwlwvwd1m00000gn/T/flink-io-5640e9f1-3ea4-4a0f-b4d9-3ce9fbd98d8a/7c6e745df2dddc6eb70def1240779e44/StreamFlatMap_3_0/dummy_state/47daaf2a-150c-4208-aa4b-409927e9e5b7/local-chk-2886 (Is a directory)
> 	at java.io.FileInputStream.open0(Native Method)
> 	at java.io.FileInputStream.open(FileInputStream.java:195)
> 	at java.io.FileInputStream.<init>(FileInputStream.java:138)
> 	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1294)
> 	... 9 more
> {code}
> Running with S3NFileSystem, the error does not occur. The problem might be that {{HDFSCopyToLocal}} assumes sub-folders are going to be created automatically. We might need to manually create the folders and copy only the actual files for {{S3AFileSystem}}. More investigation is required.
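The manual-copy workaround suggested above can be sketched with local paths. This is a hypothetical illustration using `java.nio.file` only; the real fix would drive Hadoop's `FileSystem` API against the S3A target, and the class and directory names here are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Sketch of the idea: instead of handing a directory to a file system that
// cannot copy recursively (the MD5 step fails on directories), walk the tree,
// create each target directory explicitly, and copy only regular files.
public class PerFileCopySketch {

    static int copyTree(Path source, Path target) throws IOException {
        int filesCopied = 0;
        try (Stream<Path> paths = Files.walk(source)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                Path dest = target.resolve(source.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest); // create folders manually
                } else {
                    Files.createDirectories(dest.getParent());
                    // copy actual files only, never directories
                    Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                    filesCopied++;
                }
            }
        }
        return filesCopied;
    }

    public static void main(String[] args) throws IOException {
        // build a tiny stand-in for flink/lib with one nested directory
        Path src = Files.createTempDirectory("lib-src");
        Files.createDirectories(src.resolve("sub"));
        Files.write(src.resolve("flink-dist.jar"), new byte[] {1, 2, 3});
        Files.write(src.resolve("sub/log4j.properties"), new byte[] {4});

        Path dst = Files.createTempDirectory("lib-dst");
        System.out.println("files copied: " + copyTree(src, dst)); // prints "files copied: 2"
    }
}
```

Because every upload targets a regular file, the "Unable to calculate MD5 hash: ... (Is a directory)" failure mode cannot occur with this shape of copy.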



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
