beam-commits mailing list archives

From François Wagner (JIRA) <j...@apache.org>
Subject [jira] [Commented] (BEAM-2429) Conflicting filesystems with used of HadoopFileSystem
Date Mon, 12 Jun 2017 12:43:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16046511#comment-16046511 ]

François Wagner commented on BEAM-2429:
---------------------------------------

Hi Flavio, thanks for your input. It worked right away once I added "fs.defaultFS". Maybe
this could be mentioned somewhere in the documentation, as it's not obvious that this option
is required to handle "hdfs://" URIs; moreover, that was not the case with the previous version
of HdfsIO. Thanks a lot for your help. François
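
For illustration, here is a minimal sketch of how "fs.defaultFS" might be passed through the same `--hdfsConfiguration` argument used in the snippet quoted below. The namenode address `hdfs://namenode:8020` is a hypothetical placeholder, and the class `HdfsConfigurationArg` and its `build` method are illustrative helpers, not part of Beam. The sketch only assembles the argument string, so it runs without Beam on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Beam): assembles the --hdfsConfiguration
// argument as a JSON list holding one object of Hadoop configuration entries.
public class HdfsConfigurationArg {

    // Assumes keys and values contain no quotes or backslashes (no JSON escaping is done).
    public static String build(Map<String, String> conf) {
        String body = conf.entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\": \"" + e.getValue() + "\"")
            .collect(Collectors.joining(", "));
        return "--hdfsConfiguration=[{" + body + "}]";
    }

    public static void main(String[] unused) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Assumed namenode address; replace with the address of your own cluster.
        conf.put("fs.defaultFS", "hdfs://namenode:8020");
        conf.put("dfs.client.use.datanode.hostname", "true");

        String[] args = new String[] {build(conf)};
        System.out.println(args[0]);
        // With Beam on the classpath, args would then be handed to
        // PipelineOptionsFactory.fromArgs(args).withValidation()
        //     .as(HadoopFileSystemOptions.class), as in the quoted snippet.
    }
}
```

Building the argument from a map keeps the entries in one place; writing the JSON string by hand, as in the quoted snippet, works equally well.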

> Conflicting filesystems with used of HadoopFileSystem
> -----------------------------------------------------
>
>                 Key: BEAM-2429
>                 URL: https://issues.apache.org/jira/browse/BEAM-2429
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>    Affects Versions: 2.0.0
>            Reporter: François Wagner
>            Assignee: Flavio Fiszman
>
> I'm facing an issue when trying to use HadoopFileSystem in my pipeline. It looks like HadoopFileSystem
> is registering itself under the `file` scheme (https://github.com/apache/beam/pull/2777/files#diff-330bd0854dcab6037ef0e52c05d68eb2L79),
> hence the following exception is thrown when trying to register HadoopFileSystem:
> java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: [org.apache.beam.sdk.io.LocalFileSystem, org.apache.beam.sdk.io.hdfs.HadoopFileSystem]
> 	at org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique(FileSystems.java:498)
> What is the correct way to handle `hdfs` URLs out of the box with TextIO & AvroIO?
> {code:java}
>     String[] args = new String[]{
>         "--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": \"true\"}]"};
>     HadoopFileSystemOptions options = PipelineOptionsFactory
>         .fromArgs(args)
>         .withValidation()
>         .as(HadoopFileSystemOptions.class);
>     Pipeline pipeline = Pipeline.create(options); 
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
