spark-user mailing list archives

From Sunita Arvind <>
Subject Change the owner of hdfs file being saved
Date Thu, 02 Nov 2017 16:35:41 GMT
Hello Experts,

I am required to use a specific user ID to save files on a remote HDFS
cluster. By remote I mean that the Spark jobs run on EMR and write to a CDH
cluster, so I cannot change hdfs-site.xml etc. to point to the
destination cluster. As a result I am using WebHDFS to save the files into

I have a few challenges with this approach:
1. I cannot use the nameservice of the namenode and have to specify the IP
address of the active namenode, which is risky if there is a failover

2. I cannot change the owner/group of the files written by Spark. I
see no option to provide an owner for files being written (

3. Using JDBC so that I can specify the user name and password would mean
I end up creating managed tables only. This is not acceptable for our

Is there a way to change the owner of files written by Spark?
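
For reference, one possible direction is the WebHDFS REST API itself, which has a SETOWNER operation that could be called after the Spark write finishes. The sketch below only builds the request URL and is not a tested solution for this setup. The helper name webhdfs_setowner_url, the host, the path, the user names and the default port 50070 are all assumptions made for illustration. It also assumes the cluster uses simple (non-Kerberos) authentication, since the user.name parameter is only honored in that case, and that the authenticating user is allowed to change ownership (in HDFS that normally requires a superuser).

```python
from urllib.parse import quote, urlencode


def webhdfs_setowner_url(namenode, path, owner=None, group=None,
                         auth_user=None, port=50070):
    """Build the URL for a WebHDFS SETOWNER request (sent with HTTP PUT).

    All argument values are supplied by the caller; nothing here is
    specific to the clusters described in this message.
    """
    if owner is None and group is None:
        raise ValueError("provide an owner, a group, or both")
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /data/out")

    params = [("op", "SETOWNER")]
    if owner is not None:
        params.append(("owner", owner))
    if group is not None:
        params.append(("group", group))
    if auth_user is not None:
        # Assumption: simple authentication; ignored on Kerberos clusters.
        params.append(("user.name", auth_user))

    # quote() keeps "/" intact and escapes other unsafe path characters.
    return "http://{}:{}/webhdfs/v1{}?{}".format(
        namenode, port, quote(path), urlencode(params))
```

The resulting URL would be sent as a PUT request, for example with urllib.request.Request(url, method="PUT"). This does not address the failover concern in point 1, because the request still targets a single namenode address.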

