sqoop-user mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: Issue while running sqoop script parallel
Date Sun, 20 Dec 2015 19:22:44 GMT
Hello Saravanan,

HDFS implements a single-writer model for its files, so if 2 clients concurrently try to open
the same file path for write or append, then one of them will receive an error.  It looks
to me like tasks from 2 different job submissions collided on the same path.  I think you're
on the right track investigating why this application used the same temp directory.  Is the
temp directory controlled by the parameters that you pass to your script? Do you know how the "055830" gets determined in this example?
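One common fix is to make the staging directory unique per invocation inside the generic wrapper script. This is only a sketch under assumptions: the `/user/svc_it` base path comes from the error message, but the `unique_temp_dir` helper and its naming scheme (epoch seconds + PID + random suffix) are hypothetical, since we don't yet know how the real "055830" suffix is generated.

```shell
# Hypothetical helper for a generic Sqoop wrapper script: build a
# staging directory name that two parallel invocations cannot share.
unique_temp_dir() {
  local base="/user/svc_it"
  # Epoch seconds + shell PID + a random value: parallel runs get
  # different PIDs, and retries in the same shell get new randoms.
  echo "${base}/temp_$(date +%s)_$$_${RANDOM}"
}

STAGING_DIR="$(unique_temp_dir)"
echo "using staging dir: ${STAGING_DIR}"
```

The resulting path could then be passed to the job (for example via a `--target-dir`-style parameter, depending on how your script hands the temp location to TDCH).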

--Chris Nauroth

From: Saravanan Nagarajan <saravanan.nagarajan303@gmail.com>
Date: Thursday, December 17, 2015 at 9:18 PM
To: user@sqoop.apache.org, user@hadoop.apache.org
Subject: Issue while running sqoop script parallel


Need your expert guidance to resolve a Sqoop script error. We are using Sqoop, invoking TDCH (Teradata Hadoop Connector), to archive data from Teradata into Hadoop Hive tables.

We have created a generic Sqoop script that accepts the source DB, view name, and target name as input parameters and loads into Hive tables. If I invoke the same script in parallel with different sets of parameters, both instances of the script fail with the error below.


Error: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException):
Failed to create file [/user/svc_it/temp_055830/part-m-00000] for [DFSClient_attempt_1439568235974_1009_m_000000_1_909694314_1]
for client [], because this file is already being created by [DFSClient_attempt_1439568235974_1010_m_000000_0_-997224822_1]
on []


The issue is in the map step, while it is trying to write the file to HDFS. It looks like one instance is trying to overwrite the files created by the other instance, because the temp folder created by the mapper has the same name (/user/svc_it/temp_055830).

Please let me know how to fix this issue.

NS Saravanan
