sqoop-user mailing list archives

From Nitin Kumar <nk94.nitinku...@gmail.com>
Subject Network resilience of sqoop 1.4.6
Date Mon, 25 Jan 2016 06:30:05 GMT
I am using Apache Sqoop 1.4.6 (distributed with the Hortonworks HDP 2.3
package) to import and export data between RDBMS systems and HDFS. I have
to deploy this in a production environment and was wondering about the
network resilience of Sqoop.

Say I'm done with about 90% of the import/export job and there is a network
failure between the RDBMS system and my Hadoop cluster. Since Sqoop
internally executes a map/reduce job for this, I'm guessing the job will
fail completely and require a manual restart. In this regard I have the
following questions:

   1. Does Sqoop perform a clean up of the already imported/exported data?
   2. Does Sqoop automatically restart the job in the case of a network
   failure?
   3. If a manual clean up and restart is required, what other technology
   alongside Sqoop do people generally use to achieve network resilience?
   4. Is there a different version of Sqoop that offers this feature?

Your answers and suggestions would be highly appreciated.

