I got the data load working, but I don’t think I did it the right way. Here is what I tried (sorry for the long email):
1. Hadoop 2.5.1 + sqoop-1.4.5.bin__hadoop-2.0.4-alpha: This failed with the following error. I saw the same issue asked on Stack Overflow, but with no fix. It is most likely due to a jar file mismatch.
Note: /tmp/sqoop-sas/compile/1ee9265317c9d91a077060f857c3e726/TEMP_ADDRESS.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/11/24 23:05:43 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-sas/compile/1ee9265317c9d91a077060f857c3e726/TEMP_ADDRESS.jar
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/InputFormat
at java.lang.ClassLoader.defineClass1(Native Method)
2. Hadoop 2.5.1 + sqoop-1.4.5.bin__hadoop-1.0.0: This also failed, with a class version mismatch error. So I also downloaded Hadoop 1.2.1 and specified it as the HADOOP_MAPRED_PATH in the Sqoop configuration (to pick up the hadoop_core.jar). This worked for both HDFS and Hive imports from Oracle :)
However, there are two issues:
2.1: I don’t want to maintain two versions of Hadoop.
2.2: The imported file does not actually land on the HDFS that I have set up; it goes to the local file system instead. For example, when I load a file into a Hive table in "hive", it is visible at 'hdfs://finattr-comp-dev-01:9999/apps/sas/hive/warehouse/<table>', whereas with the Sqoop import the file lands in /apps/sas/hive/warehouse/<table>.
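
One possible explanation for 2.2, offered as a guess rather than a confirmed diagnosis: an unqualified path such as /apps/sas/hive/warehouse/<table> is resolved against the default file system named in core-site.xml (fs.defaultFS in Hadoop 2.x, fs.default.name in 1.x). If the Hadoop 1.2.1 install that Sqoop now points at has no such setting, Hadoop falls back to file:///, which would put the import on local disk. The Python sketch below only imitates that resolution rule for illustration. The helper names read_default_fs and qualify and the table name my_table are hypothetical, and the snippet does not call any Hadoop or Sqoop API.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hadoop's built-in default file system when no core-site.xml overrides it.
LOCAL_DEFAULT_FS = "file:///"


def read_default_fs(core_site_xml):
    """Return the default file system named in core-site.xml text.

    Checks fs.defaultFS (Hadoop 2.x) first, then fs.default.name (Hadoop 1.x),
    and falls back to the local file system if neither is set.
    """
    root = ET.fromstring(core_site_xml)
    props = {p.findtext("name"): p.findtext("value") for p in root.iter("property")}
    return props.get("fs.defaultFS") or props.get("fs.default.name") or LOCAL_DEFAULT_FS


def qualify(path, default_fs):
    """Imitate how an unqualified path is resolved against the default file system."""
    if urlparse(path).scheme:
        return path  # already fully qualified, e.g. hdfs://host:port/...
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# core-site.xml as the Hadoop 2.5.1 cluster would presumably have it (assumed content).
CLUSTER_CONF = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://finattr-comp-dev-01:9999</value>
  </property>
</configuration>
"""

# A configuration with no default file system set (assumed for the 1.2.1 install).
EMPTY_CONF = "<configuration/>"

# my_table is a hypothetical table name standing in for <table>.
WAREHOUSE_PATH = "/apps/sas/hive/warehouse/my_table"

on_cluster = qualify(WAREHOUSE_PATH, read_default_fs(CLUSTER_CONF))
on_local = qualify(WAREHOUSE_PATH, read_default_fs(EMPTY_CONF))
print(on_cluster)
print(on_local)
```

If this guess is right, the two printed locations differ only because of which configuration was read, which matches the symptom: the same warehouse path ends up on HDFS through Hive and on the local file system through Sqoop.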