sqoop-dev mailing list archives

From "Bharath (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SQOOP-3234) Using Sqoop API with hive import option fails while loading imported data into hive
Date Sun, 03 Sep 2017 20:09:00 GMT
Bharath created SQOOP-3234:
------------------------------

             Summary: Using Sqoop API with hive import option fails while loading imported
data into hive
                 Key: SQOOP-3234
                 URL: https://issues.apache.org/jira/browse/SQOOP-3234
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.4.6
            Reporter: Bharath


I am trying to run Sqoop from a Java program, using the *SqoopOptions* and *ImportTool*
classes, to import a table from a Postgres database into a Hive table. Running the equivalent
sqoop command from the command line works perfectly fine and imports the table into Hive. The
problem with the Sqoop API is that after the map tasks finish and the data lands in the HDFS
target directory, the step that loads the data into the Hive managed table fails: Hive complains
that the directory "file:/user/hive/warehouse/lineitem" doesn't exist. From the error message,
it is clear that Hive is looking for "/user/hive/warehouse/lineitem" on my local
filesystem instead of HDFS, even though I provided all the necessary conf files before invoking
Sqoop.
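
For what it's worth, the "file:" prefix in the error is consistent with an unqualified warehouse path being resolved against a file:/ default filesystem, which is what Hadoop's Configuration falls back to when fs.defaultFS is unset. A minimal stdlib sketch of that resolution, using plain java.net.URI rather than Hadoop's Path (an illustration of the resolution rule, not Hadoop's actual code):

```java
import java.net.URI;

public class DefaultFsDemo {
    public static void main(String[] args) {
        // With no fs.defaultFS configured, the default filesystem is file:/,
        // so an unqualified warehouse path resolves onto the local filesystem.
        URI defaultFs = URI.create("file:/");
        URI resolved = defaultFs.resolve("/user/hive/warehouse/lineitem");
        System.out.println(resolved); // file:/user/hive/warehouse/lineitem
    }
}
```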

Here is a miniature version of the sqoop program I am using:

{code:java}
import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.tool.ImportTool;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class sqoopexperiments {

    protected static Configuration getConfiguration() {

        Configuration conf = new Configuration();

        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/core-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/yarn-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/hdfs-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/mapred-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hive/1.2.1/libexec/conf/hive-site.xml"));

        return conf;
    }

    private static SqoopOptions SqoopOptions = new SqoopOptions(getConfiguration());
    private static final String connectionString = "jdbc:postgresql://127.0.0.1:5432/sales";
    private static final String username = "unifi";
    private static final String password = "unifi";

    private static int executeSqoop() {
        int retCode;
        retCode = new ImportTool().run(SqoopOptions);
        if (retCode != 0) {
            throw new RuntimeException("Sqoop execution failure. Return code : "+Integer.toString(retCode));
        }
        return retCode;
    }

    public static void main(String[] args) {

        SqoopOptions.setConnectString(connectionString);
        SqoopOptions.setUsername(username);
        SqoopOptions.setPassword(password);
        SqoopOptions.setTableName("lineitem");
        SqoopOptions.setTargetDir("/user/unifi/tmp/testsqoop/1");
        SqoopOptions.setHiveImport(true);

        executeSqoop();
    }
}
{code}

Here is the output of my program execution:

{code}
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/bin/java -agentlib:jdwp=transport=dt_socket,address=127.0.0.1:64590,suspend=y,server=n
-Dfile.encoding=UTF-8 -classpath "/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/charsets.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/deploy.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/cldrdata.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/dnsns.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/jfxrt.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/localedata.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/nashorn.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunec.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/zipfs.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/javaws.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jce.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jfr.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jfxswt.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jsse.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/management-agent.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/plugin.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/resources.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/rt.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/ant-javafx.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/dt.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/javafx-mx.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/jconsole.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/packager.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/sa-jdi.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/tools.jar:
/Users/bharath/dev/sqoopexperiments/out/production/sqoopexperiments:
/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/hadoop-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/activation-1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/asm-3.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/avro-1.7.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-collections-3.2.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-io-2.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-net-3.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-client-2.
7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-framework-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/gson-2.2.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/guava-11.0.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hadoop-annotations-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hadoop-auth-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hamcrest-core-1.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jettison-1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/local/Cellar/hadoop/2.7
.2/libexec/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsr305-3.0.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/junit-4.11.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/paranamer-2.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/xz-1.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/zookeeper-3.4.6.jar:/usr/local/Cellar/sqoop/1.4.6/libexec/sqoop-1.4.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plu
gins-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/hdfs/hadoop-hdfs-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-api-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-client-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-registry-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.2.jar:/usr/local/Cellar/sqoop/1.4.6/libexec/lib/postgresql-9.3-1102.jdbc4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-core-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-fate-1.6.0.jar:/usr/local/Cellar/hi
ve/1.2.1/libexec/lib/accumulo-start-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-trace-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/activation-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ant-1.9.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ant-launcher-1.9.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/antlr-2.7.7.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/antlr-runtime-3.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/apache-log4j-extras-1.2.17.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/asm-commons-3.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/asm-tree-3.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/avro-1.7.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/bonecp-0.8.0.RELEASE.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-avatica-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-core-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-linq4j-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-beanutils-1.7.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-beanutils-core-1.8.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-cli-1.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-codec-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-collections-3.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-compiler-2.7.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-compress-1.4.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-configuration-1.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-dbcp-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-digester-1.8.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-httpclient-3.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-io-2.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-lang-2.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-logging-1.1.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-math-2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-poo
l-1.5.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-vfs2-2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-client-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-framework-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-recipes-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-api-jdo-3.2.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-core-3.2.10.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-rdbms-3.2.9.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/derby-10.10.2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/eigenbase-properties-1.1.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geronimo-annotation_1.0_spec-1.1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geronimo-jaspic_1.0_spec-1.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geronimo-jta_1.1_spec-1.1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/groovy-all-2.1.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/guava-14.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hamcrest-core-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-accumulo-handler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-ant-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-beeline-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-cli-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-common-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-contrib-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-exec-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-hbase-handler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-hwi-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-jdbc-1.2.1-standalone.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-jdbc-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-metastore-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-serde-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-service-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-0.20S-1.2.1.jar:/usr
/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-0.23-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-common-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-scheduler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-testutils-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/httpclient-4.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/httpcore-4.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ivy-2.4.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/janino-2.7.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jcommander-1.32.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jdo-api-3.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jetty-all-7.6.0.v20120127.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jetty-all-server-7.6.0.v20120127.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jline-2.12.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/joda-time-2.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jpam-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/json-20090211.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jsr305-3.0.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jta-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/junit-4.11.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/libfb303-0.9.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/libthrift-0.9.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/log4j-1.2.16.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/mail-1.4.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-api-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-provider-svn-commons-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-provider-svnexe-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/netty-3.7.0.Final.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/opencsv-2.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/oro-2.0.8.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/paranamer-2.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/parquet-hadoop-bundle-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/pentaho-aggdesigner-algorithm-5.1.
5-jhyde.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/plexus-utils-1.5.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/postgresql-9.3-1102.jdbc4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/regexp-1.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/servlet-api-2.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/snappy-java-1.0.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ST4-4.0.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/stax-api-1.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/stringtemplate-3.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/super-csv-2.2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/tempus-fugit-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/velocity-1.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/xz-1.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/zookeeper-3.4.6.jar:/Applications/IntelliJ
IDEA.app/Contents/lib/idea_rt.jar" sqoopexperiments
Connected to the target VM, address: '127.0.0.1:64590', transport: 'socket'
2017-09-03 12:25:28,072 WARN  [main] sqoop.ConnFactory (ConnFactory.java:loadManagersFromConfDir(273))
- $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2017-09-03 12:25:28,199 INFO  [main] manager.SqlManager (SqlManager.java:initOptionDefaults(98))
- Using default fetchSize of 1000
2017-09-03 12:25:30,085 INFO  [main] tool.CodeGenTool (CodeGenTool.java:generateORM(92)) -
Beginning code generation
2017-09-03 12:25:30,198 INFO  [main] manager.SqlManager (SqlManager.java:execute(757)) - Executing
SQL statement: SELECT t.* FROM "lineitem" AS t LIMIT 1
2017-09-03 12:25:30,232 INFO  [main] orm.CompilationManager (CompilationManager.java:findHadoopJars(85))
- $HADOOP_MAPRED_HOME is not set
Note: /tmp/sqoop-bharath/compile/0df909eb6973527d155c8c591a072c5e/lineitem.java uses or overrides
a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-09-03 12:25:31,464 INFO  [main] orm.CompilationManager (CompilationManager.java:jar(330))
- Writing jar file: /tmp/sqoop-bharath/compile/0df909eb6973527d155c8c591a072c5e/lineitem.jar
2017-09-03 12:25:31,471 WARN  [main] manager.PostgresqlManager (PostgresqlManager.java:importTable(119))
- It looks like you are importing from postgresql.
2017-09-03 12:25:31,471 WARN  [main] manager.PostgresqlManager (PostgresqlManager.java:importTable(120))
- This transfer can be faster! Use the --direct
2017-09-03 12:25:31,471 WARN  [main] manager.PostgresqlManager (PostgresqlManager.java:importTable(121))
- option to exercise a postgresql-specific fast path.
2017-09-03 12:25:31,475 WARN  [main] manager.CatalogQueryManager (CatalogQueryManager.java:getPrimaryKey(239))
- The table lineitem contains a multi-column primary key. Sqoop will default to the column
l_orderkey only for this job.
2017-09-03 12:25:31,476 WARN  [main] manager.CatalogQueryManager (CatalogQueryManager.java:getPrimaryKey(239))
- The table lineitem contains a multi-column primary key. Sqoop will default to the column
l_orderkey only for this job.
2017-09-03 12:25:31,476 INFO  [main] mapreduce.ImportJobBase (ImportJobBase.java:runImport(235))
- Beginning import of lineitem
2017-09-03 12:25:31,651 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62))
- Unable to load native-hadoop library for your platform... using builtin-java classes where
applicable
2017-09-03 12:25:31,680 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173))
- mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-09-03 12:25:32,243 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173))
- mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-09-03 12:25:32,244 WARN  [main] mapreduce.JobBase (JobBase.java:cacheJars(179)) - SQOOP_HOME
is unset. May not be able to find all job dependencies.
2017-09-03 12:25:32,308 INFO  [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting
to ResourceManager at /0.0.0.0:8032
2017-09-03 12:25:32,676 WARN  [main] mapreduce.JobResourceUploader (JobResourceUploader.java:uploadFiles(64))
- Hadoop command-line option parsing not performed. Implement the Tool interface and execute
your application with ToolRunner to remedy this.
2017-09-03 12:25:32,916 INFO  [main] db.DBInputFormat (DBInputFormat.java:setTxIsolation(192))
- Using read commited transaction isolation
2017-09-03 12:25:32,917 INFO  [main] db.DataDrivenDBInputFormat (DataDrivenDBInputFormat.java:getSplits(147))
- BoundingValsQuery: SELECT MIN("l_orderkey"), MAX("l_orderkey") FROM "lineitem"
2017-09-03 12:25:32,947 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(198))
- number of splits:4
2017-09-03 12:25:33,022 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(287))
- Submitting tokens for job: job_1504463611328_0004
2017-09-03 12:25:33,291 INFO  [main] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(273))
- Submitted application application_1504463611328_0004
2017-09-03 12:25:33,334 INFO  [main] mapreduce.Job (Job.java:submit(1294)) - The url to track
the job: http://Bharaths-MacBook-Pro.local:8088/proxy/application_1504463611328_0004/
2017-09-03 12:25:33,335 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1339)) - Running
job: job_1504463611328_0004
2017-09-03 12:25:39,430 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1360)) - Job
job_1504463611328_0004 running in uber mode : false
2017-09-03 12:25:39,431 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - 
map 0% reduce 0%
2017-09-03 12:25:43,479 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - 
map 25% reduce 0%
2017-09-03 12:25:45,495 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - 
map 50% reduce 0%
2017-09-03 12:25:46,504 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - 
map 75% reduce 0%
2017-09-03 12:25:47,513 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - 
map 100% reduce 0%
2017-09-03 12:25:47,520 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1378)) - Job
job_1504463611328_0004 completed successfully
2017-09-03 12:25:47,597 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1385)) - Counters:
30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=486880
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=485
		HDFS: Number of bytes written=8508828
		HDFS: Number of read operations=16
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=8
	Job Counters 
		Launched map tasks=4
		Other local map tasks=4
		Total time spent by all maps in occupied slots (ms)=12656
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=12656
		Total vcore-milliseconds taken by all map tasks=12656
		Total megabyte-milliseconds taken by all map tasks=12959744
	Map-Reduce Framework
		Map input records=60175
		Map output records=60175
		Input split bytes=485
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=204
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=553648128
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=8508828
2017-09-03 12:25:47,602 INFO  [main] mapreduce.ImportJobBase (ImportJobBase.java:runJob(184))
- Transferred 8.1147 MB in 15.3528 seconds (541.2293 KB/sec)
2017-09-03 12:25:47,604 INFO  [main] mapreduce.ImportJobBase (ImportJobBase.java:runJob(186))
- Retrieved 60175 records.
2017-09-03 12:25:47,611 INFO  [main] manager.SqlManager (SqlManager.java:execute(757)) - Executing
SQL statement: SELECT t.* FROM "lineitem" AS t LIMIT 1
2017-09-03 12:25:47,615 WARN  [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188))
- Column l_quantity had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN  [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188))
- Column l_extendedprice had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN  [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188))
- Column l_discount had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN  [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188))
- Column l_tax had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN  [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188))
- Column l_shipdate had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN  [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188))
- Column l_commitdate had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN  [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188))
- Column l_receiptdate had to be cast to a less precise type in Hive
2017-09-03 12:25:47,637 INFO  [main] hive.HiveImport (HiveImport.java:importTable(194)) -
Loading uploaded data into Hive

Logging initialized using configuration in jar:file:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-common-1.2.1.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:file:/user/hive/warehouse/lineitem
is not a directory or unable to create one)
Disconnected from the target VM, address: '127.0.0.1:64590', transport: 'socket'
Exception in thread "main" java.lang.RuntimeException: Sqoop execution failure. Return code
: 1
	at sqoopexperiments.executeSqoop(sqoopexperiments.java:30)
	at sqoopexperiments.main(sqoopexperiments.java:44)

Process finished with exit code 1
{code}

Notice that everything works perfectly fine until the "Loading uploaded data into Hive" stage.
Stepping through the call stack with a debugger, I found the reason for the failure. The
problem is in the "executeScript" method of org/apache/sqoop/hive/HiveImport.class. When control
reaches this point, the temporary Hive script contains exactly the expected contents:

{code}
CREATE TABLE IF NOT EXISTS `lineitem` ( `l_orderkey` INT, `l_partkey` INT, `l_suppkey` INT,
`l_linenumber` INT, `l_quantity` DOUBLE, `l_extendedprice` DOUBLE, `l_discount` DOUBLE, `l_tax`
DOUBLE, `l_returnflag` STRING, `l_linestatus` STRING, `l_shipdate` STRING, `l_commitdate`
STRING, `l_receiptdate` STRING, `l_shipinstruct` STRING, `l_shipmode` STRING, `l_comment`
STRING) COMMENT 'Imported by sqoop on 2017/09/03 12:25:47' ROW FORMAT DELIMITED FIELDS TERMINATED
BY '\054' LINES TERMINATED BY '\012' STORED AS TEXTFILE;
LOAD DATA INPATH 'hdfs://localhost:9000/user/unifi/tmp/testsqoop/1' INTO TABLE `lineitem`;
{code}

This block of code in the executeScript method of HiveImport.class (decompiled) determines
how the temporary Hive script is executed:

{code}
try {
    Class ite = Class.forName("org.apache.hadoop.hive.cli.CliDriver");
    LOG.debug("Using in-process Hive instance.");
    subprocessSM = new SubprocessSecurityManager();
    subprocessSM.install();
    String[] cause1 = new String[]{"-f", filename};
    Method ese1 = ite.getMethod("main", new Class[]{cause1.getClass()});
    ese1.invoke((Object)null, new Object[]{cause1});
} catch (ClassNotFoundException var14) {
    LOG.debug("Using external Hive process.");
    this.executeExternalHiveScript(filename, env);
}
{code}
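
The dispatch logic above can be sketched in isolation: try to reflectively invoke an in-process main(String[]) and fall back to an external process when the class isn't on the classpath (the class and file names below are placeholders, not Hive's):

```java
import java.lang.reflect.Method;

public class DispatchSketch {
    // Returns which execution path would be taken for the given driver class name.
    static String dispatch(String driverClassName, String scriptFile) throws Exception {
        try {
            Class<?> cli = Class.forName(driverClassName);
            Method main = cli.getMethod("main", String[].class);
            // In-process: invoke the driver's static main with "-f <script>".
            main.invoke(null, (Object) new String[]{"-f", scriptFile});
            return "in-process";
        } catch (ClassNotFoundException e) {
            // Class absent from the classpath: fall back to an external process.
            return "external";
        }
    }

    public static void main(String[] args) throws Exception {
        // No such class on the classpath, so the external path is chosen.
        System.out.println(dispatch("no.such.CliDriver", "script.q"));
    }
}
```

This is why merely having hive-cli on the classpath flips the behavior: the branch is chosen by Class.forName alone, not by any configuration.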

If the Hive CLI driver jars are on my classpath, the program invokes the line
"ese1.invoke((Object)null, new Object[]{cause1});". After this point it loses all the
Hadoop configuration context (hdfs-site, yarn-site, mapred-site, hive-site configs) it carried
until this far and falls back to a default configuration, because of this code in /usr/local/Cellar/hive/1.2.1/libexec/lib/hive-cli-1.2.1.jar!/org/apache/hadoop/hive/cli/CliDriver.class:

{code}
public CliDriver() {
    SessionState ss = SessionState.get();
    this.conf = (Configuration)(ss != null ? ss.getConf() : new Configuration());
    Log LOG = LogFactory.getLog("CliDriver");
    if (LOG.isDebugEnabled()) {
        LOG.debug("CliDriver inited with classpath " + System.getProperty("java.class.path"));
    }

    this.console = new LogHelper(LOG);
}
{code}

I don't fully understand what this "SessionState" is or why it is null here. Because it is
null, a new Configuration() is created, all my Hadoop configuration is lost, and Hive
looks for the "/user/hive/warehouse/lineitem" directory on my local filesystem instead
of HDFS.

If I remove "hive-cli-1.2.1.jar" from my classpath, HiveImport falls back to executing the
Hive script with the hive binary on my system, and in that mode the Hive table gets created
properly:

{code}
private void executeExternalHiveScript(String filename, List<String> env) throws IOException {
    String hiveExec = this.getHiveBinPath();
    ArrayList args = new ArrayList();
    args.add(hiveExec);
    args.add("-f");
    args.add(filename);
    LoggingAsyncSink logSink = new LoggingAsyncSink(LOG);
    int ret = Executor.exec((String[])args.toArray(new String[0]),
            (String[])env.toArray(new String[0]), logSink, logSink);
    if (0 != ret) {
        throw new IOException("Hive exited with status " + ret);
    }
}
{code}
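
That external path boils down to forking a child process and checking its exit status; a self-contained stdlib sketch with ProcessBuilder (the real command would be "hive -f <script>", substituted here with a harmless stand-in, and Sqoop's Executor.exec wraps the same idea):

```java
import java.io.IOException;
import java.util.List;

public class ExternalExecSketch {
    // Runs a command and returns its exit status, mirroring Executor.exec's contract.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for: hive -f /tmp/sqoop-.../script.q
        int ret = run(List.of("true"));
        if (ret != 0) {
            throw new IOException("Hive exited with status " + ret);
        }
    }
}
```

Presumably the external hive binary reads its own conf directories on startup, which would explain why this path sees the correct filesystem while the in-process path does not.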

My intention is to run the Sqoop import into Hive from Java by prepackaging all the necessary
Hadoop jars, without requiring the hadoop binaries (hdfs, mapred, hive, etc.) to be present on
the system. Stepping through with the debugger, it appears to me that there is some bug that
causes all the Hadoop configs to be lost when Sqoop reaches the Hive execution stage and runs
it via org.apache.hadoop.hive.cli.CliDriver.

I am hoping that somebody has attempted Hive imports through the Sqoop API in this fashion and
has figured out a solution or a workaround.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
