sqoop-dev mailing list archives

From "Sonya Ling (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-931) Integrate HCatalog with Sqoop
Date Tue, 20 Aug 2013 01:30:51 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744599#comment-13744599 ]

Sonya Ling commented on SQOOP-931:

I got Hadoop 2.0.0-cdh4.3.0 working with sqoop-1.4.4 (HCatalog integration).  It populates records
with dynamic partitions beautifully.

Building hcatalog against the Hadoop 2.0 artifacts yourself is not a good idea, because you may run
into other errors such as 'Caused by: java.lang.ClassNotFoundException: org.apache.hcatalog.shims.HCatHadoopShims23'
(not to mention that you need to tweak the maven setup to get the build to succeed).  Instead,
you should get hcatalog-cdh4 (the same version as your Hadoop, Hive etc.).  That ensures everything
is compatible.

You need to manually create your partitioned table in either hcat or hive beforehand, since
--create-hcatalog-table won't create a partitioned table for you.  Then, execute a sqoop command
like the following example:

sqoop import --connect jdbc:mysql://<host>/<database> --username <user> \
  --password <password> --table <sql-table> --where "<where clause>" \
  --split-by <split-field> \
  --hcatalog-database <hcat-database> --hcatalog-table <hcat-table>
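
For the table-creation step mentioned above, here is a rough sketch of how the partitioned table
could be created through the hive command line before running the import.  The table name (orders),
the columns, the partition column (order_date) and the RCFILE storage format are all made up for
illustration and are not from my actual setup; it also assumes the source table has a matching
order_date column.  The script only prints the DDL unless you set RUN_DDL=1 and hive is on the PATH.

```shell
#!/bin/sh
# Hypothetical names: "orders", its columns, and "order_date" are placeholders.
HCAT_TABLE="orders"

# The partition column goes in PARTITIONED BY, not in the regular column list.
DDL="CREATE TABLE IF NOT EXISTS ${HCAT_TABLE} (
  id INT,
  amount DOUBLE
)
PARTITIONED BY (order_date STRING)
STORED AS RCFILE;"

# Always show the statement so it can be reviewed first.
echo "$DDL"

# Execute only when explicitly requested and the hive CLI is available.
if [ "${RUN_DDL:-0}" = "1" ] && command -v hive >/dev/null 2>&1; then
  hive -e "$DDL"
fi
```

Once the table exists, the import command above can point --hcatalog-table at it.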

The important thing is NOT to pass --hive-partition-key.  That option is for static partitions, as
stated clearly in the documentation.  Without it, dynamic partitions work like a charm.
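
To make the difference concrete, here is a small sketch that builds both forms of the command as
strings and prints them without running anything.  The host, database, user, table and partition
values are placeholders I made up; the only difference between the two is the
--hive-partition-key / --hive-partition-value pair, which pins every imported row to one partition.

```shell
#!/bin/sh
# All connection details and names below are hypothetical placeholders.
BASE_CMD="sqoop import --connect jdbc:mysql://dbhost/sales --username etl -P --table orders --hcatalog-table orders"

# Dynamic partitioning: no partition options; partition values come from the data.
DYNAMIC_CMD="$BASE_CMD"

# Static partitioning: every imported row goes into the single named partition.
PARTITION_KEY="order_date"
PARTITION_VALUE="2013-08-19"
STATIC_CMD="$BASE_CMD --hive-partition-key ${PARTITION_KEY} --hive-partition-value ${PARTITION_VALUE}"

echo "dynamic: $DYNAMIC_CMD"
echo "static:  $STATIC_CMD"
```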

Thanks for all the help.

> Integrate HCatalog with Sqoop
> -----------------------------
>                 Key: SQOOP-931
>                 URL: https://issues.apache.org/jira/browse/SQOOP-931
>             Project: Sqoop
>          Issue Type: New Feature
>    Affects Versions: 1.4.2, 1.4.3
>         Environment: All 1.x sqoop version
>            Reporter: Venkat Ranganathan
>            Assignee: Venkat Ranganathan
>             Fix For: 1.4.4
>         Attachments: SQOOP-931.patch, SQOOP-931.patch.14, SQOOP HCatalog Integration - 2.pdf, SQOOP HCatalog Integration - 3.pdf, SQOOP HCatalog Integration.pdf
>  Apache HCatalog is a table and storage management service that provides a shared schema, data types and table abstraction freeing users from being concerned about where or how their data is stored.  It provides interoperability across Pig, Map Reduce, and Hive.
> A sqoop hcatalog connector will help in supporting storage formats that are abstracted by HCatalog.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
