sqoop-dev mailing list archives

From "Sonya Ling (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-931) Integrate HCatalog with Sqoop
Date Tue, 20 Aug 2013 01:30:51 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744599#comment-13744599 ]

Sonya Ling commented on SQOOP-931:
----------------------------------

I got Hadoop 2.0.0-cdh4.3.0 working with sqoop-1.4.4 (HCatalog integration).  It populates records
with dynamic partitions beautifully.

Building HCatalog yourself against the Hadoop 2.0 artifacts is not a good idea, because you may
hit other errors such as 'Caused by: java.lang.ClassNotFoundException: org.apache.hcatalog.shims.HCatHadoopShims23'
(not to mention that you have to tweak Maven to get the build to succeed).  Instead,
you should get hcatalog-cdh4 (the same version as your Hadoop, Hive, etc.).  That ensures everything
is compatible.

You need to create your partitioned table manually, in either hcat or hive, beforehand, since
--create-hcatalog-table won't create a partitioned table for you.  Then run a sqoop command
like the following example:

sqoop import --connect jdbc:mysql://<host>/<database> --username <user> \
  --password <password> --table <sql-table> --where "<where clause>" --split-by \
  <split-field> --hcatalog-database <hcat-database> --hcatalog-table <hcat-table>
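
The partitioned table has to exist before the import above is run. A minimal sketch of
creating it from the shell follows. The database, table, and column names and the RCFILE
storage format are all hypothetical, and the partition column (order_date here) is assumed
to also be a column of the source SQL table. The statement is only printed unless RUN_HIVE=1
is set, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical names: replace with your own database and table.
HCAT_DB="hcat_db"
HCAT_TABLE="orders"

# DDL for a table partitioned by a hypothetical STRING column, order_date.
# The column list and storage format are assumptions for illustration only.
DDL="CREATE TABLE IF NOT EXISTS ${HCAT_DB}.${HCAT_TABLE} (
  id INT,
  amount DOUBLE
)
PARTITIONED BY (order_date STRING)
STORED AS RCFILE;"

# Print the statement so it can be reviewed.
printf '%s\n' "$DDL"

# Execute it through the hive client only when explicitly requested.
if [ "${RUN_HIVE:-0}" = "1" ]; then
  hive -e "$DDL"
fi
```

With the table in place, the sqoop import can then target it through --hcatalog-database
and --hcatalog-table.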

The important thing is NOT to pass --hive-partition-key.  That option is for static partitioning,
as the documentation states clearly.  Without it, dynamic partitions work like a charm.
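
One way to avoid passing a static-partition option by accident is to check the argument
list before calling sqoop. The helper below is only a sketch; the function name is made up
for illustration and is not part of Sqoop. It treats --hive-partition-key and its companion
--hive-partition-value as the static-partition options.

```shell
# Hypothetical helper: succeeds (returns 0) only if no static-partition
# option appears among the arguments intended for "sqoop import".
is_dynamic_partition_import() {
  for arg in "$@"; do
    case "$arg" in
      --hive-partition-key|--hive-partition-value)
        return 1
        ;;
    esac
  done
  return 0
}
```

In a wrapper script this could guard the call, for example:
is_dynamic_partition_import "$@" && sqoop import "$@"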

Thanks for all the help.
Cheers.

 
                
> Integrate HCatalog with Sqoop
> -----------------------------
>
>                 Key: SQOOP-931
>                 URL: https://issues.apache.org/jira/browse/SQOOP-931
>             Project: Sqoop
>          Issue Type: New Feature
>    Affects Versions: 1.4.2, 1.4.3
>         Environment: All 1.x sqoop version
>            Reporter: Venkat Ranganathan
>            Assignee: Venkat Ranganathan
>             Fix For: 1.4.4
>
>         Attachments: SQOOP-931.patch, SQOOP-931.patch.14, SQOOP HCatalog Integration
- 2.pdf, SQOOP HCatalog Integration - 3.pdf, SQOOP HCatalog Integration.pdf
>
>
> Apache HCatalog is a table and storage management service that provides a shared schema,
> data types and table abstraction freeing users from being concerned about where or how their
> data is stored.  It provides interoperability across Pig, MapReduce, and Hive.
> A sqoop hcatalog connector will help in supporting storage formats that are abstracted
> by HCatalog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
