sqoop-dev mailing list archives

From Ankur Shanbhag <ankur_shanb...@persistent.co.in>
Subject Support for multiple partitions on HDFS using single Sqoop import
Date Fri, 21 Feb 2014 13:36:18 GMT
Hello,

We can import data into a Hive partition using Sqoop's hive-import by specifying values for
--hive-partition-key and --hive-partition-value. However, a single Sqoop import command creates
only one partition at a time.

Is there any way to specify multiple partition values, or a range of values, in one
Sqoop job?

Sample command to import data into a partition using Sqoop:
sqoop import --connect jdbc:oracle:thin:@//ps8606:1521/ORCL \
  --query "select ROLL_NO,NAME,DOB from QAUSER.STUDENT_DEMO where DOB ='1991-08-21' and \$CONDITIONS" \
  --target-dir "QAUSER.STUDENT_DEMO" \
  --hive-import --hive-overwrite --hive-table "QAUSER.STUDENT" \
  --hive-partition-key "DOB" --hive-partition-value "1991-08-21" \
  --split-by ROLL_NO --username QAUSER --password qauser

The above command creates a partition that contains all records of students whose DOB
is 21-Aug-1991. For this partition, a sub-directory named "DOB=1991-08-21" is created
inside the student directory.

Can the above Sqoop command be modified to create multiple partitions on HDFS, one for
each distinct 'DOB' value?
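
For reference, the only workaround I can think of is to wrap the command in a shell loop
that runs one import per partition value. This is a rough sketch, not a built-in Sqoop
feature. The list of DOB values, the function names, the EXECUTE switch and the per-value
--target-dir suffix are all my own assumptions; the remaining options are copied from the
command above. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: one Sqoop import per partition value (workaround, not a Sqoop feature).
# Hypothetical list of partition values; replace with the real DOB values.
DOB_VALUES="1991-08-21 1991-08-22 1991-08-23"

# Build the import command for one DOB value.
# Runs it only when EXECUTE=1; otherwise prints it (shell quoting is not shown).
import_partition() {
  dob=$1
  set -- sqoop import \
    --connect "jdbc:oracle:thin:@//ps8606:1521/ORCL" \
    --query "select ROLL_NO,NAME,DOB from QAUSER.STUDENT_DEMO where DOB = '${dob}' and \$CONDITIONS" \
    --target-dir "QAUSER.STUDENT_DEMO_${dob}" \
    --hive-import --hive-overwrite \
    --hive-table "QAUSER.STUDENT" \
    --hive-partition-key "DOB" \
    --hive-partition-value "${dob}" \
    --split-by ROLL_NO \
    --username QAUSER --password qauser
  if [ "${EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Import every value in DOB_VALUES, stopping at the first failure.
import_all() {
  for dob_value in $DOB_VALUES; do
    import_partition "$dob_value" || {
      echo "import failed for DOB=${dob_value}" >&2
      return 1
    }
  done
}

import_all
```

Running the script with EXECUTE=1 would launch the imports one after another, so this
still costs one Sqoop job per partition rather than a single job for all of them.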

Thanks and Regards,
Ankur Shanbhag
