sqoop-user mailing list archives

From Raj Hadoop <hadoop...@yahoo.com>
Subject Oracle to HDFS through Sqoop and a Hive External Table
Date Sun, 03 Nov 2013 15:39:55 GMT

I am sending this to the Hadoop, Hive, and Sqoop dist-lists, as this question is closely
related to all three areas.

I have this requirement.

I have a big table in Oracle (about 60 million rows, primary key Customer Id). I want to
bring this table into HDFS and then create
a Hive external table on top of it. My requirement is to run queries on this Hive table (at this time
I do not know what queries I would be running).

Is the following a good design for the above problem? What are the pros and cons of this approach?

1) Load the table into HDFS using Sqoop, split into 100 folders (divide the Customer Id range into 100 buckets).
2) Create a Hive external partitioned table based on the above 100 HDFS directories.
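For what it's worth, the two steps above could be sketched roughly as follows. This is only an illustration, not a tested recipe: the host, credentials, column names (CUSTOMER_ID, NAME), paths, and the assumption that Customer Ids are spread fairly evenly over 1..60,000,000 are all placeholders you would replace with your own values.

```shell
#!/bin/bash
# Step 1: import each Customer Id range into its own HDFS directory.
# Assumes ~60M ids spread over 1..60,000,000, split into 100 buckets.
ROWS_PER_BUCKET=600000

for b in $(seq 0 99); do
  LO=$(( b * ROWS_PER_BUCKET + 1 ))
  HI=$(( (b + 1) * ROWS_PER_BUCKET ))
  sqoop import \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username scott --password-file /user/me/ora.pwd \
    --table CUSTOMERS \
    --where "CUSTOMER_ID BETWEEN ${LO} AND ${HI}" \
    --split-by CUSTOMER_ID \
    --target-dir /data/customers/bucket=${b} \
    --fields-terminated-by ','
done

# Step 2: create the external table and register each directory as a partition.
hive -e "
CREATE EXTERNAL TABLE customers (
  customer_id BIGINT,
  name STRING
)
PARTITIONED BY (bucket INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/customers';
"

for b in $(seq 0 99); do
  hive -e "ALTER TABLE customers ADD PARTITION (bucket=${b}) LOCATION '/data/customers/bucket=${b}';"
done
```

One thing to weigh: this only helps queries that filter on the bucket column; if your (still unknown) queries do not filter by Customer Id range, Hive will scan all 100 partitions anyway, so a single Sqoop import with --num-mappers may be simpler.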
