sqoop-user mailing list archives

From "Vikash Talanki -X (vtalanki - INFOSYS LIMITED at Cisco)" <vtala...@cisco.com>
Subject RE: How to specify --target-dir for sqoop incremental imports into hive
Date Mon, 09 Jun 2014 16:27:34 GMT
Hi Jarcec,

I am trying to import incremental data from Oracle into Hive, not just HDFS. Although I have not
specified the --hive-import parameter in the command below, I get the same issue even when I do use it.
The reason I provided --target-dir is that the user I am running the sqoop command as is a sudo user
and has no permission to write or create anything in its home directory (which Sqoop uses as the
default directory for imported data).
So, please let me know how this works when loading incremental data into Hive.
Do we need to provide --target-dir? If so, what value should it be? The Hive warehouse location?
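
For reference, here is a simplified sketch of what I am trying to run (connection details masked;
the table, check column and target directory are the same ones from the full command quoted below):

# append-mode incremental import, writing straight into the table's HDFS directory
sqoop import \
  --connect 'jdbc:oracle:thin:@//<db-host>:1530/<service-name>' \
  --username XXXXX --password XXXXXX -m 1 \
  --table XXCSS_KTN_REQ_LINE_DETAIL \
  --target-dir /app/SmartAnalytics/Apps/frameworks_dataingestion.db/xxcss_ktn_req_line_detail_vtest \
  --check-column LID_DATE --incremental append --last-value '2014-05-27 10:38:17.0'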


Thanks,
Vikash Talanki
+1 (408)838-4078

-----Original Message-----
From: Jarek Jarcec Cecho [mailto:jarcec@apache.org] 
Sent: Monday, June 09, 2014 7:38 AM
To: user@sqoop.apache.org
Subject: Re: How to specify --target-dir for sqoop incremental imports into hive

Did you actually try to let Sqoop finish its job?

I believe that you are observing valid behaviour: MapReduce won't allow you to import data into an
existing directory, so Sqoop first imports the data into a temporary directory and then moves it to
the final destination specified with the --target-dir argument.
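
For example, once the job finishes, the imported files should show up under the directory you passed
to --target-dir rather than under the temporary _sqoop directory; a quick listing (using the path
from your command) should confirm that:

# list the final --target-dir after the import completes
hadoop fs -ls /app/SmartAnalytics/Apps/frameworks_dataingestion.db/xxcss_ktn_req_line_detail_vtest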

Jarcec

On Mon, Jun 09, 2014 at 05:50:55AM +0000, Vikash Talanki -X (vtalanki - INFOSYS LIMITED at Cisco) wrote:
> Hi All,
> 
> 
> I want to use existing sqoop incremental parameters to load data from oracle to hive.
> 
> Here is the sqoop command :
> 
> sqoop import -D mapred.child.java.opts='\-Djava.security.egd=file:/dev/../dev/urandom' \
>   --connect 'jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(FAILOVER=on)(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST=XXXXXX)(PORT=1530))(ADDRESS=(PROTOCOL=TCP)(HOST=XXXXX)(PORT=1530)))(CONNECT_DATA=(SERVER=dedicated)(SERVICE_NAME=XXXXXX)))' \
>   --username XXXXX --password XXXXXX -m 1 \
>   --table XXCSS_KTN_REQ_LINE_DETAIL \
>   --target-dir /app/SmartAnalytics/Apps/frameworks_dataingestion.db/xxcss_ktn_req_line_detail_vtest \
>   --hive-table frameworks_dataingestion.XXCSS_KTN_REQ_LINE_DETAIL_vtest \
>   --map-column-hive LINE_ITEM_ID=BIGINT,LIST_PRICE=BIGINT,SERVICE_VALUE=BIGINT \
>   --null-string '\\N' --null-non-string '\\N' --hive-delims-replacement ' ' \
>   --check-column LID_DATE --incremental append --last-value '2014-05-27 10:38:17.0'
> 
> Even when I specify my existing table's HDFS location as the target directory, it is still
> creating a different output directory:
> 14/06/08 21:28:52 INFO mapred.JobClient: Creating job's output directory at _sqoop/08212846713XXCSS_KTN_REQ_LINE_DETAIL
> 
> Why is this happening? What needs to be provided for --target-dir?
> Thanks in advance.
> 
> 
> Vikash Talanki
> Engineer - Software
> vtalanki@cisco.com
> Phone: +1 (408)838 4078
> 
> Cisco Systems Limited
> SJ-J 3
> 255 W Tasman Dr
> San Jose
> CA - 95134
> United States
> Cisco.com<http://www.cisco.com/>
> 
> 
> 
> 
> 
> Think before you print.
> 
> This email may contain confidential and privileged material for the sole use of the intended
> recipient. Any review, use, distribution or disclosure by others is strictly prohibited. If
> you are not the intended recipient (or authorized to receive for the recipient), please contact
> the sender by reply email and delete all copies of this message.
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/index.html
> 
> 
> 



