sqoop-user mailing list archives

From "Munjuluri, Shyam" <Munjulu...@AETNA.com>
Subject RE: Sqoop import to Hive - batch restart guideline
Date Mon, 28 Jul 2014 18:35:41 GMT
You can check the status of the Sqoop run immediately after the sqoop import statement by doing
the following:

sqoop_result=$?

if [ $sqoop_result -eq 0 ]; then
    echo "sqoop import successful"
else
    echo "sqoop import failed"
fi
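For instance, continuing from that check, you could gate the timestamp update on the same exit
code. This is only a rough sketch; the control file name, its TABLE_NAME,LAST_UPDATE_DTS layout,
and the TABLE variable are placeholders I am assuming, not your actual script:

NEW_DTS=$(date '+%Y-%m-%d %H:%M:%S')   # in practice, capture this just before the import starts

if [ $sqoop_result -eq 0 ]; then
    # Only on success: advance this table's LAST_UPDATE_DTS in the control file.
    # Assumes one line per table of the form TABLE_NAME,LAST_UPDATE_DTS (GNU sed in-place edit).
    sed -i "s|^${TABLE},.*|${TABLE},${NEW_DTS}|" tables.txt
else
    echo "import of ${TABLE} failed; LAST_UPDATE_DTS left unchanged" >&2
fi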

Hope that helps.

Shyam Munjuluri
Integrated Systems Engineering
Office: (M, W, F) 860-273-0595
WAH:  (Tu, Th)  860-404-5219

From: Sethuramaswamy, Suresh [mailto:suresh.sethuramaswamy@credit-suisse.com]
Sent: Monday, July 28, 2014 11:11 AM
To: user@sqoop.apache.org
Subject: RE: Sqoop import to Hive - batch restart guideline

Thanks Shyam,

How do you ensure that LAST_UPDATE_DTS in the text file is updated only when the sqoop import
completes successfully?

Suresh

From: Munjuluri, Shyam [mailto:MunjuluriS@AETNA.com]
Sent: Monday, July 28, 2014 10:53 AM
To: user@sqoop.apache.org
Subject: RE: Sqoop import to Hive - batch restart guideline

I use "Cron" to schedule data extracts. It is a very simple mechanism to schedule jobs via
Linux. Refer any online good documentation on LINIX / CRON for details.
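For instance, a nightly crontab entry would look something like the following (the script path,
log location, and 2 AM schedule are only illustrative):

0 2 * * * /home/hadoop/scripts/sqoop_incremental.sh >> "$HOME/sqoop_incremental.log" 2>&1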

For incrementals, I have a shell script that loops through each of the tables (table names
listed in a text file along with LAST_UPDATE_DTS). If a particular table extract fails, its
LAST_UPDATE_DTS remains as is, so the next time the job runs it picks up from where it left
off. For all the successful extracts, LAST_UPDATE_DTS is advanced so that subsequent extracts
do not pick up old data. A sketch of the loop is shown below.
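This is only a minimal sketch of that loop, not the actual script: it assumes a control file of
TABLE_NAME,LAST_UPDATE_DTS lines and purely illustrative connection settings (JDBC_URL, DB_USER,
PW_FILE), and it passes the timestamp through a --where predicate against Oracle.

#!/bin/bash
# tables.txt (assumed format): TABLE_NAME,LAST_UPDATE_DTS
# Work from a snapshot so the in-place updates below do not disturb the read loop.
cp tables.txt tables.txt.run

while IFS=',' read -r table last_dts; do
    new_dts=$(date '+%Y-%m-%d %H:%M:%S')   # captured before the extract starts

    # Extract only rows newer than this table's last successful run.
    # Redirect stdin from /dev/null so the child process cannot consume the loop's input.
    sqoop import \
        --connect "$JDBC_URL" \
        --username "$DB_USER" --password-file "$PW_FILE" \
        --table "$table" \
        --where "LAST_UPDATE_DTS > TO_DATE('${last_dts}', 'YYYY-MM-DD HH24:MI:SS')" \
        --hive-import --hive-table "$table" < /dev/null

    if [ $? -eq 0 ]; then
        # Success: advance this table's LAST_UPDATE_DTS in the control file (GNU sed).
        sed -i "s|^${table},.*|${table},${new_dts}|" tables.txt
    else
        # Failure: leave LAST_UPDATE_DTS alone so the next run re-extracts this window.
        echo "extract of ${table} failed; will retry from ${last_dts} next run" >&2
    fi
done < tables.txt.run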

You can try "Oozie' but it has to be 1 job for each table extract and you have to code it
in XML. I felt that it is little bit of overkill especially when there are not a lot of dependencies
to be set between jobs. In may be of better use in your case especially if Oozie allows us
to try multiple times in case of a failure.

Hope this helps.

Thanks,


Shyam Munjuluri
Integrated Systems Engineering
Office: (M, W, F) 860-273-0595
WAH:  (Tu, Th)  860-404-5219

From: Sethuramaswamy, Suresh [mailto:suresh.sethuramaswamy@credit-suisse.com]
Sent: Monday, July 28, 2014 10:32 AM
To: user@sqoop.apache.org
Subject: Sqoop import to Hive - batch restart guideline

Experts,

We want to schedule daily incremental import jobs into Hive tables using Sqoop, reading
data from Oracle.

40+ tables are involved in the refresh. I'm looking for guidelines or a best-practice
implementation for such cases, e.g. how to design the restart mechanism if the 11th table
refresh fails in an offline batch run.

Regards,
Suresh
