sqoop-user mailing list archives

From "Munjuluri, Shyam" <Munjulu...@AETNA.com>
Subject RE: Sqoop import to Hive - batch restart guideline
Date Mon, 28 Jul 2014 14:53:20 GMT
I use cron to schedule data extracts. It is a very simple mechanism for scheduling jobs on
Linux. Refer to any good online documentation on Linux cron for details.

For incrementals, I have a shell script that loops through each of the tables (table names
are listed in a text file along with LAST_UPDATE_DTS). If a particular table extract fails, its
LAST_UPDATE_DTS remains as is, so that the next time the job runs it picks up from where
it left off. For all the successful extracts, LAST_UPDATE_DTS is advanced so that subsequent
extracts do not pick up old data.
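
The loop described above could be sketched roughly as follows. This is only an illustration, not the actual script: the control-file format (one "TABLE_NAME|LAST_UPDATE_DTS" pair per line), the function names extract_table and run_incrementals, and the choice to advance the timestamp to the time the run started are all assumptions. The body of extract_table is a placeholder for the real Sqoop call.

```shell
#!/bin/sh
# Sketch of a restartable incremental extract loop.
# Assumed control-file format (hypothetical): TABLE_NAME|LAST_UPDATE_DTS, one per line.

# Hypothetical extract step: $1 is the table name, $2 its LAST_UPDATE_DTS.
# Replace the body with the real Sqoop invocation; it only has to return
# non-zero when the extract fails.
extract_table() {
    echo "extracting $1 for rows changed after $2"
}

# run_incrementals CONTROL_FILE RUN_STARTED
# Rewrites the control file so that only successful tables get the new
# timestamp. Returns non-zero if any table failed.
run_incrementals() {
    control_file=$1
    run_started=$2            # timestamp captured before the extracts begin
    tmp_file="$control_file.tmp"
    failures=0

    : > "$tmp_file"
    while IFS='|' read -r table last_dts; do
        [ -z "$table" ] && continue
        # </dev/null keeps the extract from consuming the control file on stdin.
        if extract_table "$table" "$last_dts" </dev/null; then
            echo "$table|$run_started" >> "$tmp_file"
        else
            # Keep the old timestamp so the next run retries from the same point.
            echo "$table|$last_dts" >> "$tmp_file"
            failures=$((failures + 1))
        fi
    done < "$control_file"

    mv "$tmp_file" "$control_file"
    [ "$failures" -eq 0 ]
}
```

A cron-driven wrapper might then call something like run_incrementals tables.txt "$(date '+%Y-%m-%d %H:%M:%S')", where tables.txt is a hypothetical file name. Because a failed table keeps its old timestamp, a rerun simply picks that table up again; nothing else is needed to restart the batch.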

You can try Oozie, but it has to be one job for each table extract and you have to code it
in XML. I felt that it is a bit of overkill, especially when there are not a lot of dependencies
to be set between jobs. It may be of better use in your case, especially if Oozie allows us
to retry multiple times in case of a failure.
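
If retries are the main attraction, a similar effect can be approximated in the shell script itself. The helper below is a hypothetical sketch (the name retry and its argument order are made up for illustration, and it is not an Oozie feature): it reruns a command up to a fixed number of times and reports failure only if every attempt fails.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed: $*" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

Wrapping each table's extract command in retry 3 would give every table three attempts before it is marked as failed for that run. A pause between attempts (for example with sleep) could be added inside the loop if the failures tend to be transient.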

Hope this helps.


Shyam Munjuluri
Integrated Systems Engineering
Office: (M, W, F) 860-273-0595
WAH:  (Tu, Th)  860-404-5219

From: Sethuramaswamy, Suresh [mailto:suresh.sethuramaswamy@credit-suisse.com]
Sent: Monday, July 28, 2014 10:32 AM
To: user@sqoop.apache.org
Subject: Sqoop import to Hive - batch restart guideline


We want to schedule daily incremental import jobs to Hive tables using Sqoop, reading
data from Oracle.

More than 40 tables are involved in the refresh. I'm looking for guidelines or a best-practice
implementation for such cases, e.g. how to design the restart mechanism if the 11th table's
refresh fails in offline batch mode.

