sqoop-user mailing list archives

From Jarek Jarcec Cecho <jar...@apache.org>
Subject Re: Calling Sqoop incremental job from Map/Reduce code
Date Tue, 12 Feb 2013 15:44:09 GMT
Hi Artem,
would you mind describing your use case in more detail? I'm especially interested in
what you mean by executing Sqoop from a map/reduce program.

Sqoop itself spawns a mapreduce job, so executing it from another map/reduce job does not
make much sense, because the load multiplies. Imagine 50 mappers where each spawns a
Sqoop job that in turn spawns 50 mappers: that's 50 * 50 = 2500 running map tasks, which
would most likely kill your remote database. It is therefore probably more appropriate to
execute Sqoop before running your mapreduce job, as you've mentioned you're already doing.
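
To make the arithmetic and the "Sqoop first, then map/reduce" ordering concrete, here is a
minimal sketch of a driver in plain Java. It assumes the incremental import was saved as a
Sqoop job and is started with `sqoop job --exec`; the job name `incremental_import`, the jar
`analysis.jar` and the class `com.example.AnalysisDriver` are hypothetical placeholders, not
names from this thread. By default it only prints what it would do; pass `--run` to execute.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class SqoopThenMapReduce {

    // Worst-case number of map tasks if every outer mapper starts its own Sqoop job.
    static long nestedMapTasks(int outerMappers, int mappersPerSqoopJob) {
        return (long) outerMappers * mappersPerSqoopJob;
    }

    // Command line that executes a saved Sqoop job; the job name comes from the caller.
    static List<String> sqoopExecCommand(String savedJobName) {
        return Arrays.asList("sqoop", "job", "--exec", savedJobName);
    }

    // Runs one command, forwards its output to this console, and returns its exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    // Runs the import first and starts the map/reduce job only if the import succeeded.
    static int runPipeline(List<String> sqoopCommand, List<String> mapReduceCommand)
            throws IOException, InterruptedException {
        int sqoopExit = run(sqoopCommand);
        if (sqoopExit != 0) {
            return sqoopExit; // do not process stale or partial data
        }
        return run(mapReduceCommand);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("50 mappers x 50 Sqoop mappers = "
                + nestedMapTasks(50, 50) + " map tasks");

        // Hypothetical names, for illustration only.
        List<String> sqoop = sqoopExecCommand("incremental_import");
        List<String> mapReduce =
                Arrays.asList("hadoop", "jar", "analysis.jar", "com.example.AnalysisDriver");

        if (args.length > 0 && args[0].equals("--run")) {
            System.exit(runPipeline(sqoop, mapReduce));
        }
        // Dry run: show the two commands in the order they would be executed.
        System.out.println(String.join(" ", sqoop));
        System.out.println(String.join(" ", mapReduce));
    }
}
```

This is the same ordering as your existing script, just with an explicit check that the
import finished successfully before the map/reduce job starts.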

As for your question whether Sqoop needs to be installed on each node: it does not. Hadoop
provides a facility called DistributedCache [1] that allows you to distribute arbitrary files
with your job. A benefit is that jars distributed this way can be added to the task classpath
automatically.
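
One common way to have Hadoop ship extra jars through the distributed cache is the
`-libjars` generic option of `hadoop jar`. The sketch below only assembles such a command
line; it assumes that your driver parses generic options (for example via `ToolRunner`),
otherwise `-libjars` is ignored. All jar paths, the class name and the input/output
directories are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LibJarsCommand {

    // Builds: hadoop jar <appJar> <mainClass> [-libjars a.jar,b.jar] <appArgs...>
    // Generic options such as -libjars go after the main class and before the
    // application's own arguments.
    static List<String> build(String appJar, String mainClass,
                              List<String> extraJars, List<String> appArgs) {
        List<String> command =
                new ArrayList<>(Arrays.asList("hadoop", "jar", appJar, mainClass));
        if (!extraJars.isEmpty()) {
            command.add("-libjars");
            command.add(String.join(",", extraJars)); // comma-separated, no spaces
        }
        command.addAll(appArgs);
        return command;
    }

    public static void main(String[] args) {
        // Hypothetical paths and names, for illustration only.
        List<String> command = build(
                "analysis.jar",
                "com.example.AnalysisDriver",
                Arrays.asList("/opt/sqoop/sqoop.jar", "/opt/jdbc/driver.jar"),
                Arrays.asList("/data/in", "/data/out"));
        System.out.println(String.join(" ", command));
    }
}
```

With this approach the jars only need to exist on the machine that submits the job, not on
every node of the cluster.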

Jarcec

Links:
1: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html

On Tue, Feb 12, 2013 at 03:26:50PM +0000, Artem Ervits wrote:
> Hello all,
> 
> I'd like to know if there's a way to execute an incremental job from a map/reduce program.
> If there is a way, please point to a user guide I can take a look at to achieve it. In case
> it is possible, does Sqoop need to be installed on every node of the Hadoop cluster? I'm aware
> of the fact that Oozie would be able to achieve this but I was wondering if there are other
> ways. Right now I have a script that first calls the Sqoop job and then executes the M/R job.
> 
> Thank you.
> 
> Artem Ervits
> New York Presbyterian Hospital
> 
> 
> 
> --------------------
> 
> This electronic message is intended to be for the use only of the named recipient, and
> may contain information that is confidential or privileged.  If you are not the intended recipient,
> you are hereby notified that any disclosure, copying, distribution or use of the contents
> of this message is strictly prohibited.  If you have received this message in error or are
> not the named recipient, please notify us immediately by contacting the sender at the electronic
> mail address noted above, and delete and destroy all copies of this message.  Thank you.
> 
