hbase-user mailing list archives

From Andre Reiter <a.rei...@web.de>
Subject Re: Running MapReduce from a web application
Date Fri, 24 Jun 2011 21:47:20 GMT
Hi Doug,

thanks a lot for your reply
the point is clear how to create a job instance and to configure it using TableMapReduceUtil.initTableMapperJob
actually our job is working just perfectly, even the third party libs are simple to include
using TableMapReduceUtil.addDependencyJars

the problem is how to start the MR job...

at the moment we do it this way:
  - set HADOOP_CLASSPATH with hbase, zookeeper, and all third party jars
  - execute "./bin/hadoop jar /tmp/map_reduce_v1.jar package1.MRDriver1"

that works like a charm. the question is now: how do we start the job from our web application
running on tomcat?

one option might be to fork a new process, like this:

ProcessBuilder pb = new ProcessBuilder("/opt/hadoop/bin/hadoop", "jar",
    "/tmp/map_reduce_v1.jar", "package1.MRDriver1");
// configure the ProcessBuilder (environment, working directory, ...)
Process p = pb.start();

but this does not seem very elegant to us... does it?

so how can we start a job from a running app, in the same process, without forking?
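For what it's worth, a minimal sketch of submitting the job in-process. It assumes the same jars you put on HADOOP_CLASSPATH (hadoop, hbase, zookeeper, third-party) plus the cluster config files (core-site.xml, mapred-site.xml, hbase-site.xml) are on the webapp's classpath; the table name "requests" and the mapper are placeholders for your own:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class InProcessSubmit {

    // placeholder mapper; use your real one
    static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> { }

    public static Job submit() throws Exception {
        Configuration conf = HBaseConfiguration.create();   // picks up *-site.xml from the classpath
        Job job = new Job(conf, "my-mr-job");
        job.setJarByClass(InProcessSubmit.class);
        TableMapReduceUtil.initTableMapperJob(
            "requests",                       // source table (placeholder)
            new Scan(),
            MyMapper.class,
            ImmutableBytesWritable.class,
            Result.class,
            job);
        TableMapReduceUtil.addDependencyJars(job);          // ship third-party jars with the job
        job.submit();                                       // returns immediately, no fork needed
        return job;
    }
}
```

job.submit() returns right away; you can poll job.isComplete()/job.isSuccessful() later, or call job.waitForCompletion(true) instead if you want to block.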


Doug Meil wrote:
> Hi there-
> Take a look at this for starters...
> http://hbase.apache.org/book.html#mapreduce
> if you do job.waitForCompletion(true); it will execute synchronously.  If you do job.waitForCompletion(false)
> it will fire and forget.  A simple pattern is to spin off a thread where it executes job.waitFor..(true)
> and then you can pick up the results.
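The spin-off-a-thread pattern Doug describes, sketched with plain JDK concurrency (BlockingJob here stands in for org.apache.hadoop.mapreduce.Job, purely for illustration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Run the blocking waitForCompletion(true) call on a background thread so the
// webapp's request thread returns immediately; pick up the result via a Future.
public class FireAndCollect {

    // stand-in for org.apache.hadoop.mapreduce.Job (illustrative only)
    interface BlockingJob {
        boolean waitForCompletion(boolean verbose) throws Exception;
    }

    static Future<Boolean> submitInBackground(BlockingJob job, ExecutorService pool) {
        return pool.submit(() -> job.waitForCompletion(true));
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // fake "job" that succeeds after a short sleep
        BlockingJob fake = verbose -> { Thread.sleep(100); return true; };
        Future<Boolean> result = submitInBackground(fake, pool);
        System.out.println("submitted, request thread is free");
        System.out.println("job succeeded: " + result.get());  // pick up the result later
        pool.shutdown();
    }
}
```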
> -----Original Message-----
> From: Andre Reiter [mailto:a.reiter@web.de]
> Sent: Friday, June 24, 2011 12:41 AM
> To: user@hbase.apache.org
> Subject: Re: Running MapReduce from a web application
> Hi everybody,
> no suggestions about those questions?
> how to submit a MR job out of my application, and not manually from a shell using ./bin/hadoop
> jar ... ?
> best regards
> andre
> Andre Reiter wrote:
>> now i would like to start MR jobs from my web application running on a tomcat, is
>> there an elegant way to do it?
>> the second question: at the moment i use the TextOutputFormat as the
>> output format, which creates a file in the specified dfs directory:
>> part-r-00000 so i can read it using ./bin/hadoop fs -cat
>> /tmp/requests/part-r-00000 on the shell
>> how can i get the path to this output file after my job is finished, to process it
>> further... is there another way to collect results of a MR job, a text file is good for humans,
>> but IMHO parsing a text file for results is not the preferable way...
>> thanks in advance
>> andre
>> PS:
>> versions:
>> - Linux version 2.6.26-2-amd64 (Debian 2.6.26-25lenny1)
>> - hadoop-0.20.2-CDH3B4
>> - hbase-0.90.1-CDH3B4
>> - zookeeper-3.3.2-CDH3B4
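On the second question quoted above, a sketch of reading the part-r-* files straight from HDFS via the FileSystem API, rather than shelling out to ./bin/hadoop fs -cat. The output path /tmp/requests matches the example in the quote; adjust to your job:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Read a finished job's reducer output directly from HDFS in-process.
public class ReadJobOutput {
    public static void readOutput(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path outDir = new Path("/tmp/requests");
        for (FileStatus st : fs.listStatus(outDir)) {
            if (!st.getPath().getName().startsWith("part-")) continue;  // skip _SUCCESS etc.
            BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(st.getPath())));
            String line;
            while ((line = in.readLine()) != null) {
                // each line is "key<TAB>value", as written by TextOutputFormat
                System.out.println(line);
            }
            in.close();
        }
    }
}
```

If parsing text really bothers you, writing with SequenceFileOutputFormat instead of TextOutputFormat lets you read the output back as typed key/value pairs.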
