mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: Submitting mahout jobs to map/reduce cluster with fair scheduling
Date Fri, 09 Nov 2012 01:11:24 GMT
That Job extends org.apache.mahout.common.AbstractJob, so it will probably
accept a -D argument that sets "mapred.fairscheduler.pool=...". Have
you tried this?
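
A minimal sketch of what that invocation might look like, untested against a Corona cluster. The pool name "mygroup.mypool" is a hypothetical placeholder; the error message quoted below asks for the format <poolgroup>.<pool>, so substitute a pool that actually exists on your cluster. Hadoop generic options such as -D normally have to come before any job-specific options.

```shell
# Hypothetical pool name: replace with a real <poolgroup>.<pool> from your cluster.
POOL="mygroup.mypool"

# Generic Hadoop option that sets the property named in the error message.
POOL_OPT="-Dmapred.fairscheduler.pool=${POOL}"
echo "Pool option: ${POOL_OPT}"

# Run the example only if the mahout launcher is actually present.
if [ -x "${MAHOUT_HOME:-}/bin/mahout" ]; then
  "${MAHOUT_HOME}/bin/mahout" \
    org.apache.mahout.clustering.syntheticcontrol.kmeans.Job \
    "${POOL_OPT}"
fi
```

Note that once any argument is passed, the job may no longer take the "Running with default arguments" path seen in the log, so it may ask for its usual options (input, output and so on) explicitly.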


On 11/8/12 3:41 PM, Yazan Boshmaf wrote:
> Hello,
>
> I'm trying to run the ASF Email example here:
> https://cwiki.apache.org/confluence/display/MAHOUT/ASFEmail
>
> I am using an existing Hive/Hadoop cluster.
>
> When I run:
>
> $MAHOUT_HOME/bin/mahout
> org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>
> I get:
>
> MAHOUT-JOB:
> /usr/local/mahout-0.8/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
> 12/11/08 12:13:54 WARN driver.MahoutDriver: No
> org.apache.mahout.clustering.syntheticcontrol.kmeans.Job.props found on
> classpath, will use command-line arguments only
> 12/11/08 12:13:54 INFO kmeans.Job: Running with default arguments
> 12/11/08 12:13:55 INFO FileSystem.collect: makeAbsolute: output working
> directory: hdfs://my_cluster:my_port/
> 12/11/08 12:13:55 INFO kmeans.Job: Preparing Input
> 12/11/08 12:13:55 INFO FileSystem.collect: make Qualify non absolute path:
> testdata working directory: dfs://cluster:port_num/
> 12/11/08 12:13:55 INFO corona.SessionDriver: My serverSocketPort port_num
> 12/11/08 12:13:55 INFO corona.SessionDriver: My Address ip_addrs:port_num
> 12/11/08 12:13:55 INFO corona.SessionDriver: Connecting to cluster manager
> at data_manager:port_num
> 12/11/08 12:13:55 INFO corona.SessionDriver: Got session ID
> 201211051809.387193
> 12/11/08 12:13:55 WARN mapred.JobClient: Use GenericOptionsParser for
> parsing the arguments. Applications should implement Tool for the same.
> 12/11/08 12:13:56 INFO FileSystem.collect: makeAbsolute: output/data
> working directory: dfs://cluster:port_num/
> 12/11/08 12:13:56 INFO input.FileInputFormat: Total input paths to process
> : 1
> 12/11/08 12:13:56 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 12/11/08 12:13:56 INFO lzo.LzoCodec: Successfully loaded & initialized
> native-lzo library [hadoop-lzo rev fatal: Not a git repository (or any of
> the parent directories): .git]
> 12/11/08 12:13:57 ERROR mapred.CoronaJobTracker: UNCAUGHT: Thread main got
> an uncaught exception
> java.io.IOException: InvalidSessionHandle(handle:This cluster is operating
> in configured pools only mode.  The pool group and pool was specified as
> 'default.defaultpool' and is not part of this cluster.  Please use the
> Corona parameter mapred.fairscheduler.pool to set a valid pool group and
> pool in the format <poolgroup>.<pool>)
> at
> org.apache.hadoop.corona.SessionDriver.startSession(SessionDriver.java:275)
> at
> org.apache.hadoop.mapred.CoronaJobTracker.startFullTracker(CoronaJobTracker.java:670)
> at
> org.apache.hadoop.mapred.CoronaJobTracker.submitJob(CoronaJobTracker.java:1898)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:1259)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:459)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:474)
> at
> org.apache.mahout.clustering.conversion.InputDriver.runJob(InputDriver.java:108)
> at
> org.apache.mahout.clustering.syntheticcontrol.kmeans.Job.run(Job.java:129)
> at
> org.apache.mahout.clustering.syntheticcontrol.kmeans.Job.main(Job.java:59)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> My question is: How do I configure Mahout to use pools? That is, where do I
> set the Corona "mapred.fairscheduler.pool" JobConf?
>

