mahout-user mailing list archives

From Kamal Ali <k...@grokker.com>
Subject Re: factorize-movielens-1M.sh privilegedActionException: reports dir doesn't exist when it does exist
Date Sun, 20 Jan 2013 22:17:20 GMT
Thanks, Sebastian!



On Sat, Jan 19, 2013 at 2:02 AM, Sebastian Schelter <ssc@apache.org> wrote:

> Kamal,
>
> factorize-movielens-1M.sh is a small example that invokes Hadoop locally
> to showcase how ALS factorizes a small dataset sitting in the local
> filesystem.
>
> parallelALS runs on a cluster; that is its main purpose. Just invoke it
> via "mahout parallelALS" or directly via "hadoop jar ...", just like any
> other Hadoop job. Be aware that it is slow, as it needs a lot of
> iterations, which impose a lot of overhead on Hadoop.
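>
> For instance, a sketch of an invocation (the paths are placeholders; the
> flag values mirror what factorize-movielens-1M.sh passes, as your log
> below shows):
>
> $ mahout parallelALS --input /path/to/ratings.csv \
>     --output /path/to/als/out --tempDir /path/to/als/tmp \
>     --numFeatures 20 --numIterations 10 --lambda 0.065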
>
> If your data is small, then you can also use an SVDRecommender with
> ALSWRFactorizer to factorize your data in-memory on a single machine.
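>
> A minimal in-memory sketch (the class name and file path are
> placeholders; the factorizer parameters mirror the script's defaults):
>
> import java.io.File;
> import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
> import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
> import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
> import org.apache.mahout.cf.taste.model.DataModel;
> import org.apache.mahout.cf.taste.recommender.RecommendedItem;
>
> public class InMemoryALS {
>   public static void main(String[] args) throws Exception {
>     // comma-separated lines of userID,itemID,rating
>     DataModel model = new FileDataModel(new File("/path/to/ratings.csv"));
>     // 20 features, lambda = 0.065, 10 iterations
>     ALSWRFactorizer factorizer = new ALSWRFactorizer(model, 20, 0.065, 10);
>     SVDRecommender recommender = new SVDRecommender(model, factorizer);
>     // print the top 6 recommendations for user 1
>     for (RecommendedItem item : recommender.recommend(1L, 6)) {
>       System.out.println(item);
>     }
>   }
> }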
>
> /s
>
> On 19.01.2013 01:35, Kamal Ali wrote:
> > Thanks, Sebastian! The key was this line:
> > export MAHOUT_LOCAL=true
> >
> > Upon your advice, I just set that in my bash environment and everything
> > worked. THANKS!
> > The main thing in factorize*.sh is the invocation of parallelALS, which
> > I had hoped would run over the Hadoop cluster when not running in local
> > mode.
> >
> > Is it well known that parallelALS doesn't run over a distributed Hadoop
> > cluster, or am I misunderstanding what MAHOUT_LOCAL=true does?
> > [Its name "parallel" made me think it ran on a Hadoop cluster.]
> > If you know of a way of making parallelALS run on a true cluster and
> > could send me the link(s), I would really appreciate it.
> > Thanks,
> > Kamal.
> >
> > http://svn.apache.org/repos/asf/mahout/trunk/bin/mahout
> > says:
> >
> > #   MAHOUT_LOCAL       set to anything other than an empty string to force
> > #                      mahout to run locally even if
> > #                      HADOOP_CONF_DIR and HADOOP_HOME are set
> >
> >
> >
> >
> >
> > On Fri, Jan 18, 2013 at 3:43 PM, Sebastian Schelter <ssc@apache.org> wrote:
> >
> >> The example should work; I tested it yesterday. The simplest way to
> >> execute it is to first build mahout using
> >>
> >> $ mvn -DskipTests clean install
> >>
> >> Then download the movielens1M dataset from
> >> http://www.grouplens.org/node/73 and unzip it.
> >>
> >> After that, go to examples/bin and point the script to the ratings.dat
> >> file found in the movielens dataset.
> >>
> >> $ export MAHOUT_LOCAL=true
> >> $ bash factorize-movielens-1M.sh /path/to/ratings.dat
> >>
> >> Best,
> >> Sebastian
> >>
> >>
> >> On 19.01.2013 00:20, Kamal Ali wrote:
> >>> I'm a newbie trying to get some mahout command-line examples to work.
> >>>
> >>> I tried executing factorize-movielens-1M.sh but I get an error "input
> >>> path does not exist: /tmp/mahout-work-kali/movielens/ratings.csv"
> >>> even after I manually created /tmp/mahout-work-kali/ and all its
> >>> descendant directories and chmod'd them to 777.
> >>>
> >>> The error persists even after I modified factorize-movielens-1M.sh to
> >>> do an "ls -l" on ratings.csv, which shows that
> >>> /tmp/mahout-work-kali/movielens/ratings.csv exists.
> >>>
> >>> [The input file u1.base already has "::" instead of \t as delimiters.]
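> >>>
> >>> [I assume the script's "Converting ratings..." step below does roughly
> >>> sed -e 's/::/,/g' u1.base > ratings.csv
> >>> since my "after sed" debug output shows comma-separated triples.]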
> >>>
> >>> I'm wondering if the error is really something else that is being
> >>> mis-reported, and some intermediate script/program is just getting a
> >>> non-zero return status and falling back on a stock error message.
> >>>
> >>> I am on a 64-bit Mac with JDK 1.7. My ssh keys were generated using
> >>> user "kali".
> >>>
> >>> Has anyone had success running factorize-movielens-1M.sh?
> >>>
> >>> Does this factorize*.sh only run in mahout local mode?
> >>>
> >>> Is factorize-movielens-1M.sh cruddy and old, and should some other way
> >>> be used?
> >>>
> >>> I'm primarily interested in getting the ALS methods to work. If someone
> >>> knows where in the mahout distribution one can find the latest or most
> >>> tested ALS implementation (and the maven command to run it), please let
> >>> me know.
> >>>
> >>> THANK YOU!
> >>> Kamal.
> >>>
> >>> My hadoop-env.sh is at the end of this email.
> >>> ================================================
> >>> ./factorize-movielens-1M.sh $grouplens/ml-100k/u1.base
> >>> # $grouplens points to a directory containing the file u1.base
> >>> creating work directory at /tmp/mahout-work-kali
> >>> kamal: doing ls -l on movie lens dir:
> >>> total 1544
> >>> drwxrwxrwx  3 kali  wheel     102 Jan 18 12:20 dataset
> >>> -rwxrwxrwx  1 kali  wheel  786544 Jan 18 13:46 ratings.csv
> >>> kamal: doing wc -l on ratings.csv
> >>>    80000 /tmp/mahout-work-kali/movielens/ratings.csv
> >>> Converting ratings...
> >>> after sed
> >>> -rwxrwxrwx  1 kali  wheel  786544 Jan 18 13:47 /tmp/mahout-work-kali/movielens/ratings.csv
> >>> kamal: doing head on ratings.csv
> >>> 1,1,5
> >>> 1,2,3
> >>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> Running on hadoop, using /Users/kali/hadoop/hadoop-1.0.4/bin/hadoop and
> >>> HADOOP_CONF_DIR=/Users/kali/hadoop/hadoop-1.0.4/conf
> >>> MAHOUT-JOB: /users/kali/mahout/mahout0.7/examples/target/mahout-examples-0.7-job.jar
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> 13/01/18 13:47:24 INFO common.AbstractJob: Command line arguments:
> >>> {--endPhase=[2147483647],
> >>> --input=[/tmp/mahout-work-kali/movielens/ratings.csv],
> >>> --output=[/tmp/mahout-work-kali/dataset], --probePercentage=[0.1],
> >>> --startPhase=[0], --tempDir=[/tmp/mahout-work-kali/dataset/tmp],
> >>> --trainingPercentage=[0.9]}
> >>> 2013-01-18 13:47:24.918 java[53562:1703] Unable to load realm info from SCDynamicStore
> >>> 13/01/18 13:47:25 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/tmp/hadoop-kali/mapred/staging/kali/.staging/job_201301151900_0035
> >>> 13/01/18 13:47:25 ERROR security.UserGroupInformation: PriviledgedActionException as:kali cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /tmp/mahout-work-kali/movielens/ratings.csv
> >>> Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /tmp/mahout-work-kali/movielens/ratings.csv
> >>>   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
> >>>   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
> >>>   at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
> >>>   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
> >>>   at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
> >>>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
> >>>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> >>>   at java.security.AccessController.doPrivileged(Native Method)
> >>>   at javax.security.auth.Subject.doAs(Subject.java:415)
> >>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> >>>   at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> >>>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> >>>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> >>>   at org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter.run(DatasetSplitter.java:90)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>>   at org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter.main(DatasetSplitter.java:64)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>   at java.lang.reflect.Method.invoke(Method.java:601)
> >>>   at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> >>>   at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> >>>   at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>   at java.lang.reflect.Method.invoke(Method.java:601)
> >>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >>> after splitDataset
> >>> -rwxrwxrwx  1 kali  wheel  786544 Jan 18 13:47 /tmp/mahout-work-kali/movielens/ratings.csv
> >>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> Running on hadoop, using /Users/kali/hadoop/hadoop-1.0.4/bin/hadoop and
> >>> HADOOP_CONF_DIR=/Users/kali/hadoop/hadoop-1.0.4/conf
> >>> MAHOUT-JOB: /users/kali/mahout/mahout0.7/examples/target/mahout-examples-0.7-job.jar
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> 13/01/18 13:47:31 INFO common.AbstractJob: Command line arguments:
> >>> {--alpha=[40], --endPhase=[2147483647], --implicitFeedback=[false],
> >>> --input=[/tmp/mahout-work-kali/dataset/trainingSet/], --lambda=[0.065],
> >>> --numFeatures=[20], --numIterations=[10],
> >>> --output=[/tmp/mahout-work-kali/als/out], --startPhase=[0],
> >>> --tempDir=[/tmp/mahout-work-kali/als/tmp]}
> >>> 2013-01-18 13:47:31.259 java[53605:1703] Unable to load realm info from SCDynamicStore
> >>> 13/01/18 13:47:32 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/tmp/hadoop-kali/mapred/staging/kali/.staging/job_201301151900_0036
> >>> 13/01/18 13:47:32 ERROR security.UserGroupInformation: PriviledgedActionException as:kali cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /tmp/mahout-work-kali/dataset/trainingSet
> >>> Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /tmp/mahout-work-kali/dataset/trainingSet
> >>>   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
> >>>   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
> >>>   at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
> >>>   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
> >>>   at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
> >>>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
> >>>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> >>>   at java.security.AccessController.doPrivileged(Native Method)
> >>>   at javax.security.auth.Subject.doAs(Subject.java:415)
> >>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> >>>   at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> >>>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> >>>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> >>>   at org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob.run(ParallelALSFactorizationJob.java:137)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>>   at org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob.main(ParallelALSFactorizationJob.java:98)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>   at java.lang.reflect.Method.invoke(Method.java:601)
> >>>   at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> >>>   at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> >>>   at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>   at java.lang.reflect.Method.invoke(Method.java:601)
> >>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> Running on hadoop, using /Users/kali/hadoop/hadoop-1.0.4/bin/hadoop and
> >>> HADOOP_CONF_DIR=/Users/kali/hadoop/hadoop-1.0.4/conf
> >>> MAHOUT-JOB: /users/kali/mahout/mahout0.7/examples/target/mahout-examples-0.7-job.jar
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> 13/01/18 13:47:38 INFO common.AbstractJob: Command line arguments:
> >>> {--endPhase=[2147483647],
> >>> --input=[/tmp/mahout-work-kali/dataset/probeSet/],
> >>> --itemFeatures=[/tmp/mahout-work-kali/als/out/M/],
> >>> --output=[/tmp/mahout-work-kali/als/rmse/], --startPhase=[0],
> >>> --tempDir=[/tmp/mahout-work-kali/als/tmp],
> >>> --userFeatures=[/tmp/mahout-work-kali/als/out/U/]}
> >>> 2013-01-18 13:47:38.142 java[53645:1703] Unable to load realm info from SCDynamicStore
> >>> 13/01/18 13:47:38 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/tmp/hadoop-kali/mapred/staging/kali/.staging/job_201301151900_0037
> >>> 13/01/18 13:47:38 ERROR security.UserGroupInformation: PriviledgedActionException as:kali cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /tmp/mahout-work-kali/dataset/probeSet
> >>> Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /tmp/mahout-work-kali/dataset/probeSet
> >>>   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
> >>>   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
> >>>   at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
> >>>   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
> >>>   at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
> >>>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
> >>>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> >>>   at java.security.AccessController.doPrivileged(Native Method)
> >>>   at javax.security.auth.Subject.doAs(Subject.java:415)
> >>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> >>>   at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> >>>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> >>>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> >>>   at org.apache.mahout.cf.taste.hadoop.als.FactorizationEvaluator.run(FactorizationEvaluator.java:91)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>>   at org.apache.mahout.cf.taste.hadoop.als.FactorizationEvaluator.main(FactorizationEvaluator.java:68)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>   at java.lang.reflect.Method.invoke(Method.java:601)
> >>>   at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> >>>   at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> >>>   at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>   at java.lang.reflect.Method.invoke(Method.java:601)
> >>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> Running on hadoop, using /Users/kali/hadoop/hadoop-1.0.4/bin/hadoop and
> >>> HADOOP_CONF_DIR=/Users/kali/hadoop/hadoop-1.0.4/conf
> >>> MAHOUT-JOB: /users/kali/mahout/mahout0.7/examples/target/mahout-examples-0.7-job.jar
> >>> Warning: $HADOOP_HOME is deprecated.
> >>>
> >>> 13/01/18 13:47:44 INFO common.AbstractJob: Command line arguments:
> >>> {--endPhase=[2147483647],
> >>> --input=[/tmp/mahout-work-kali/als/out/userRatings/],
> >>> --itemFeatures=[/tmp/mahout-work-kali/als/out/M/], --maxRating=[5],
> >>> --numRecommendations=[6],
> >>> --output=[/tmp/mahout-work-kali/recommendations/], --startPhase=[0],
> >>> --tempDir=[temp], --userFeatures=[/tmp/mahout-work-kali/als/out/U/]}
> >>> 2013-01-18 13:47:44.859 java[53687:1703] Unable to load realm info from SCDynamicStore
> >>> 13/01/18 13:47:45 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/tmp/hadoop-kali/mapred/staging/kali/.staging/job_201301151900_0038
> >>> 13/01/18 13:47:45 ERROR security.UserGroupInformation: PriviledgedActionException as:kali cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /tmp/mahout-work-kali/als/out/userRatings
> >>> Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /tmp/mahout-work-kali/als/out/userRatings
> >>>   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
> >>>   at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
> >>>   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
> >>>   at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
> >>>   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
> >>>   at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
> >>>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
> >>>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> >>>   at java.security.AccessController.doPrivileged(Native Method)
> >>>   at javax.security.auth.Subject.doAs(Subject.java:415)
> >>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> >>>   at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> >>>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> >>>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> >>>   at org.apache.mahout.cf.taste.hadoop.als.RecommenderJob.run(RecommenderJob.java:95)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>>   at org.apache.mahout.cf.taste.hadoop.als.RecommenderJob.main(RecommenderJob.java:69)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>   at java.lang.reflect.Method.invoke(Method.java:601)
> >>>   at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> >>>   at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> >>>   at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>   at java.lang.reflect.Method.invoke(Method.java:601)
> >>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >>>
> >>> RMSE is:
> >>>
> >>> cat: /tmp/mahout-work-kali/als/rmse/rmse.txt: No such file or directory
> >>>
> >>>
> >>>
> >>> Sample recommendations:
> >>>
> >>> cat: /tmp/mahout-work-kali/recommendations/part-m-00000: No such file or directory
> >>>
> >>>
> >>> ==================================================
> >>> # Set Hadoop-specific environment variables here.
> >>>
> >>> # The only required environment variable is JAVA_HOME.  All others are
> >>> # optional.  When running a distributed configuration it is best to
> >>> # set JAVA_HOME in this file, so that it is correctly defined on
> >>> # remote nodes.
> >>>
> >>> # The java implementation to use.  Required.
> >>> export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_10.jdk/Contents/Home/jre
> >>>
> >>> # Extra Java CLASSPATH elements.  Optional.
> >>> # export HADOOP_CLASSPATH=
> >>>
> >>> # The maximum amount of heap to use, in MB. Default is 1000.
> >>> # export HADOOP_HEAPSIZE=2000
> >>>
> >>> # Extra Java runtime options.  Empty by default.
> >>> # export HADOOP_OPTS=-server
> >>>
> >>> # Command specific options appended to HADOOP_OPTS when specified
> >>> export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
> >>> export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
> >>> export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
> >>> export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
> >>> export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
> >>> # export HADOOP_TASKTRACKER_OPTS=
> >>> # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
> >>> # export HADOOP_CLIENT_OPTS
> >>>
> >>> # Extra ssh options.  Empty by default.
> >>> # export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"
> >>>
> >>> # Where log files are stored.  $HADOOP_HOME/logs by default.
> >>> # export HADOOP_LOG_DIR=${HADOOP_HOME}/logs
> >>>
> >>> # File naming remote slave hosts.  $HADOOP_HOME/conf/slaves by default.
> >>> # export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves
> >>>
> >>> # host:path where hadoop code should be rsync'd from.  Unset by default.
> >>> # export HADOOP_MASTER=master:/home/$USER/src/hadoop
> >>>
> >>> # Seconds to sleep between slave commands.  Unset by default.  This
> >>> # can be useful in large clusters, where, e.g., slave rsyncs can
> >>> # otherwise arrive faster than the master can service them.
> >>> # export HADOOP_SLAVE_SLEEP=0.1
> >>>
> >>> # The directory where pid files are stored. /tmp by default.
> >>> # export HADOOP_PID_DIR=/var/hadoop/pids
> >>>
> >>> # A string representing this instance of hadoop. $USER by default.
> >>> # export HADOOP_IDENT_STRING=$USER
> >>>
> >>> # The scheduling priority for daemon processes.  See 'man nice'.
> >>> # export HADOOP_NICENESS=10
> >>>
> >>
> >>
> >
>
>
