mahout-user mailing list archives

From: Aurora Skarra-Gallagher <aur...@yahoo-inc.com>
Subject: Re: Running Taste Web example without the webserver
Date: Wed, 22 Jul 2009 23:53:45 GMT

Hi,

Is anyone able to point me in the right direction on this?

Thanks,
Aurora


On 7/21/09 4:12 PM, "Aurora Skarra-gallagher" <aurora@yahoo-inc.com> wrote:

Hi,

I apologize if I've misunderstood the purpose of the Taste component of Mahout. Our goal was
to take a recommendation framework and use our own recommendation algorithm within it. We
need to process a massive amount of data, and wanted it to be done on our Hadoop grid. I thought
that Taste was the right fit for the job. I'm not interested in the HTTP service. I'm interested
in the recommendation framework, particularly from a back-end batch perspective. Does that
help clarify? Thanks for helping me sort through this.

-Aurora


On 7/21/09 3:02 PM, "Sean Owen" <srowen@gmail.com> wrote:

Hmm, lots going on here, it's confusing.

Are you trying to run this on Hadoop intentionally? I ask because the web
app example is not intended to run on Hadoop. It's a component
intended to serve recommendations over HTTP in real time. It also
appears you are running an evaluation rather than a web app serving
requests. I realize you're trying to run this without Jetty, but
that's kind of like trying to run a web app without a web server.

I think you'd have to clarify what you are trying to do, and then what
you are doing right now, to begin to assist.

On Tue, Jul 21, 2009 at 9:20 PM, Aurora
Skarra-Gallagher<aurora@yahoo-inc.com> wrote:
> Hi,
>
> I'm trying to run the taste web example without using jetty. Our gateways aren't meant
> to be used as webservers. By poking around, I found that the following command worked:
> hadoop --config ~/hod-clusters/test jar /x/mahout-current/examples/target/mahout-examples-0.2-SNAPSHOT.job org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner
>
> The output is:
> 09/07/21 19:59:21 INFO file.FileDataModel: Creating FileDataModel for file /tmp/ratings.txt
> 09/07/21 19:59:21 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation using 0.9 of GroupLensDataModel
> 09/07/21 19:59:22 INFO file.FileDataModel: Reading file info...
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 100000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 200000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 300000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 400000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 500000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 600000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 700000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 800000 lines
> 09/07/21 19:59:23 INFO file.FileDataModel: Processed 900000 lines
> 09/07/21 19:59:23 INFO file.FileDataModel: Processed 1000000 lines
> 09/07/21 19:59:23 INFO file.FileDataModel: Read lines: 1000209
> 09/07/21 19:59:30 INFO slopeone.MemoryDiffStorage: Building average diffs...
> 09/07/21 19:59:42 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.7035965559003973
> 09/07/21 19:59:42 INFO grouplens.GroupLensRecommenderEvaluatorRunner: 0.7035965559003973
>
> The job appears to write data to /tmp/ratings.txt and /tmp/movies.txt. I'm not sure if
> this is the correct way to run this example. I have a few questions:
>
>  1.  Is the output file /tmp/ratings.txt? If so, how do I interpret it?
>  2.  What does the Evaluation result mean?
>  3.  Is it even running on HDFS?
>  4.  Is it a map-reduce job?
>
> Any pointers on how to run this as a standalone job would be helpful.
>
> Thanks,
> Aurora
>
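
On question 2 above: the log lines come from a class named AbstractDifferenceRecommenderEvaluator and report "using 0.9 of GroupLensDataModel". That suggests a hold-out evaluation, where part of the ratings trains the recommender and the rest is used to compare predicted ratings against actual ones. Below is a minimal sketch of that idea, assuming the score is an average absolute difference (lower is better). The exact metric is an assumption based on the class name. All function names here are hypothetical and are not Mahout APIs, and the item-mean predictor is a toy stand-in for the slope-one recommender seen in the log.

```python
import random
from collections import defaultdict


def split_ratings(ratings, training_fraction, seed=0):
    """Randomly assign each (user, item, rating) triple to train or test."""
    rng = random.Random(seed)
    train, test = [], []
    for triple in ratings:
        if rng.random() < training_fraction:
            train.append(triple)
        else:
            test.append(triple)
    return train, test


def item_mean_predictor(train):
    """Toy recommender: predict an item's mean training rating.

    Falls back to the global training mean for unseen items.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in train:
        sums[item] += rating
        counts[item] += 1
    total = sum(rating for _user, _item, rating in train)
    global_mean = total / len(train) if train else 0.0

    def predict(_user, item):
        if counts[item] > 0:
            return sums[item] / counts[item]
        return global_mean

    return predict


def average_absolute_difference(predict, test):
    """Mean of |predicted - actual| over the held-out ratings."""
    diffs = [abs(predict(user, item) - rating) for user, item, rating in test]
    return sum(diffs) / len(diffs)
```

Under this reading, a result near 0.70 on a 1-to-5 rating scale would mean the predictions were off by about 0.7 of a rating point on average for the held-out ratings.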

