mahout-user mailing list archives

From Sean Owen <>
Subject Re: Map reduce job for Recommender
Date Tue, 14 Jun 2011 18:06:40 GMT
On Tue, Jun 14, 2011 at 11:57 AM, Prashant Sharma
<> wrote:
> 1. Is there a difference between
>  and
> apart from one being
> for fully distributed and other for pseudo-distributed mode. As one has an
> implementation of recommender class to run and the other similarity.

Yes that's the difference. The "pseudo-distributed" version isn't
really a distributed algorithm. It's just splitting work among n
non-distributed instances. Which could still be useful.
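To make "splitting work among n non-distributed instances" concrete, here is a minimal sketch of the idea. It is not Mahout's code: the class name `PseudoDistributedSketch`, the `partition` method and the modulo-by-user-ID rule are all assumptions made for illustration. Each bucket would then be handed to its own ordinary, non-distributed recommender instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Mahout.
final class PseudoDistributedSketch {

    // Assign each user ID to exactly one of n buckets (assumed rule: ID modulo n).
    // Every bucket can then be processed by a separate non-distributed instance.
    static List<List<Long>> partition(long[] userIds, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<Long>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
        for (long id : userIds) {
            // floorMod keeps the bucket index non-negative even for negative IDs.
            int bucket = (int) Math.floorMod(id, (long) n);
            buckets.get(bucket).add(id);
        }
        return buckets;
    }
}
```

Nothing about the algorithm itself is distributed in this picture; only the list of users to compute for is divided up.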

> 2. There is no documentation given about the inbuilt similarity classes, can
> you suggest me some reading which gives detail about the implementation of
> those classes, also an example on how to write one of our own would be very
> helpful.

The book does a great job of exploring these differences, if I do say
so myself! If you have more specific questions, you can ask here. The
implementation is open source and in most cases a pretty
straightforward implementation of the definition of various similarity
metrics, which you can look up on Wikipedia.
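As a rough illustration of how directly a textbook definition turns into code, here is a minimal sketch of two common metrics, Pearson correlation and cosine similarity, over two users' ratings for the same items. This is plain Java, not Mahout's implementation or its similarity interface: the class name `SimilaritySketch`, the method names and the dense `double[]` inputs are assumptions made for the example.

```java
// Hypothetical illustration only; this is not Mahout's API.
final class SimilaritySketch {

    // Pearson correlation: covariance divided by the product of standard deviations.
    // x[i] and y[i] are assumed to be the two users' ratings for the same item i.
    static double pearson(double[] x, double[] y) {
        checkSameLength(x, y);
        int n = x.length;
        double sumX = 0.0;
        double sumY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;
        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        double denom = Math.sqrt(varX * varY);
        // Undefined when either user rated everything identically.
        return denom == 0.0 ? Double.NaN : cov / denom;
    }

    // Cosine similarity: dot product divided by the product of vector lengths.
    static double cosine(double[] x, double[] y) {
        checkSameLength(x, y);
        double dot = 0.0;
        double normX = 0.0;
        double normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        double denom = Math.sqrt(normX) * Math.sqrt(normY);
        return denom == 0.0 ? Double.NaN : dot / denom;
    }

    private static void checkSameLength(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
    }
}
```

A real implementation also has to decide which items two users have in common and what to do when there is too little overlap; the sketch skips that by assuming the inputs are already aligned.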

Sebastian -- Chapter 6 of the book never *quite* covered the actual
distributed computation in Mahout. It is even too complex for one
chapter of a book. It explains a somewhat simplified version of the
computation as an intro to Mahout on Hadoop. In my opinion once you
understand the simplified outline, what's in Mahout now is fairly
clear from the docs.

In any event, it's already more or less "to press" now.

(We'll see: I can already smell a 2nd edition...)
