mahout-user mailing list archives

From Sean Owen
Subject Re: RecommenderJob uses indirection for ItemIDs
Date Sun, 12 Jun 2011 10:43:04 GMT
The keys have to be hashed to be used as int offsets into a vector. While
loading the mapping isn't ideal, its size scales only with the number of
items and users.
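
To illustrate the point above, here is a minimal Java sketch of folding a long ID into a non-negative int index and keeping a reverse map to recover the original ID. The class and method names (IdIndexSketch, idToIndex, buildIndexToIdMap) are hypothetical and chosen only for this example; this is not presented as the code RecommenderJob actually uses. It assumes one simple hashing scheme (XOR of the high and low 32 bits, with the sign bit masked off), and it ignores collisions, where two distinct IDs hash to the same index and the later one overwrites the earlier.

```java
import java.util.HashMap;
import java.util.Map;

public class IdIndexSketch {

    // Fold a 64-bit ID into a non-negative 32-bit value usable as a vector offset.
    // Assumption: XOR the high and low halves, then clear the sign bit.
    static int idToIndex(long id) {
        int hash = (int) (id ^ (id >>> 32));
        return hash & 0x7FFFFFFF;
    }

    // Build the "side" mapping from int index back to the original long ID.
    // Its size grows with the number of distinct IDs, not with the number of
    // preferences. Colliding IDs overwrite each other in this sketch.
    static Map<Integer, Long> buildIndexToIdMap(long[] ids) {
        Map<Integer, Long> indexToId = new HashMap<Integer, Long>();
        for (long id : ids) {
            indexToId.put(idToIndex(id), id);
        }
        return indexToId;
    }

    public static void main(String[] args) {
        long[] itemIds = {42L, 10000000000L, -7L};
        Map<Integer, Long> indexToId = buildIndexToIdMap(itemIds);
        for (long id : itemIds) {
            int index = idToIndex(id);
            // Print the round trip: original ID, int index, recovered ID.
            System.out.println(id + " -> " + index + " -> " + indexToId.get(index));
        }
    }
}
```

The hash is lossy, so the int index alone cannot be turned back into the long ID. That is why a reducer that emits final recommendations would need the reverse map on hand.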
On Jun 12, 2011 3:47 AM, "Lance Norskog" wrote:
> The RecommenderJob makes a "side" file which maps a fabricated integer
> index to a long ItemID. Why is this needed? Couldn't the
> RecommenderJob propagate the long ItemID directly? Note that this
> forces all instances of AggregateAndReduceRecommender to load the
> entire map. One of the Map/Reduce rules is 'nothing needs to know
> everything'.
> Is this a sparse/dense optimization? If so, have the distributed
> algorithms advanced enough to make this indirection unnecessary?
> --
> Lance Norskog
