mahout-user mailing list archives

From Ted Dunning
Subject Re: Taste-GenericItemBasedRecommender
Date Sat, 12 Dec 2009 19:48:23 GMT
On Sat, Dec 12, 2009 at 8:58 AM, Sean Owen wrote:

> ...
> I think that's the culprit in fact, having to load all the column
> vectors, since they're not light.

If the vector matrix product is done like this:

      Vector w =;
      MultiplyAdd scale = new MultiplyAdd();
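      Iterator<Vector.Element> nonZeros = v.iterateNonZero();  // create the iterator once
      while (nonZeros.hasNext()) {
        Vector.Element element = nonZeros.next();
        // scale is assumed to carry element.get() as its multiplier: w += v_i * column_i
        w.assign(getColumn(element.index()), scale);
      }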

Then you might get better speed, especially since you can lazy-load just
the columns you want.  Google Collections has an interesting dynamic map
builder that would be useful for this.
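
As a rough illustration of the lazy column loading described above, here is
a minimal, self-contained sketch.  It does not use the Mahout or Google
Collections APIs: plain arrays and maps stand in for Vector, and the JDK's
ConcurrentHashMap.computeIfAbsent is swapped in for the dynamic map
builder.  The class name LazyColumnProduct and the loader function are
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Computes w = sum_i v_i * column_i, loading each column only when first needed. */
final class LazyColumnProduct {
  private final Map<Integer, double[]> columns = new ConcurrentHashMap<>();
  private final Function<Integer, double[]> loader;  // hypothetical: fetches one column
  private final int rows;

  LazyColumnProduct(int rows, Function<Integer, double[]> loader) {
    this.rows = rows;
    this.loader = loader;
  }

  /** Returns the cached column, invoking the loader on first access only. */
  double[] getColumn(int index) {
    return columns.computeIfAbsent(index, loader);
  }

  /** v maps index to value and holds only the non-zero entries. */
  double[] times(Map<Integer, Double> v) {
    double[] w = new double[rows];
    for (Map.Entry<Integer, Double> e : v.entrySet()) {
      double[] column = getColumn(e.getKey());
      double scale = e.getValue();
      for (int r = 0; r < rows; r++) {
        w[r] += scale * column[r];
      }
    }
    return w;
  }

  /** Number of columns that have actually been loaded so far. */
  int loadedColumns() {
    return columns.size();
  }

  public static void main(String[] args) {
    // Toy matrix with 2 rows: column j is {j, 10 * j}.
    LazyColumnProduct product =
        new LazyColumnProduct(2, j -> new double[] {j, 10.0 * j});
    Map<Integer, Double> v = new HashMap<>();
    v.put(1, 2.0);
    v.put(3, 0.5);
    double[] w = product.times(v);
    System.out.println(w[0] + " " + w[1] + ", columns loaded: " + product.loadedColumns());
  }
}
```

Only the columns whose index appears among the non-zero entries of v are
ever loaded, so the cost follows the sparsity of v rather than the width
of the matrix.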

> One approach is to make the user vectors more sparse by throwing out
> data, though I don't like it so much.

This is often useful actually, but more in the sense of only retaining
recent events than sparsification.  If you get lots of data per user, then
this isn't much of a problem.  If you use rarer data, then you may have more
of an issue (ratings are the prime example).
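
One way to read "only retaining recent events" is a simple time-window
filter applied to each user's history before the user vector is built.
The sketch below assumes that reading; the Event type, its fields, and
the retainRecent helper are hypothetical and not part of Taste.

```java
import java.util.ArrayList;
import java.util.List;

final class RecentEvents {
  /** One user-item interaction; the field names are assumptions for this sketch. */
  static final class Event {
    final long itemId;
    final long timestampMillis;

    Event(long itemId, long timestampMillis) {
      this.itemId = itemId;
      this.timestampMillis = timestampMillis;
    }
  }

  /** Keeps the events that are at most maxAgeMillis old, preserving their order. */
  static List<Event> retainRecent(List<Event> events, long nowMillis, long maxAgeMillis) {
    List<Event> kept = new ArrayList<>();
    for (Event e : events) {
      if (nowMillis - e.timestampMillis <= maxAgeMillis) {
        kept.add(e);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Event> events = new ArrayList<>();
    events.add(new Event(1L, 1_000L));
    events.add(new Event(2L, 9_000L));
    events.add(new Event(3L, 10_000L));
    List<Event> kept = retainRecent(events, 10_000L, 2_000L);
    System.out.println("kept " + kept.size() + " of " + events.size() + " events");
  }
}
```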

> One question -- in SparseVector, can't we internally remove entries
> when they are set to 0.0? since implicitly missing entries are 0?

Absolutely.  Depending on representation, deletion of elements may not help
so much unless subsequent elements are added that fill in the holes.
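
To make the idea concrete, here is a minimal sketch of a sparse vector
that drops an entry when it is set to 0.0.  It is backed by
java.util.HashMap purely for illustration and is not Mahout's
SparseVector; the class name ZeroDroppingSparseVector is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class ZeroDroppingSparseVector {
  // Only non-zero entries are stored; a missing index reads as 0.0.
  private final Map<Integer, Double> values = new HashMap<>();

  /** Stores a non-zero value, or removes the entry when the value is zero. */
  void set(int index, double value) {
    if (value == 0.0) {
      values.remove(index);
    } else {
      values.put(index, value);
    }
  }

  double get(int index) {
    Double v = values.get(index);
    return v == null ? 0.0 : v;
  }

  /** Number of entries physically stored. */
  int storedEntries() {
    return values.size();
  }

  public static void main(String[] args) {
    ZeroDroppingSparseVector vector = new ZeroDroppingSparseVector();
    vector.set(3, 1.5);
    vector.set(7, 2.0);
    vector.set(3, 0.0);  // removes index 3 rather than storing an explicit zero
    System.out.println("stored entries: " + vector.storedEntries());
  }
}
```

The caveat about representation still applies.  In an open-addressing
hash table, for example, a removal typically leaves a tombstone slot
behind, so memory and probe lengths improve only once a later insert
reuses that slot.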

Ted Dunning, CTO
