spark-user mailing list archives

From Sean Owen
Subject Re: implicit ALS dataSet
Date Thu, 05 Jun 2014 16:44:37 GMT
The paper definitely does not suggest that you should include every
user-item pair in the input. The input is by nature extremely sparse,
so literally filling in all the 0s would create an overwhelmingly
large input. No, there is no need to do it, and it would be terrible
for performance.
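
To make the size argument concrete, here is a minimal sketch in plain Python (no Spark involved). The purchase tuples are made-up illustration data, not taken from this thread; the sketch only compares the number of observed rows with the number of rows a fully filled-in user-item grid would need.

```python
# Hypothetical purchase data: (user, item, nbPurchase), positive observations only.
observations = [
    ("u1", "i1", 3),
    ("u1", "i4", 1),
    ("u2", "i2", 2),
    ("u3", "i1", 5),
    ("u3", "i3", 1),
]

# Distinct users and items that appear in the observed data.
users = {u for u, _, _ in observations}
items = {i for _, i, _ in observations}

sparse_rows = len(observations)        # what you actually pass in
dense_rows = len(users) * len(items)   # every possible user-item pair
zero_rows = dense_rows - sparse_rows   # rows you would have to invent

print(sparse_rows, dense_rows, zero_rows)
```

Even in this toy case most of the dense grid is invented zeros. Because the dense size is the product of the user count and the item count, the gap widens quickly as either grows.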

As far as I can see, the implementation would handle an explicit input
of 0 correctly, and the result would be the same as if that pair had
not been included at all. So no, you do not need to add the implicit
0s to the input.

Those zeros are not really negative input, either figuratively or
literally. Are you trying to figure out how to include actual negative
feedback (i.e. a signal that a user actively does not like an item)?
You can include that if you like: the implementation is extended from
the original paper to handle negative values meaningfully.
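
As a rough illustration of how a raw value could map to the preference and confidence terms of the implicit-feedback model, here is a plain-Python sketch. It rests on an assumption about the extension, namely that preference follows the sign of the input and confidence grows with its absolute value; it is not copied from the MLlib source. The function name and the alpha value are hypothetical.

```python
def preference_and_confidence(r, alpha=40.0):
    """Map a raw implicit-feedback value r to (preference, confidence).

    Assumed scheme (a sketch, not the MLlib source):
      preference = 1 if r > 0, else 0
      confidence = 1 + alpha * |r|
    """
    preference = 1.0 if r > 0 else 0.0
    confidence = 1.0 + alpha * abs(r)
    return preference, confidence


# Positive input: strong belief that the user likes the item.
print(preference_and_confidence(3))
# Zero input: preference 0 with the baseline confidence of 1.
print(preference_and_confidence(0))
# Negative input: preference 0, but held with high confidence.
print(preference_and_confidence(-2))
```

Under this assumed scheme, an input of 0 gets preference 0 and the baseline confidence of 1, which is exactly how a missing pair is treated. A negative value also gets preference 0 but with high confidence, which is what separates real negative feedback from a merely absent observation.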

On Thu, Jun 5, 2014 at 4:46 PM, redocpot wrote:
> Hi,
> According to the paper on which MLlib's ALS is based, the model should take
> all user-item preferences
> as an input, including those which are not related to any input observation
> (zero preference).
> My question is:
> With all positive observations in hand (similar to explicit feedback data
> set), should I generate all negative observations in order to make implicit
> ALS work with the complete data set (pos union neg) ?
> Actually, we test on some data set like:
> | user | item | nbPurchase |
> nbPurchase is non-zero, so we have no negative observations. What we did was
> generate every possible user-item pair with zero nbPurchase to cover all
> user-item pairs, but this operation takes time and storage.
> I just want to know whether we have to do that with MLlib's ALS, or whether it
> already does that internally. In that case, I could simply pass only the
> positive observations, as with explicit ALS.
> Hao.