spark-user mailing list archives

From redocpot <>
Subject Re: implicit ALS dataSet
Date Thu, 05 Jun 2014 21:38:28 GMT
Thank you for your quick reply.

As far as I know, the update does not require negative observations, because
the update rule

x_u = (Y^T C^u Y + λI)^-1 Y^T C^u p(u)

can be simplified by taking advantage of its algebraic structure, so
negative observations are not needed. This is what I thought when I first
read the paper.
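
A minimal sketch of that simplification, assuming the paper's definitions
c_ui = 1 + alpha * r_ui and p_ui = 1 when r_ui > 0 (0 otherwise). It builds
Y^T C^u Y and Y^T C^u p(u) twice: once by looping over every item, zeros
included, and once from a precomputed Y^T Y plus corrections for the observed
items only. All function and variable names here are made up for illustration
and are not taken from Spark's ALS code; the λI term is left out because it is
identical in both forms.

```python
# Illustrative sketch only: names and data are hypothetical, not Spark's API.

def outer(a, b):
    """Outer product of two vectors as a list of rows."""
    return [[x * y for y in b] for x in a]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, s):
    return [[s * a for a in row] for row in A]

def gram(Y):
    """Y^T Y for item factors Y (one row per item). Shared by all users."""
    k = len(Y[0])
    G = [[0.0] * k for _ in range(k)]
    for y in Y:
        G = mat_add(G, outer(y, y))
    return G

def dense_system(Y, r_u, alpha):
    """Y^T C^u Y and Y^T C^u p(u), visiting ALL items (unobserved ones too)."""
    k = len(Y[0])
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for i, y in enumerate(Y):
        r = r_u.get(i, 0.0)          # unobserved pair: r = 0
        c = 1.0 + alpha * r          # confidence, equals 1 when r = 0
        p = 1.0 if r > 0 else 0.0    # preference, equals 0 when r = 0
        A = mat_add(A, scale(outer(y, y), c))
        b = [bj + c * p * yj for bj, yj in zip(b, y)]
    return A, b

def sparse_system(Y, YtY, r_u, alpha):
    """Same quantities, touching only the items this user interacted with."""
    A = [row[:] for row in YtY]      # start from the shared Y^T Y
    b = [0.0] * len(Y[0])
    for i, r in r_u.items():
        if r <= 0:
            continue
        y = Y[i]
        c = 1.0 + alpha * r
        A = mat_add(A, scale(outer(y, y), c - 1.0))  # Y^T (C^u - I) Y term
        b = [bj + c * yj for bj, yj in zip(b, y)]    # p = 1 for these items
    return A, b

# Four items, two latent factors; the user interacted with items 1 and 3 only.
Y = [[1.0, 0.5], [0.2, 1.5], [0.3, 0.1], [2.0, 1.0]]
r_u = {1: 3.0, 3: 1.0}
alpha = 40.0

YtY = gram(Y)
A_dense, b_dense = dense_system(Y, r_u, alpha)
A_sparse, b_sparse = sparse_system(Y, YtY, r_u, alpha)

max_diff = max(
    abs(x - y) for ra, rb in zip(A_dense, A_sparse) for x, y in zip(ra, rb)
)
print("largest entry-wise difference in Y^T C^u Y:", max_diff)
```

The unobserved pairs still contribute, but only through Y^T Y, which is
computed once and shared by every user. The per-user loop never has to
enumerate them.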

What confuses me is that, later on, the paper says (in the Discussion section):

"Unlike explicit datasets, here *the model should take all user-item
preferences as an input, including those which are not related to any input
observation (thus hinting to a zero preference).* This is crucial, as the
given observations are inherently biased towards a positive preference, and
thus do not reflect well the user profile. 
However, taking all user-item values as an input to the model raises serious
scalability issues – the number of all those pairs tends to significantly
exceed the input size since a typical user would provide feedback only on a
small fraction of the available items. We address this by exploiting the
algebraic structure of the model, leading to an algorithm that scales
linearly with the input size *while addressing the full scope of user-item
pairs* without resorting to any sub-sampling."

If my understanding is right, it seems that we need negative observations as
input, but we don't use them during the update. That seems strange to me,
because it would generate far too many user-item pairs, which is not feasible.

Thx for the confirmation. I will read the ALS implementation for more

