spark-dev mailing list archives

From Debasish Das <debasish.da...@gmail.com>
Subject mllib.recommendation Design
Date Fri, 13 Feb 2015 15:46:41 GMT
Hi,

I am a bit confused about the mllib design in master. I thought the core
algorithms would stay in mllib and ml would define the pipelines over them,
but it looks like ALS has been moved from mllib to ml in master.

I am refactoring my PR into a factorization package, and I want to build it
on top of ml.recommendation.ALS (possibly by extending ml.recommendation.ALS,
since the first version will use RDD handling very similar to ALS, plus a
proximal solver that is being added to breeze):

https://issues.apache.org/jira/browse/SPARK-2426
https://github.com/scalanlp/breeze/pull/321

I am not sure we should merge it into recommendation.ALS, since it is more
generic than recommendation. I am considering calling it ConstrainedALS,
where the user can specify different constraints for the user and product
factors (similar to the GraphLab CF structure).
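
To make the per-factor constraint idea concrete, here is a rough sketch in
plain Python rather than Spark/Scala. Everything in it is an assumption made
for illustration: the names (prox_nonnegative, prox_l1, update_row,
constrained_factorize) are hypothetical, it is not the ml.recommendation.ALS
API, and it swaps in a simple proximal gradient update for each factor row
instead of the breeze proximal solver. Each constraint is passed as a
proximal operator, so the user side and the product side can use different
ones.

```python
import math
import random
from typing import Callable, List, Tuple

Vector = List[float]
Prox = Callable[[Vector, float], Vector]  # (point, step size) -> constrained point


def prox_nonnegative(v: Vector, step: float) -> Vector:
    """Projection onto the nonnegative orthant (step size is unused)."""
    return [max(0.0, x) for x in v]


def prox_l1(lam: float) -> Prox:
    """Soft-thresholding, the proximal operator of lam * ||v||_1."""
    def prox(v: Vector, step: float) -> Vector:
        t = lam * step
        return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]
    return prox


def update_row(row: Vector, others: List[Vector],
               ratings: List[Tuple[int, float]], prox: Prox,
               iters: int = 50) -> Vector:
    """Proximal gradient on 0.5 * sum_j (row . others[j] - r)^2 + constraint."""
    # The trace of the Gram matrix bounds its largest eigenvalue, so this is a
    # safe step size that never requires forming the Gram matrix itself.
    lipschitz = sum(sum(x * x for x in others[j]) for j, _ in ratings)
    if lipschitz == 0.0:
        return row
    step = 1.0 / lipschitz
    for _ in range(iters):
        grad = [0.0] * len(row)
        for j, r in ratings:
            err = sum(a * b for a, b in zip(row, others[j])) - r
            for k, p in enumerate(others[j]):
                grad[k] += err * p
        row = prox([x - step * g for x, g in zip(row, grad)], step)
    return row


def constrained_factorize(entries: List[Tuple[int, int, float]],
                          n_users: int, n_products: int, rank: int,
                          user_prox: Prox, product_prox: Prox,
                          sweeps: int = 20, seed: int = 0):
    """Alternate between user and product factors, each with its own constraint."""
    rng = random.Random(seed)
    users = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    products = [[rng.random() for _ in range(rank)] for _ in range(n_products)]
    by_user = [[] for _ in range(n_users)]
    by_product = [[] for _ in range(n_products)]
    for i, j, r in entries:
        by_user[i].append((j, r))
        by_product[j].append((i, r))
    for _ in range(sweeps):
        users = [update_row(users[i], products, by_user[i], user_prox)
                 for i in range(n_users)]
        products = [update_row(products[j], users, by_product[j], product_prox)
                    for j in range(n_products)]
    return users, products
```

For example, constrained_factorize(entries, 3, 3, 2, prox_nonnegative,
prox_l1(0.1)) would keep the user factors nonnegative while applying an L1
penalty to the product factors.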

I am also working on ConstrainedALM, where the underlying algorithm is no
longer ALS but nonlinear alternating minimization with constraints:
https://github.com/scalanlp/breeze/pull/364
This will let us do large-rank matrix completion without having to
construct Gram matrices. I will open a JIRA once I have initial results.

I am also unsure where I should add the factorization package. It will
reuse the current ALS test cases, and I will have to write more test cases
for the sparse coding and PLSA formulations.

Thanks.
Deb
