spark-dev mailing list archives

From Debasish Das <>
Subject mllib.recommendation Design
Date Fri, 13 Feb 2015 15:46:41 GMT

I am a bit confused about the mllib design in master. I thought the core
algorithms would stay in mllib and ml would define the pipelines over the
core algorithms, but it looks like ALS has been moved from mllib to ml in
master.

I am refactoring my PR into a factorization package, and I want to build it
on top of ml.recommendation.ALS (possibly extending ml.recommendation.ALS,
since the first version will use RDD handling very similar to ALS, plus a
proximal solver that is being added to breeze).

Basically, I am not sure whether we should merge it with recommendation.ALS,
since this is more generic than recommendation. I am considering calling it
ConstrainedALS, where the user can specify different constraints for the user
and product factors (similar to the GraphLab CF structure).
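To make the per-factor constraint idea concrete, here is a minimal sketch in
plain Python. It is not the PR's code and not the ml.recommendation.ALS API:
the names constrained_als_rank1, nonnegative and box are hypothetical, the
ratings matrix is assumed dense and fully observed, and the rank is fixed at 1
so that each update is a scalar least-squares problem. In that rank-1 case,
clamping the unconstrained solution to an interval gives the exact constrained
minimizer; a higher rank would need a proper constrained solver (for example a
proximal one).

```python
from typing import Callable, List, Tuple

# A constraint is modeled as a projection of a scalar onto its feasible set.
# All names in this sketch are hypothetical, chosen only for illustration.
Constraint = Callable[[float], float]


def unconstrained(x: float) -> float:
    return x


def nonnegative(x: float) -> float:
    return max(0.0, x)


def box(lo: float, hi: float) -> Constraint:
    return lambda x: min(hi, max(lo, x))


def constrained_als_rank1(
    ratings: List[List[float]],
    user_constraint: Constraint,
    product_constraint: Constraint,
    reg: float = 0.1,
    iterations: int = 20,
) -> Tuple[List[float], List[float]]:
    """Rank-1 alternating least squares on a dense ratings matrix.

    User and product factors each get their own constraint, applied right
    after the closed-form regularized least-squares update.
    """
    n_users, n_products = len(ratings), len(ratings[0])
    u = [1.0] * n_users
    v = [1.0] * n_products
    for _ in range(iterations):
        # Fix v, solve for each user factor, then project.
        vv = sum(x * x for x in v) + reg
        u = [
            user_constraint(sum(r * x for r, x in zip(row, v)) / vv)
            for row in ratings
        ]
        # Fix u, solve for each product factor, then project.
        uu = sum(x * x for x in u) + reg
        v = [
            product_constraint(
                sum(ratings[i][j] * u[i] for i in range(n_users)) / uu
            )
            for j in range(n_products)
        ]
    return u, v
```

For example, constrained_als_rank1(ratings, nonnegative, box(0.0, 1.0)) keeps
the user factors nonnegative while bounding the product factors to [0, 1].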

I am also working on ConstrainedALM, where the underlying algorithm is no
longer ALS but nonlinear alternating minimization with constraints.
This will let us do large-rank matrix completion, where there is no need to
construct Gram matrices. I will open up the JIRA soon after getting initial

I am also a bit unsure where I should add the factorization package. It
will use the current ALS test cases, and I will have to construct more test
cases for the sparse coding and PLSA formulations.

