spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Xiangrui Meng <men...@gmail.com>
Subject Re: Constraint Solver for Spark
Date Wed, 11 Jun 2014 07:21:29 GMT
For explicit feedback, ALS uses only observed ratings for computation.
So XtXs are not the same. -Xiangrui

On Tue, Jun 10, 2014 at 8:58 PM, Debasish Das <debasish.das83@gmail.com> wrote:
> Sorry last one went out by mistake:
>
> Is not for users (0 to numUsers), fullXtX is same ? In the ALS formulation
> this is W^TW or H^TH which should be same for all the users ? Why we are
> reading userXtX(index) and adding it to fullXtX in the loop over all
> numUsers ?
>
> // Solve the least-squares problem for each user and return the new feature
> vectors
>
>     Array.range(0, numUsers).map { index =>
>
>       // Compute the full XtX matrix from the lower-triangular part we got
> above
>
>       fillFullMatrix(userXtX(index), fullXtX)
>
>       // Add regularization
>
>       var i = 0
>
>       while (i < rank) {
>
>         fullXtX.data(i * rank + i) += lambda
>
>         i += 1
>
>       }
>
>       // Solve the resulting matrix, which is symmetric and
> positive-definite
>
>       algo match {
>
>         case ALSAlgo.Implicit =>
> Solve.solvePositive(fullXtX.addi(YtY.get.value),
> userXy(index)).data
>
>         case ALSAlgo.Explicit => Solve.solvePositive(fullXtX, userXy
> (index)).data
>
>       }
>
>     }
>
>
> On Tue, Jun 10, 2014 at 8:56 PM, Debasish Das <debasish.das83@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am bit confused wiht the code here:
>>
>> // Solve the least-squares problem for each user and return the new
>> feature vectors
>>
>>     Array.range(0, numUsers).map { index =>
>>
>>       // Compute the full XtX matrix from the lower-triangular part we
>> got above
>>
>>       fillFullMatrix(userXtX(index), fullXtX)
>>
>>       // Add regularization
>>
>>       var i = 0
>>
>>       while (i < rank) {
>>
>>         fullXtX.data(i * rank + i) += lambda
>>
>>         i += 1
>>
>>       }
>>
>>       // Solve the resulting matrix, which is symmetric and
>> positive-definite
>>
>>       algo match {
>>
>>         case ALSAlgo.Implicit => Solve.solvePositive(fullXtX.addi(YtY.get.value),
>> userXy(index)).data
>>
>>         case ALSAlgo.Explicit => Solve.solvePositive(fullXtX, userXy
>> (index)).data
>>
>>       }
>>
>>     }
>>
>>
>> On Fri, Jun 6, 2014 at 10:42 AM, Debasish Das <debasish.das83@gmail.com>
>> wrote:
>>
>>> Hi Xiangrui,
>>>
>>> It's not the linear constraint, It is quadratic inequality with slack,
>>> first order taylor approximation of off diagonal cross terms and a cyclic
>>> coordinate descent, which we think will yield orthogonality....It's still
>>> under works...
>>>
>>> Also we want to put a L1 constraint as set of linear equations when
>>> solving for ALS...
>>>
>>> I will create the JIRA...as I see it, this will evolve to a generic
>>> constraint solver for machine learning problems that has a QP
>>> structure....ALS is one example....another example is kernel SVMs...
>>>
>>> I did not know that lgpl solver can be added to the classpath....if it
>>> can be then definitely we should add these in ALS.scala...
>>>
>>> Thanks.
>>> Deb
>>>
>>>
>>>
>>> On Thu, Jun 5, 2014 at 11:31 PM, Xiangrui Meng <mengxr@gmail.com> wrote:
>>>
>>>> I don't quite understand why putting linear constraints can promote
>>>> orthogonality. For the interfaces, if the subproblem is determined by
>>>> Y^T Y and Y^T b for each iteration, then the least squares solver, the
>>>> non-negative least squares solver, or your convex solver is simply a
>>>> function
>>>>
>>>> (A, b) -> x.
>>>>
>>>> You can define it as an interface, and make the solver pluggable by
>>>> adding a setter to ALS. If you want to use your lgpl solver, just
>>>> include it in the classpath. Creating two separate files still seems
>>>> unnecessary to me. Could you create a JIRA and we can move our
>>>> discussion there? Thanks!
>>>>
>>>> Best,
>>>> Xiangrui
>>>>
>>>> On Thu, Jun 5, 2014 at 7:20 PM, Debasish Das <debasish.das83@gmail.com>
>>>> wrote:
>>>> > Hi Xiangrui,
>>>> >
>>>> > For orthogonality properties in the factors we need a constraint solver
>>>> > other than the usuals (l1, upper and lower bounds, l2 etc)
>>>> >
>>>> > The interface of constraint solver is standard and I can add it in
>>>> mllib
>>>> > optimization....
>>>> >
>>>> > But I am not sure how will I call the gpl licensed ipm solver from
>>>> > mllib....assume the solver interface is as follows:
>>>> >
>>>> > Qpsolver (densematrix h, array [double] f, int linearEquality, int
>>>> > linearInequality, bool lb, bool ub)
>>>> >
>>>> > And then I have functions to update equalities, inequalities, bounds
>>>> etc
>>>> > followed by the run which generates the solution....
>>>> >
>>>> > For l1 constraints I have to use epigraph formulation which needs a
>>>> > variable transformation before the solve....
>>>> >
>>>> > I was thinking that for the problems that does not need constraints
>>>> people
>>>> > will use ALS.scala and ConstrainedALS.scala will have the constrained
>>>> > formulations....
>>>> >
>>>> > I can point you to the code once it is ready and then you can guide
me
>>>> how
>>>> > to refactor it to mllib als ?
>>>> >
>>>> > Thanks.
>>>> > Deb
>>>> > Hi Deb,
>>>> >
>>>> > Why do you want to make those methods public? If you only need to
>>>> > replace the solver for subproblems. You can try to make the solver
>>>> > pluggable. Now it supports least squares and non-negative least
>>>> > squares. You can define an interface for the subproblem solvers and
>>>> > maintain the IPM solver at your own code base, if the only information
>>>> > you need is Y^T Y and Y^T b.
>>>> >
>>>> > Btw, just curious, what is the use case for quadratic constraints?
>>>> >
>>>> > Best,
>>>> > Xiangrui
>>>> >
>>>> > On Thu, Jun 5, 2014 at 3:38 PM, Debasish Das <debasish.das83@gmail.com
>>>> >
>>>> > wrote:
>>>> >> Hi,
>>>> >>
>>>> >> We are adding a constrained ALS solver in Spark to solve matrix
>>>> >> factorization use-cases which needs additional constraints (bounds,
>>>> >> equality, inequality, quadratic constraints)
>>>> >>
>>>> >> We are using a native version of a primal dual SOCP solver due to
its
>>>> > small
>>>> >> memory footprint and sparse ccs matrix computation it uses...The
>>>> solver
>>>> >> depends on AMD and LDL packages from Timothy Davis for sparse ccs
>>>> matrix
>>>> >> algebra (released under lgpl)...
>>>> >>
>>>> >> Due to GPL dependencies, it won't be possible to release the code
as
>>>> > Apache
>>>> >> license for now...If we get good results on our use-cases, we will
>>>> plan to
>>>> >> write a version in breeze/modify joptimizer for sparse ccs
>>>> operations...
>>>> >>
>>>> >> I derived ConstrainedALS from Spark mllib ALS and I am comparing
the
>>>> >> performance with default ALS and non-negative ALS as baseline. Plan
>>>> is to
>>>> >> release the code as GPL license for community review...I have kept
the
>>>> >> package structure as org.apache.spark.mllib.recommendation
>>>> >>
>>>> >> There are some private functions defined in ALS, which I would like
to
>>>> >> reuse....Is it possible to take the private out from the following
>>>> >> functions:
>>>> >>
>>>> >> 1. makeLinkRDDs
>>>> >> 2. makeInLinkBlock
>>>> >> 3. makeOutLinkBlock
>>>> >> 4. randomFactor
>>>> >> 5. unblockFactors
>>>> >>
>>>> >> I don't want to copy any code.... I can ask for a PR to make these
>>>> >> changes...
>>>> >>
>>>> >> Thanks.
>>>> >> Deb
>>>>
>>>
>>>
>>

Mime
View raw message