spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Debasish Das <debasish.da...@gmail.com>
Subject Re: Constraint Solver for Spark
Date Wed, 11 Jun 2014 03:58:24 GMT
Sorry last one went out by mistake:

Is not for users (0 to numUsers), fullXtX is same ? In the ALS formulation
this is W^TW or H^TH which should be same for all the users ? Why we are
reading userXtX(index) and adding it to fullXtX in the loop over all
numUsers ?

// Solve the least-squares problem for each user and return the new feature
vectors

    Array.range(0, numUsers).map { index =>

      // Compute the full XtX matrix from the lower-triangular part we got
above

      fillFullMatrix(userXtX(index), fullXtX)

      // Add regularization

      var i = 0

      while (i < rank) {

        fullXtX.data(i * rank + i) += lambda

        i += 1

      }

      // Solve the resulting matrix, which is symmetric and
positive-definite

      algo match {

        case ALSAlgo.Implicit =>
Solve.solvePositive(fullXtX.addi(YtY.get.value),
userXy(index)).data

        case ALSAlgo.Explicit => Solve.solvePositive(fullXtX, userXy
(index)).data

      }

    }


On Tue, Jun 10, 2014 at 8:56 PM, Debasish Das <debasish.das83@gmail.com>
wrote:

> Hi,
>
> I am bit confused wiht the code here:
>
> // Solve the least-squares problem for each user and return the new
> feature vectors
>
>     Array.range(0, numUsers).map { index =>
>
>       // Compute the full XtX matrix from the lower-triangular part we
> got above
>
>       fillFullMatrix(userXtX(index), fullXtX)
>
>       // Add regularization
>
>       var i = 0
>
>       while (i < rank) {
>
>         fullXtX.data(i * rank + i) += lambda
>
>         i += 1
>
>       }
>
>       // Solve the resulting matrix, which is symmetric and
> positive-definite
>
>       algo match {
>
>         case ALSAlgo.Implicit => Solve.solvePositive(fullXtX.addi(YtY.get.value),
> userXy(index)).data
>
>         case ALSAlgo.Explicit => Solve.solvePositive(fullXtX, userXy
> (index)).data
>
>       }
>
>     }
>
>
> On Fri, Jun 6, 2014 at 10:42 AM, Debasish Das <debasish.das83@gmail.com>
> wrote:
>
>> Hi Xiangrui,
>>
>> It's not the linear constraint, It is quadratic inequality with slack,
>> first order taylor approximation of off diagonal cross terms and a cyclic
>> coordinate descent, which we think will yield orthogonality....It's still
>> under works...
>>
>> Also we want to put a L1 constraint as set of linear equations when
>> solving for ALS...
>>
>> I will create the JIRA...as I see it, this will evolve to a generic
>> constraint solver for machine learning problems that has a QP
>> structure....ALS is one example....another example is kernel SVMs...
>>
>> I did not know that lgpl solver can be added to the classpath....if it
>> can be then definitely we should add these in ALS.scala...
>>
>> Thanks.
>> Deb
>>
>>
>>
>> On Thu, Jun 5, 2014 at 11:31 PM, Xiangrui Meng <mengxr@gmail.com> wrote:
>>
>>> I don't quite understand why putting linear constraints can promote
>>> orthogonality. For the interfaces, if the subproblem is determined by
>>> Y^T Y and Y^T b for each iteration, then the least squares solver, the
>>> non-negative least squares solver, or your convex solver is simply a
>>> function
>>>
>>> (A, b) -> x.
>>>
>>> You can define it as an interface, and make the solver pluggable by
>>> adding a setter to ALS. If you want to use your lgpl solver, just
>>> include it in the classpath. Creating two separate files still seems
>>> unnecessary to me. Could you create a JIRA and we can move our
>>> discussion there? Thanks!
>>>
>>> Best,
>>> Xiangrui
>>>
>>> On Thu, Jun 5, 2014 at 7:20 PM, Debasish Das <debasish.das83@gmail.com>
>>> wrote:
>>> > Hi Xiangrui,
>>> >
>>> > For orthogonality properties in the factors we need a constraint solver
>>> > other than the usuals (l1, upper and lower bounds, l2 etc)
>>> >
>>> > The interface of constraint solver is standard and I can add it in
>>> mllib
>>> > optimization....
>>> >
>>> > But I am not sure how will I call the gpl licensed ipm solver from
>>> > mllib....assume the solver interface is as follows:
>>> >
>>> > Qpsolver (densematrix h, array [double] f, int linearEquality, int
>>> > linearInequality, bool lb, bool ub)
>>> >
>>> > And then I have functions to update equalities, inequalities, bounds
>>> etc
>>> > followed by the run which generates the solution....
>>> >
>>> > For l1 constraints I have to use epigraph formulation which needs a
>>> > variable transformation before the solve....
>>> >
>>> > I was thinking that for the problems that does not need constraints
>>> people
>>> > will use ALS.scala and ConstrainedALS.scala will have the constrained
>>> > formulations....
>>> >
>>> > I can point you to the code once it is ready and then you can guide me
>>> how
>>> > to refactor it to mllib als ?
>>> >
>>> > Thanks.
>>> > Deb
>>> > Hi Deb,
>>> >
>>> > Why do you want to make those methods public? If you only need to
>>> > replace the solver for subproblems. You can try to make the solver
>>> > pluggable. Now it supports least squares and non-negative least
>>> > squares. You can define an interface for the subproblem solvers and
>>> > maintain the IPM solver at your own code base, if the only information
>>> > you need is Y^T Y and Y^T b.
>>> >
>>> > Btw, just curious, what is the use case for quadratic constraints?
>>> >
>>> > Best,
>>> > Xiangrui
>>> >
>>> > On Thu, Jun 5, 2014 at 3:38 PM, Debasish Das <debasish.das83@gmail.com
>>> >
>>> > wrote:
>>> >> Hi,
>>> >>
>>> >> We are adding a constrained ALS solver in Spark to solve matrix
>>> >> factorization use-cases which needs additional constraints (bounds,
>>> >> equality, inequality, quadratic constraints)
>>> >>
>>> >> We are using a native version of a primal dual SOCP solver due to its
>>> > small
>>> >> memory footprint and sparse ccs matrix computation it uses...The
>>> solver
>>> >> depends on AMD and LDL packages from Timothy Davis for sparse ccs
>>> matrix
>>> >> algebra (released under lgpl)...
>>> >>
>>> >> Due to GPL dependencies, it won't be possible to release the code as
>>> > Apache
>>> >> license for now...If we get good results on our use-cases, we will
>>> plan to
>>> >> write a version in breeze/modify joptimizer for sparse ccs
>>> operations...
>>> >>
>>> >> I derived ConstrainedALS from Spark mllib ALS and I am comparing the
>>> >> performance with default ALS and non-negative ALS as baseline. Plan
>>> is to
>>> >> release the code as GPL license for community review...I have kept the
>>> >> package structure as org.apache.spark.mllib.recommendation
>>> >>
>>> >> There are some private functions defined in ALS, which I would like
to
>>> >> reuse....Is it possible to take the private out from the following
>>> >> functions:
>>> >>
>>> >> 1. makeLinkRDDs
>>> >> 2. makeInLinkBlock
>>> >> 3. makeOutLinkBlock
>>> >> 4. randomFactor
>>> >> 5. unblockFactors
>>> >>
>>> >> I don't want to copy any code.... I can ask for a PR to make these
>>> >> changes...
>>> >>
>>> >> Thanks.
>>> >> Deb
>>>
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message