spark-issues mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] [Issue Comment Deleted] (SPARK-1262) Numerical drift in computation of matrix inverse leads to invalid results in ALS
Date Wed, 21 Jan 2015 06:58:35 GMT


Sean Owen updated SPARK-1262:
    Comment: was deleted

(was: That makes sense. There may still be an interesting point here. What would be interesting
to know is whether the size of |Ax-B| is consistently larger or smaller for one solver than for
another. If you find the QR decomposition has smaller differences and gives anecdotally better
results, that would be a compelling reason to consider the different decomposition. There
was another issue regarding solvePositive and a mysteriously near-singular A that might be related.)
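
The comparison suggested in that comment can be sketched as follows. This is only an illustration on plain Java arrays, assuming a single right-hand-side vector; the class and method names (ResidualCheck, residualNorm) are hypothetical and are not part of Spark, jblas, or Commons Math. Given a candidate solution x from any solver, it computes the Euclidean norm of Ax - B, so two solvers can be compared on the same inputs.

```java
final class ResidualCheck {
    /**
     * Euclidean norm of (A * x - b) for a square matrix A and vectors x and b.
     * Hypothetical helper for comparing solvers; not an API of any library.
     */
    public static double residualNorm(double[][] a, double[] x, double[] b) {
        double sumSq = 0.0;
        for (int i = 0; i < a.length; i++) {
            double r = -b[i];
            for (int j = 0; j < x.length; j++) {
                r += a[i][j] * x[j];
            }
            sumSq += r * r;
        }
        return Math.sqrt(sumSq);
    }

    public static void main(String[] args) {
        // Small symmetric positive definite system with exact solution (1/11, 7/11).
        double[][] a = {{4.0, 1.0}, {1.0, 3.0}};
        double[] b = {1.0, 2.0};
        double[] exact = {1.0 / 11.0, 7.0 / 11.0};
        double[] rough = {0.1, 0.6}; // stand-in for a less accurate solver's output

        System.out.println("exact residual: " + residualNorm(a, exact, b));
        System.out.println("rough residual: " + residualNorm(a, rough, b));
    }
}
```

In practice the inputs would be the actual A and B passed to the solver in each ALS iteration, with one residual logged per solver.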

> Numerical drift in computation of matrix inverse leads to invalid results in ALS
> --------------------------------------------------------------------------------
>                 Key: SPARK-1262
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>            Reporter: Michael Allman
> I cannot offer an expert analysis or remedy in what follows, but I will describe
the problem I'm seeing and the strategy I've used to mitigate it.
> The ALS {{updateBlock()}} method includes a call to {{Solve.solvePositive()}} from JBlas.
Generally speaking, when we call {{solvePositive(A, B)}} for symmetric, positive definite
{{A}}, this method computes {{x}} from the matrix equation {{Ax = B}}. Or, in other words,
it computes {{A}}^-1^{{B}}. As mentioned, one of the preconditions on A is that it be symmetric.
In ALS, we call this method on {{fullXtX}} or {{fullXtX.add(YtY.value.get)}}, both of which
should be symmetric. However, for implicit ALS and rank > 1, some kind of imprecision in
the computation of this inverse tends to make successive values of {{fullXtX.add(YtY.value.get)}}
less and less symmetric. From my experience this can and does alter the model produced by
ALS significantly, leading to very different and counterintuitive user-item recommendations
compared to a solution in which this asymmetry does not exist.
> An approach I've seen taken against this problem from the Oryx codebase is to cast each
value in the vector returned from {{Solve.solvePositive()}} to a float. That has worked for
> I've also tried alternative implementations of a linear system solver. The solver that
Oryx uses, {{RRQRDecomposition}} from commons math v3, seems to lead to better solutions,
but still drifts unless the results are cast to floats.
> The Colt numerical computing libraries include a solver called {{QRDecomposition}}, which
may be superior to the one used in JBlas. However, I haven't tested it.
> I'm working on a unit test to exercise this bug. I discovered it by "tracing" the behavior
of {{ALS.scala}} with various logging statements inserted into the code.
> If nothing else, add a line before the calls to {{Solve.solvePositive()}} in {{updateBlock()}}
to validate that {{fullXtX}} or {{fullXtX.add(YtY.value.get)}} is symmetric.
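
A symmetry validation of the kind proposed in the description might look like the sketch below. It works on plain Java arrays rather than jblas matrices, and the names (SymmetryCheck, maxAsymmetry, requireSymmetric) and the tolerance value are assumptions made for illustration, not code from ALS.scala. The check reports the largest absolute difference between mirrored entries and rejects the matrix if that difference exceeds the tolerance.

```java
final class SymmetryCheck {
    /** Largest |a[i][j] - a[j][i]| over all entry pairs of a square matrix. */
    public static double maxAsymmetry(double[][] a) {
        for (double[] row : a) {
            if (row.length != a.length) {
                throw new IllegalArgumentException("matrix is not square");
            }
        }
        double worst = 0.0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                worst = Math.max(worst, Math.abs(a[i][j] - a[j][i]));
            }
        }
        return worst;
    }

    /** Throws if the matrix deviates from symmetry by more than tol (hypothetical guard). */
    public static void requireSymmetric(double[][] a, double tol) {
        double asym = maxAsymmetry(a);
        if (asym > tol) {
            throw new IllegalArgumentException(
                "matrix is not symmetric: max |a[i][j] - a[j][i]| = " + asym);
        }
    }

    public static void main(String[] args) {
        double[][] symmetric = {{2.0, 1.0}, {1.0, 2.0}};
        double[][] drifted = {{2.0, 1.0}, {1.5, 2.0}};

        requireSymmetric(symmetric, 1e-9); // passes silently
        System.out.println("asymmetry of drifted matrix: " + maxAsymmetry(drifted));
    }
}
```

Logging the value of maxAsymmetry on each iteration, instead of only throwing, would also show whether the asymmetry grows over successive iterations as the report describes.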
