spark-user mailing list archives

From Iman Mohtashemi <iman.mohtash...@gmail.com>
Subject Re: TallSkinnyQR
Date Fri, 02 Dec 2016 19:07:34 GMT
Great, thanks! Where can I get the latest version with the bug fixes?
Best regards,
Iman

On Fri, Dec 2, 2016 at 10:54 AM Huamin Li <3ericli@gmail.com> wrote:

> Hi,
>
> There seems to be a bug in the section of code that converts the RowMatrix
> format back into IndexedRowMatrix format.
>
> For RowMatrix, I think the singular values and right singular vectors
> (not the left singular vectors U) that computeSVD computes are correct when
> using multiple executors/machines; only the R (not the Q) in tallSkinnyQR
> is correct when using multiple executors/machines. U and Q were being
> stored in RowMatrix format. A RowMatrix carries no row-index information,
> so it does not make sense as a container for U and Q.
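
One way to see why R can survive a row shuffle while Q cannot: R satisfies R^T R = A^T A, and A^T A is a sum over rows that does not depend on their order. The plain-Java sketch below (no Spark; the class and method names are hypothetical, and it assumes A has full column rank) recovers R as the upper Cholesky factor of A^T A. This is an illustration of the invariance, not how tallSkinnyQR is implemented. The Cholesky factor has a positive diagonal, so it can differ from a library's R by the sign of whole rows.

```java
/** Hypothetical helper: R factor of A computed from the Gram matrix A^T A. */
final class GramR {
    /** Returns upper-triangular R with R^T R = A^T A (A must have full column rank). */
    static double[][] rFactor(double[][] a) {
        int n = a[0].length;
        // Gram matrix: a sum of per-row outer products, so row order is irrelevant.
        double[][] g = new double[n][n];
        for (double[] row : a) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    g[i][j] += row[i] * row[j];
                }
            }
        }
        // Upper Cholesky factorisation, filled column by column.
        double[][] r = new double[n][n];
        for (int j = 0; j < n; j++) {
            for (int i = 0; i <= j; i++) {
                double s = g[i][j];
                for (int k = 0; k < i; k++) {
                    s -= r[k][i] * r[k][j];
                }
                r[i][j] = (i == j) ? Math.sqrt(s) : s / r[i][i];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        // The 5x3 example matrix posted earlier in this thread.
        double[][] a = {
            {0.42, 0.28, 0.89}, {0.83, 0.34, 0.42}, {0.23, 0.0, 0.0},
            {0.42, 0.98, 0.88}, {0.23, 0.36, 0.97}
        };
        for (double[] row : rFactor(a)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Calling `GramR.rFactor` on the thread's matrix and on any reordering of its rows should give the same R up to floating-point rounding. Q has one row per input row, so its rows are only meaningful if their order (or index) is preserved.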
>
> Others have run into this same problem (
> https://issues.apache.org/jira/browse/SPARK-8614)
>
> I think the quick solution for this problem is to copy and paste the
> multiply, computeSVD, and tallSkinnyQR code from RowMatrix to
> IndexedRowMatrix and make the corresponding changes, although this would
> result in code duplication.
>
> I have fixed the problem using the approach described above. Now multiply,
> computeSVD, and tallSkinnyQR give the correct results for
> IndexedRowMatrix when using multiple executors or workers. Let me know if
> I should do a pull request for this.
>
> Best,
> Huamin
>
> On Fri, Dec 2, 2016 at 11:23 AM, Iman Mohtashemi <
> iman.mohtashemi@gmail.com> wrote:
>
> Ok thanks.
>
> On Fri, Dec 2, 2016 at 8:19 AM Sean Owen <sowen@cloudera.com> wrote:
>
> I tried, but enforcing the ordering changed a fair bit of behavior and I
> gave up. I think the way to think of it is: a RowMatrix has whatever
> ordering you made it with, so you need to give it ordered rows if you're
> going to use a method like the QR decomposition. That works. I don't think
> the QR method should ever have been on this class though, for this reason.
>
> On Fri, Dec 2, 2016 at 4:13 PM Iman Mohtashemi <iman.mohtashemi@gmail.com>
> wrote:
>
> Hi guys,
> Was this bug ever resolved?
> Iman
>
> On Fri, Nov 11, 2016 at 9:59 AM Iman Mohtashemi <iman.mohtashemi@gmail.com>
> wrote:
>
> Yes, this would be helpful; otherwise the Q part of the decomposition is
> useless. One can use Q to solve the system Ax = b: transpose it, multiply
> by b, and solve the triangular system Rx = Qt*b for x, since the upper
> triangular matrix R is already computed correctly.
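
As a minimal plain-Java sketch of that solve (no Spark involved; the class name is hypothetical): factor A = QR, form y = Q^T b, then back-substitute through the upper triangular R. It uses modified Gram-Schmidt for the factorisation purely for brevity, which is an assumption of this sketch and not the method Spark uses, and it assumes A has full column rank.

```java
/** Hypothetical helper: solve Ax = b (least squares) through a thin QR factorisation. */
final class QrSolve {
    /** Thin QR by modified Gram-Schmidt. Returns {Q (m x n), R (n x n)}. */
    static double[][][] qr(double[][] a) {
        int m = a.length, n = a[0].length;
        double[][] q = new double[m][];
        double[][] r = new double[n][n];
        for (int i = 0; i < m; i++) {
            q[i] = a[i].clone(); // work on a copy; columns are orthogonalised in place
        }
        for (int j = 0; j < n; j++) {
            double norm = 0.0;
            for (int i = 0; i < m; i++) norm += q[i][j] * q[i][j];
            norm = Math.sqrt(norm);
            r[j][j] = norm;
            for (int i = 0; i < m; i++) q[i][j] /= norm;
            // Remove the component along column j from every later column.
            for (int k = j + 1; k < n; k++) {
                double dot = 0.0;
                for (int i = 0; i < m; i++) dot += q[i][j] * q[i][k];
                r[j][k] = dot;
                for (int i = 0; i < m; i++) q[i][k] -= dot * q[i][j];
            }
        }
        return new double[][][] {q, r};
    }

    /** Solves R x = Q^T b by back substitution. */
    static double[] solve(double[][] a, double[] b) {
        double[][][] f = qr(a);
        double[][] q = f[0], r = f[1];
        int m = a.length, n = a[0].length;
        double[] y = new double[n]; // y = Q^T b
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < m; i++) y[j] += q[i][j] * b[i];
        }
        double[] x = new double[n];
        for (int j = n - 1; j >= 0; j--) {
            double s = y[j];
            for (int k = j + 1; k < n; k++) s -= r[j][k] * x[k];
            x[j] = s / r[j][j];
        }
        return x;
    }

    public static void main(String[] args) {
        double[][] a = {{2.0, 0.0}, {0.0, 3.0}, {0.0, 0.0}};
        double[] b = {2.0, 3.0, 0.0};
        System.out.println(java.util.Arrays.toString(solve(a, b)));
    }
}
```

The product Q^T b pairs row i of Q with entry i of b, which is why a Q whose rows come back in a different order than the rows of A gives a wrong right-hand side even when R is correct.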
>
> On Fri, Nov 11, 2016 at 3:56 AM Sean Owen <sowen@cloudera.com> wrote:
>
> @Xiangrui / @Joseph, do you think it would be reasonable to have
> CoordinateMatrix sort the rows it creates to make an IndexedRowMatrix, in
> order to make the ultimate output of toRowMatrix less surprising when it's
> not ordered?
>
>
> On Tue, Nov 8, 2016 at 3:29 PM Sean Owen <sowen@cloudera.com> wrote:
>
> I think the problem here is that IndexedRowMatrix.toRowMatrix does *not*
> result in a RowMatrix with rows in order of their indices, necessarily:
>
>
> // Drop its row indices.
> RowMatrix rowMat = indexedRowMatrix.toRowMatrix();
>
> What you get is a matrix where the rows are arranged in whatever order
> they were passed to IndexedRowMatrix. RowMatrix says it's for rows where
> the ordering doesn't matter, but then it's maybe surprising it has a QR
> decomposition method, because clearly the result depends on the order of
> rows in the input. (CC Yuhao Yang for a comment?)
>
> You could say, well, why doesn't IndexedRowMatrix.toRowMatrix return at
> least something with sorted rows? That would not be hard. However, it still
> won't return "missing" rows (all zeroes), so it would not in any event
> result in a RowMatrix whose implicit rows and ordering represented the same
> matrix. That, at least, strikes me as something to be better documented.
>
> Maybe it would be nicer still to at least sort the rows, given the
> existence of use cases like yours. For example, at least
> CoordinateMatrix.toIndexedRowMatrix could sort; that would be less
> surprising.
>
> In any event you should be able to make it work by manually getting the
> RDD[IndexedRow] out of IndexedRowMatrix, sorting by index, then mapping it
> to Vectors and making a RowMatrix from it.
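
The two steps that matter in that workaround, sorting by index and accounting for "missing" all-zero rows, can be sketched in plain Java. This is a stand-in for the RDD operations and not Spark code: `IndexedRowSketch` and `OrderedRows` are hypothetical names, the rows live in an ordinary list, and indices are assumed to be unique and non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an indexed row: a row index plus its values. */
final class IndexedRowSketch {
    final long index;
    final double[] values;

    IndexedRowSketch(long index, double[] values) {
        this.index = index;
        this.values = values;
    }
}

final class OrderedRows {
    /**
     * Sorts rows by index and inserts an all-zero row for every index that is
     * skipped, so that position i of the result holds the row with index i.
     * Rows missing after the largest index present are not added.
     */
    static List<double[]> sortAndFill(List<IndexedRowSketch> rows, int numCols) {
        List<IndexedRowSketch> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingLong(r -> r.index));
        List<double[]> out = new ArrayList<>();
        long next = 0; // next index expected in the output
        for (IndexedRowSketch r : sorted) {
            while (next < r.index) {
                out.add(new double[numCols]); // gap: the row was all zeroes
                next++;
            }
            out.add(r.values.clone());
            next = r.index + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        List<IndexedRowSketch> rows = new ArrayList<>();
        rows.add(new IndexedRowSketch(3, new double[] {0.42, 0.98, 0.88}));
        rows.add(new IndexedRowSketch(0, new double[] {0.42, 0.28, 0.89}));
        for (double[] row : sortAndFill(rows, 3)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In Spark itself the same idea would be applied to the rows of the IndexedRowMatrix before building the RowMatrix. Whether the zero-filling step is needed depends on whether the input can contain rows with no entries.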
>
>
>
> On Tue, Nov 8, 2016 at 2:41 PM Iman Mohtashemi <iman.mohtashemi@gmail.com>
> wrote:
>
> Hi Sean,
> Here you go:
>
> sparsematrix.txt =
>
> row,col,val
> 0,0,.42
> 0,1,.28
> 0,2,.89
> 1,0,.83
> 1,1,.34
> 1,2,.42
> 2,0,.23
> 3,0,.42
> 3,1,.98
> 3,2,.88
> 4,0,.23
> 4,1,.36
> 4,2,.97
>
> The vector is just the third column of the matrix, which should give the
> trivial solution of [0,0,1].
>
> This translates to the following matrix, which is correct. There are zeros
> in the matrix (not really sparse, just an example):
> 0.42  0.28  0.89
> 0.83  0.34  0.42
> 0.23  0.0   0.0
> 0.42  0.98  0.88
> 0.23  0.36  0.97
>
>
> Here is what I get for the Q and R:
>
> Q:
> -0.21470961288429483  0.23590615093828807   0.6784910613691661
> -0.3920784235278427   -0.06171221388256143  0.5847874866876442
> -0.7748216464954987   -0.4003560542230838   -0.29392323671555354
> -0.3920784235278427   0.8517909521421976    -0.31435038559403217
> -0.21470961288429483  -0.23389547730301666  -0.11165321782745863
>
> R:
> -1.0712142642814275  -0.8347536340918976  -1.227672225670157
> 0.0                  0.7662808691141717   0.7553315911660984
> 0.0                  0.0                  0.7785210939368136
>
> When running this in MATLAB, the numbers are the same, but row 1 is the
> last row and the last row is interchanged with row 3.
>
>
>
> On Mon, Nov 7, 2016 at 11:35 PM Sean Owen <sowen@cloudera.com> wrote:
>
> Rather than post a large section of code, please post a small example of
> the input matrix and its decomposition, to illustrate what you're saying is
> out of order.
>
> On Tue, Nov 8, 2016 at 3:50 AM im281 <iman.mohtashemi@gmail.com> wrote:
>
> I am getting the correct rows but they are out of order. Is this a bug or
> am I doing something wrong?
>
>
>
>
