Thanks Sean! Let me take a look!
Iman
On Nov 8, 2016 7:29 AM, "Sean Owen" <sowen@cloudera.com> wrote:
> I think the problem here is that IndexedRowMatrix.toRowMatrix does *not*
> result in a RowMatrix with rows in order of their indices, necessarily:
>
> // Drop its row indices.
> RowMatrix rowMat = indexedRowMatrix.toRowMatrix();
>
> What you get is a matrix where the rows are arranged in whatever order
> they were passed to IndexedRowMatrix. RowMatrix says it's for rows where
> the ordering doesn't matter, but then it's maybe surprising it has a QR
> decomposition method, because clearly the result depends on the order of
> rows in the input. (CC Yuhao Yang for a comment?)
>
> You could say, well, why doesn't IndexedRowMatrix.toRowMatrix return at
> least something with sorted rows? That would not be hard. But it also won't
> return "missing" rows (all zeroes), so even then it would not result in
> a RowMatrix whose implicit row indices and ordering represent the same matrix.
> That, at least, strikes me as something to be better documented.
>
> Maybe it would be nicer still to at least sort the rows, given the
> existence of use cases like yours. For example, CoordinateMatrix.toIndexedRowMatrix
> could at least sort? That would be less surprising.
>
> In any event you should be able to make it work by manually getting the
> RDD[IndexedRow] out of IndexedRowMatrix, sorting by index, then mapping it
> to Vectors and making a RowMatrix from it.
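
The workaround described above (sort by index, then drop the indices) can be sketched locally. This is a minimal sketch in plain Java with no Spark dependency: the `IndexedRow` class here is a hypothetical stand-in for Spark's class of the same name, and `sortedRows` is an illustrative helper, not a Spark API. In Spark the same steps would presumably go through the RDD returned by `indexedRowMatrix.rows()`, a sort keyed on `IndexedRow.index()`, and the `RowMatrix` constructor.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortIndexedRows {
    // Hypothetical local stand-in for an indexed row: a row index plus its values.
    static final class IndexedRow {
        final long index;
        final double[] values;

        IndexedRow(long index, double[] values) {
            this.index = index;
            this.values = values;
        }
    }

    // Sort rows by index, then drop the index so that list position carries the order.
    static List<double[]> sortedRows(List<IndexedRow> rows) {
        return rows.stream()
                .sorted(Comparator.comparingLong((IndexedRow r) -> r.index))
                .map(r -> r.values)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Rows of the example matrix from this thread, in an arbitrary scrambled order.
        List<IndexedRow> scrambled = Arrays.asList(
                new IndexedRow(4, new double[] {0.23, 0.36, 0.97}),
                new IndexedRow(0, new double[] {0.42, 0.28, 0.89}),
                new IndexedRow(2, new double[] {0.23, 0.0, 0.0}),
                new IndexedRow(3, new double[] {0.42, 0.98, 0.88}),
                new IndexedRow(1, new double[] {0.83, 0.34, 0.42}));

        for (double[] row : sortedRows(scrambled)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Note that this sketch only reorders the rows that exist. It does not insert all-zero rows for indices that are absent from the input, which is the separate gap mentioned earlier.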
>
>
>
> On Tue, Nov 8, 2016 at 2:41 PM Iman Mohtashemi <iman.mohtashemi@gmail.com>
> wrote:
>
>> Hi Sean,
>> Here you go:
>>
>> sparsematrix.txt =
>>
>> row,col,val
>> 0,0,.42
>> 0,1,.28
>> 0,2,.89
>> 1,0,.83
>> 1,1,.34
>> 1,2,.42
>> 2,0,.23
>> 3,0,.42
>> 3,1,.98
>> 3,2,.88
>> 4,0,.23
>> 4,1,.36
>> 4,2,.97
>>
>> The vector is just the third column of the matrix, which should give the
>> trivial solution of [0,0,1].
>>
>> The file translates to the following matrix, which is correct.
>> There are zeros in the matrix (not really sparse, just an example):
>> 0.42 0.28 0.89
>> 0.83 0.34 0.42
>> 0.23 0.0 0.0
>> 0.42 0.98 0.88
>> 0.23 0.36 0.97
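
The step from the (row, col, val) listing to the dense matrix can be sketched as follows. This is a minimal sketch in plain Java: `CooToDense` and `toDense` are hypothetical names, the matrix dimensions are passed in by hand, and any entry that never appears in the listing is assumed to be zero.

```java
import java.util.Arrays;

class CooToDense {
    // Fill a numRows x numCols array from {row, col, value} triplets.
    // Entries that never appear in the triplets stay at 0.0.
    static double[][] toDense(double[][] triplets, int numRows, int numCols) {
        double[][] dense = new double[numRows][numCols];
        for (double[] t : triplets) {
            dense[(int) t[0]][(int) t[1]] = t[2];
        }
        return dense;
    }

    public static void main(String[] args) {
        // The entries of sparsematrix.txt as listed in this thread.
        double[][] triplets = {
            {0, 0, .42}, {0, 1, .28}, {0, 2, .89},
            {1, 0, .83}, {1, 1, .34}, {1, 2, .42},
            {2, 0, .23},
            {3, 0, .42}, {3, 1, .98}, {3, 2, .88},
            {4, 0, .23}, {4, 1, .36}, {4, 2, .97}
        };
        for (double[] row : toDense(triplets, 5, 3)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```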
>>
>>
>> Here is what I get for the Q and R
>>
>> Q:
>> 0.21470961288429483 0.23590615093828807 0.6784910613691661
>> 0.3920784235278427 0.06171221388256143 0.5847874866876442
>> 0.7748216464954987 0.4003560542230838 0.29392323671555354
>> 0.3920784235278427 0.8517909521421976 0.31435038559403217
>> 0.21470961288429483 0.23389547730301666 0.11165321782745863
>> R:
>> 1.0712142642814275 0.8347536340918976 1.227672225670157
>> 0.0 0.7662808691141717 0.7553315911660984
>> 0.0 0.0 0.7785210939368136
>>
>> When I run this in MATLAB, the numbers are the same, but row 1 is the
>> last row and the last row is interchanged with row 3.
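
Seeing the same numbers in a different row order is consistent with a standard property of QR. The following is a sketch under two assumptions: that $A$ has full column rank, and that both tools use the same sign convention for the diagonal of $R$. Here $P$ is a permutation matrix that reorders the rows and $b$ is the right-hand-side vector.

```latex
A = QR,\quad Q^{\top}Q = I
\;\Longrightarrow\;
PA = (PQ)\,R,\quad (PQ)^{\top}(PQ) = Q^{\top}P^{\top}P\,Q = I,
\qquad
x = R^{-1}(PQ)^{\top}(Pb) = R^{-1}Q^{\top}b .
```

So reordering the rows of the input leaves $R$ unchanged and reorders the rows of $Q$ in the same way. A least-squares solve only returns the expected [0,0,1] when $b$ is in the same row order as the matrix that was actually factored.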
>>
>>
>>
>> On Mon, Nov 7, 2016 at 11:35 PM Sean Owen <sowen@cloudera.com> wrote:
>>
>> Rather than post a large section of code, please post a small example of
>> the input matrix and its decomposition, to illustrate what you're saying is
>> out of order.
>>
>> On Tue, Nov 8, 2016 at 3:50 AM im281 <iman.mohtashemi@gmail.com> wrote:
>>
>> I am getting the correct rows but they are out of order. Is this a bug or
>> am I doing something wrong?
