mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Question Regarding Distributed Row Matrix
Date Thu, 05 May 2011 17:21:54 GMT
> though to be frank, I don't understand your second paragraph, i.e., how
> turning the vectors into sparse vectors will enable me to do transpose in an
> easier fashion without resorting to doing it manually), however, I suppose
> the purpose of the DRM format was to make step 5,6 much easier so I guess I

What I meant is that, since you can use sparse vectors, you don't have to
number the rows strictly sequentially with a single reducer. You can still
have several reducers, each numbering rows sequentially within its own
range but not globally. The resulting gaps in the numbering are not
detrimental from the problem-size point of view, since only the number of
nonzero elements matters.
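
To illustrate, here is a minimal sketch in plain Python (dicts standing in
for sparse vectors; this is not Mahout's API). The names assign_row_ids,
transpose and range_size are hypothetical, and the "reducers" are just
list partitions. Each partition numbers its rows within its own disjoint
range, and the transpose still stores only the nonzero entries:

```python
from collections import defaultdict


def assign_row_ids(partitions, range_size):
    """Number rows per partition: partition p uses ids starting at p * range_size.

    Ids are sequential inside one partition's range but not globally.
    """
    rows = {}
    for p, vectors in enumerate(partitions):
        if len(vectors) > range_size:
            raise ValueError("partition exceeds its id range")
        for offset, vec in enumerate(vectors):
            rows[p * range_size + offset] = vec
    return rows


def transpose(rows):
    """Transpose a sparse matrix stored as {row_id: {col_id: value}}."""
    cols = defaultdict(dict)
    for r, vec in rows.items():
        for c, v in vec.items():
            # Old row ids become sparse element indices; gaps cost nothing.
            cols[c][r] = v
    return dict(cols)


# Two simulated reducers, each holding some sparse rows (assumed data).
partitions = [
    [{0: 1.0, 2: 3.0}, {1: 2.0}],  # rows seen by hypothetical reducer 0
    [{2: 4.0}],                    # rows seen by hypothetical reducer 1
]
rows = assign_row_ids(partitions, range_size=1_000_000)
transposed = transpose(rows)
```

The row ids jump from 1 to 1000000, yet the transposed matrix holds the
same four nonzero entries as the input, so the gaps do not grow the
problem.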

-d

>
>
> Thanks again!
>
>
>
> On Thu, May 5, 2011 at 9:40 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>
>> I think the first step is to decide on the pipeline of algorithms. Once you
>> know the algorithms you want to run through, it will be easier to come up
>> with vectorization requirements.
>>
>> That said, for the sake of transposition, note that Mahout supports sparse
>> vectors, i.e. it doesn't matter what the element index is, as long as
>> it is unique; only the number of nonzero elements matters. So I don't think
>> you are constrained per se in the number of reducers during vectorization
>> for transpose.
>> That would have been pretty scale-restricting, indeed.
>>
>> apologies for brevity.
>>
>> Sent from my android.
>> -Dmitriy
>> On May 5, 2011 6:58 AM, "Vckay" <darkvckay@gmail.com> wrote:
>>
>
