spark-user mailing list archives

From Andrew Musselman <>
Subject Re: Row similarities
Date Sat, 17 Jan 2015 16:29:27 GMT
Thanks Reza, interesting approach.  I think what I actually want is to calculate pair-wise
distance, on second thought.  Is there a pattern for that?
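
To make the question concrete, here is a minimal single-machine sketch of what I mean by pair-wise distance. It is plain Python, not a Spark API; the function name `pairwise_distances` and the sample rows are made up for illustration, and Euclidean distance is just an assumed choice of metric.

```python
import math
from itertools import combinations

def pairwise_distances(rows):
    """Return {(i, j): Euclidean distance} for every pair of rows with i < j.

    Hypothetical helper for illustration only; it compares all pairs,
    so the work grows quadratically with the number of rows.
    """
    distances = {}
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        distances[(i, j)] = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return distances

# Made-up sample data: three 2-dimensional rows.
rows = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
result = pairwise_distances(rows)
```

On a cluster the same all-pairs shape would presumably need some way to pair up rows across partitions, which is the pattern I'm asking about.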

> On Jan 16, 2015, at 9:53 PM, Reza Zadeh <> wrote:
> You can use K-means with a suitably large k. Each cluster should correspond to rows that
> are similar to one another.
>> On Fri, Jan 16, 2015 at 5:18 PM, Andrew Musselman <> wrote:
>> What's a good way to calculate similarities between all vector-rows in a matrix or
>> I'm seeing RowMatrix has a columnSimilarities method but I'm not sure I'm going down
>> a good path to transpose a matrix in order to run that.
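
For reference, the transpose idea from the original question can be sketched on a single machine as follows. This is plain Python rather than MLlib; `transpose` and `column_cosine_similarities` are hypothetical helpers that only mimic the idea of computing cosine similarity between columns, and the sample matrix is made up. Transposing first turns rows into columns, so the column similarities of the transpose are the row similarities of the original.

```python
import math

def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]

def column_cosine_similarities(matrix):
    """Return {(i, j): cosine similarity} for every pair of columns with i < j."""
    cols = transpose(matrix)
    sims = {}
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            dot = sum(x * y for x, y in zip(cols[i], cols[j]))
            norm = (math.sqrt(sum(x * x for x in cols[i]))
                    * math.sqrt(sum(y * y for y in cols[j])))
            # Treat a zero-length vector as having similarity 0.0 (an assumption).
            sims[(i, j)] = dot / norm if norm else 0.0
    return sims

# Made-up sample data: three rows, two columns.
matrix = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
# Column similarities of the transpose are row similarities of the original.
row_sims = column_cosine_similarities(transpose(matrix))
```

Rows 0 and 2 point in the same direction, so they come out as maximally similar, while row 1 is orthogonal to both.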
