spark-user mailing list archives

From Donni Khan <prince.don...@googlemail.com>
Subject Cosine Similarity between documents - Rows
Date Mon, 27 Nov 2017 12:27:21 GMT
I have a Spark job that computes the similarity between text documents:

RowMatrix rowMatrix = new RowMatrix(vectorsRDD.rdd());
CoordinateMatrix rowsimilarity = rowMatrix.columnSimilarities(0.5);
JavaRDD<MatrixEntry> entries = rowsimilarity.entries().toJavaRDD();
List<MatrixEntry> list = entries.collect();
for (MatrixEntry s : list) System.out.println(s);

Each MatrixEntry(i, j, value) represents the similarity between
columns i and j (that is, between the features of the documents).
But how can I get the similarity between rows?
Suppose I have five documents, Doc1, ..., Doc5. I would like to show the
similarity between all of those documents.
How do I get that? Any help?

Thank you
Donni
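
For illustration, here is a minimal plain-Java sketch (no Spark involved) of what similarity between rows means: each row is treated as one document's feature vector, and entry (i, j) of the result is the cosine similarity between document i and document j. The class name RowCosineSimilarity, the method rowSimilarities, and the sample values are all hypothetical and not part of the original job; the sketch assumes a small dense matrix whose rows all have the same length.

```java
import java.util.Arrays;

// Plain-Java illustration (no Spark): cosine similarity between the ROWS of a
// small dense matrix, where each row is one document's feature vector.
// The class and method names are hypothetical, chosen only for this sketch.
class RowCosineSimilarity {

    // Returns an n x n matrix whose (i, j) entry is the cosine similarity
    // between row i and row j. Assumes all rows have the same length.
    // A row with zero norm gets similarity 0 with every row (a convention
    // chosen here to avoid dividing by zero).
    static double[][] rowSimilarities(double[][] docs) {
        int n = docs.length;

        // Euclidean norm of each row.
        double[] norms = new double[n];
        for (int i = 0; i < n; i++) {
            double sumSq = 0.0;
            for (double v : docs[i]) sumSq += v * v;
            norms[i] = Math.sqrt(sumSq);
        }

        // Dot product of every pair of rows, divided by the product of norms.
        double[][] sim = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double dot = 0.0;
                for (int k = 0; k < docs[i].length; k++) {
                    dot += docs[i][k] * docs[j][k];
                }
                double denom = norms[i] * norms[j];
                double s = denom == 0.0 ? 0.0 : dot / denom;
                sim[i][j] = s;
                sim[j][i] = s; // the similarity matrix is symmetric
            }
        }
        return sim;
    }

    public static void main(String[] args) {
        // Made-up feature values for three documents over three features.
        double[][] docs = {
            {1.0, 2.0, 0.0},
            {2.0, 4.0, 0.0},
            {0.0, 0.0, 3.0}
        };
        for (double[] row : rowSimilarities(docs)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In Spark itself, one possible route (untested here, so treat it as an assumption) is to transpose the matrix so that documents become columns before calling columnSimilarities, for example by building an IndexedRowMatrix and chaining toCoordinateMatrix().transpose().toRowMatrix(). The indices i and j in each resulting MatrixEntry would then refer to documents rather than features.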
