mahout-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject RowSimilarityJob
Date Thu, 31 May 2012 02:22:19 GMT
What value does RowSimilarityJob produce to describe similarity? The 
paper that describes how the algorithm is implemented doesn't describe 
the various similarity values returned by Mahout. It seems to focus on 
cooccurrences.

For SIMILARITY_COSINE, is the value the cosine or 1 - cosine?
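
To make the cosine question concrete, here is a minimal Python sketch of the two 
candidate values. The function names and toy vectors are made up for 
illustration; this is not Mahout code, and it says nothing about which of the 
two RowSimilarityJob actually emits.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """The '1 - cosine' form: 0 for vectors pointing the same way."""
    return 1.0 - cosine_similarity(a, b)


# Toy rows: two identical docs and two docs with no terms in common.
same_a, same_b = [1.0, 1.0, 0.0], [1.0, 1.0, 0.0]
disjoint_a, disjoint_b = [1.0, 0.0], [0.0, 1.0]

print(cosine_similarity(same_a, same_b), cosine_distance(same_a, same_b))
print(cosine_similarity(disjoint_a, disjoint_b), cosine_distance(disjoint_a, disjoint_b))
```

With the first form, identical docs score near 1 and unrelated docs score 0; 
with the second form the scale is reversed, so knowing which one is returned 
matters when ranking or thresholding the output.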

Is the value calculated independently, after cooccurrences have been used 
to determine which docs are similar?

The code is very difficult to read, so a little help would be appreciated.
