mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: tf-idf + svd + cosine similarity
Date Wed, 15 Jun 2011 09:27:40 GMT
The features all take on non-negative values here, right?
Then the cosine can't be negative, since the dot product of two
non-negative vectors is itself non-negative.

In another context, where features could be negative, cosine could
indeed be negative. -1 means most dissimilar of all -- the feature
vectors are exactly opposed.
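
To make the two cases concrete, here is a minimal sketch in plain Python. The
`cosine` helper and the sample vectors are made up for illustration; this is
not Mahout's API, just the textbook formula dot(a, b) / (|a| * |b|).

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Non-negative features (e.g. raw tf-idf weights): every term of the
# dot product is >= 0, so the cosine lands in [0, 1].
nonneg = cosine([1.0, 2.0, 0.0], [0.0, 3.0, 4.0])

# No shared non-zero features: dot product is 0, so cosine is 0.
orthogonal = cosine([1.0, 0.0], [0.0, 2.0])

# Signed features, one vector the exact negation of the other:
# the vectors point in opposite directions, so cosine is -1.
opposed = cosine([3.0, 4.0], [-3.0, -4.0])

print(nonneg, orthogonal, opposed)
```

With signed features, values between 0 and -1 fall somewhere between
"unrelated" and "exactly opposed".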

On Wed, Jun 15, 2011 at 10:10 AM, Stefan Wienert <stefan@wienert.cc> wrote:
> Ignoring them is not an option... so I have to interpret these values.
> Can one say that documents with similarity = -1 are the least similar
> documents? I don't think this is right.
> Any other assumptions?
