mahout-user mailing list archives

From Robert Stewart <>
Subject choosing appropriate t1,t2 for canopy clustering
Date Tue, 15 May 2012 14:45:30 GMT
I am trying to run canopy clustering on vectors extracted from a Lucene index.  I want to use
CosineDistanceMeasure.  How do I choose appropriate values for the t1 and t2 distance
thresholds?  I assumed that the cosine distance measure would return "distances" in the range
0.0 to 1.0, but that does not seem to be the case.  How can I find out what range of distances
to expect, so that I can pick t1 and t2 without a lot of trial and error?
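
One way to look at the question is to measure the distances directly on a sample of the vectors. The following is a minimal sketch in plain Python, not Mahout code. It assumes cosine distance is defined as 1 minus cosine similarity (an assumption, not confirmed in this message); under that definition the value lies between 0.0 and 2.0 in general, and between 0.0 and 1.0 only when all vector components are non-negative. The helper names `cosine_distance` and `distance_quantiles` are hypothetical, and reading t1 and t2 off the quantiles is only one possible heuristic.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        # Convention chosen for this sketch only: a zero vector has no
        # direction, so report it as orthogonal to everything.
        return 1.0
    return 1.0 - dot / norm


def distance_quantiles(vectors, fractions=(0.1, 0.5, 0.9)):
    """Return selected quantiles of all pairwise cosine distances."""
    dists = sorted(cosine_distance(a, b) for a, b in combinations(vectors, 2))
    last = len(dists) - 1
    # Clamp the index so that a fraction of 1.0 maps to the largest distance.
    return {f: dists[min(last, int(f * len(dists)))] for f in fractions}


if __name__ == "__main__":
    # Toy stand-in for a small sample of document vectors.
    sample = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0], [1.0, 0.1, 1.9], [0.0, 3.0, 0.2]]
    for fraction, value in distance_quantiles(sample).items():
        print(f"{fraction:.0%} of pairs are closer than about {value:.3f}")
```

Since canopy clustering expects t1 to be larger than t2, one could take t2 from a low quantile of the observed distances and t1 from a higher one, then adjust from there.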
