mahout-user mailing list archives

From: Grant Ingersoll <gsing...@apache.org>
Subject: Minhash key groups
Date: Wed, 02 Nov 2011 14:20:57 GMT

What is the Minhash key groups value used for in the MinhashDriver?  I see that it is used
for building up the key out of the hashed values, but what is the significance of choosing
different values for it?  The default is 2; what does it mean, practically speaking, if I
choose, say, 10?  AFAICT, it would mean more clusters, assuming we still meet the minimum
cluster size imposed by the reducer.  Is that right?
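
To make the question concrete, here is a minimal Python sketch of key grouping. It is not Mahout's code. The function names, the seeded linear hash, and the wrap-around way of joining consecutive min-hash values into a key are all assumptions made for illustration; only the ideas of "key groups" and a minimum cluster size come from the question above.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # modulus for the illustrative hash family (assumption)


def minhash_signature(items, seeds):
    """Return one min-hash value per (a, b) seed pair over a set of int items."""
    return [min((a * x + b) % PRIME for x in items) for a, b in seeds]


def cluster_keys(signature, key_groups):
    """Build one key per position by joining key_groups consecutive values.

    The wrap-around indexing is an assumption about how the key is built.
    """
    n = len(signature)
    return [
        "-".join(str(signature[(i + j) % n]) for j in range(key_groups))
        for i in range(n)
    ]


def cluster(docs, num_hashes=10, key_groups=2, min_cluster_size=2, seed=42):
    """Group document ids by key, dropping buckets below min_cluster_size."""
    rng = random.Random(seed)
    seeds = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
             for _ in range(num_hashes)]
    buckets = defaultdict(list)
    for doc_id, items in docs.items():
        signature = minhash_signature(items, seeds)
        for key in cluster_keys(signature, key_groups):
            buckets[key].append(doc_id)
    # Stand-in for the reducer's minimum cluster size check.
    return {k: v for k, v in buckets.items() if len(v) >= min_cluster_size}


if __name__ == "__main__":
    print(cluster_keys([5, 9, 7], 2))  # ['5-9', '9-7', '7-5']
    docs = {"a": {1, 2, 3, 4}, "b": {1, 2, 3, 5}, "c": {10, 11, 12}}
    for k in (1, 2, 10):
        print(k, len(cluster(docs, key_groups=k)))
```

Under these assumptions, two documents land in the same bucket only if they agree on key_groups consecutive min-hash values. For a pair with Jaccard similarity s and independent hash functions, that happens with probability of roughly s^k for a given key, so a larger value should produce more distinct keys with fewer documents in each, and fewer buckets are likely to survive the minimum cluster size check.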

Thanks,
Grant